linux-kernel - Re: [PATCH v2 3/3] PM / EM: Skip inefficient OPPs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210525092137.GA369979@e124901.cambridge.arm.com>
Date:   Tue, 25 May 2021 10:21:37 +0100
From:   Vincent Donnefort <vincent.donnefort@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     rjw@...ysocki.net, viresh.kumar@...aro.org,
        vincent.guittot@...aro.org, qperret@...gle.com,
        linux-kernel@...r.kernel.org, ionela.voinescu@....com,
        lukasz.luba@....com, dietmar.eggemann@....com
Subject: Re: [PATCH v2 3/3] PM / EM: Skip inefficient OPPs

On Tue, May 25, 2021 at 10:48:23AM +0200, Peter Zijlstra wrote:
> On Fri, May 21, 2021 at 05:54:24PM +0100, Vincent Donnefort wrote:
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 4f09afd..5a91a2b 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -10,6 +10,7 @@
> >  
> >  #include "sched.h"
> >  
> > +#include <linux/energy_model.h>
> >  #include <linux/sched/cpufreq.h>
> >  #include <trace/events/power.h>
> >  
> > @@ -153,6 +154,9 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> >  
> >  	freq = map_util_freq(util, freq, max);
> >  
> > +	/* Avoid inefficient performance states */
> > +	freq = em_pd_get_efficient_freq(em_cpu_get(policy->cpu), freq);
> > +
> >  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> >  		return sg_policy->next_freq;
> >  
> 
> This seems somewhat unfortunate, it adds a loop over the OPPs only to
> then call into cpufreq to do the exact same thing again :/

Indeed, but it would be complicated to avoid the double loop:

It is possible to register OPPs (and by extension perf_states) for a
frequency for which, the cpufreq table entry is marked with
CPUFREQ_ENTRY_INVALID. It would probably be an issue that would have to be
fixed in the driver, but it is currently allowed.

More importantly, while resolving the frequency, we also cache the index in
cached_resolved_idx. Some drivers, such as qcom-cpufreq-hw rely on this
value for their fastswitch support.

Notice though, we would iterate over the EM only in the case where the
performance state has found inefficiencies.