linux-kernel - Re: [PATCH 6/6] cpufreq: schedutil: New governor based on scheduler utilization data


Date:	Thu, 3 Mar 2016 17:37:35 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Rafael J. Wysocki" <rafael@...nel.org>
Cc:	Vincent Guittot <vincent.guittot@...aro.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Linux PM list <linux-pm@...r.kernel.org>,
	Juri Lelli <juri.lelli@....com>,
	Steve Muckle <steve.muckle@...aro.org>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Michael Turquette <mturquette@...libre.com>
Subject: Re: [PATCH 6/6] cpufreq: schedutil: New governor based on scheduler
 utilization data

On Thu, Mar 03, 2016 at 05:24:32PM +0100, Rafael J. Wysocki wrote:
> On Thu, Mar 3, 2016 at 1:20 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Wed, Mar 02, 2016 at 11:49:48PM +0100, Rafael J. Wysocki wrote:
> >> >>> +       min_f = sg_policy->policy->cpuinfo.min_freq;
> >> >>> +       max_f = sg_policy->policy->cpuinfo.max_freq;
> >> >>> +       next_f = util > max ? max_f : min_f + util * (max_f - min_f) / max;
> >
> >> In case a more formal derivation of this formula is needed, it is
> >> based on the following 3 assumptions:
> >>
> >> (1) Performance is a linear function of frequency.
> >> (2) Required performance is a linear function of the utilization ratio
> >> x = util/max as provided by the scheduler (0 <= x <= 1).
> >
> >> (3) The minimum possible frequency (min_freq) corresponds to x = 0 and
> >> the maximum possible frequency (max_freq) corresponds to x = 1.
> >>
> >> (1) and (2) combined imply that
> >>
> >> f = a * x + b
> >>
> >> (f - frequency, a, b - constants to be determined) and then (3) quite
> >> trivially leads to b = min_freq and a = max_freq - min_freq.
> >
> > 3 is the problem, that just doesn't make sense and is probably the
> > reason why you see very little selection of the min freq.
> 
> It is about mapping the entire [0,1] interval to the available frequency range.

Yeah, but I don't see why that makes sense..

> It will overprovision things (the smaller x the more), but then it may
> help the race-to-idle a bit in theory.

So, since we also have the cpuidle information, could we not make a
better guess at race-to-idle?

> > Suppose a machine with the following frequencies:
> >
> >         500, 750, 1000
> >
> > And a utilization of 0.4, how does asking for 500 + 0.4 * (1000-500) =
> > 700 make any sense? Per your point 1, it should be asking for
> > 0.4 * 1000 = 400.
> >
> > Because, per 1, at 500 it runs exactly half as fast as at 1000, and we
> > only need 0.4 times as much. Therefore 500 is more than sufficient.
> 
> OK, but then I don't see why this reasoning only applies to the lower
> bound of the frequency range.  Is there any reason why x = 1 should be
> the only point mapping to max_freq?

Well, everything that goes over the second to last freq would end up at
the last (max) freq.

Take again the 500,750,1000 example, everything that's >750 would end up
at 1000 (for relation_l, >875 for _c).
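To make the cut-off points concrete, here is a minimal sketch of the two
lookups against the 500,750,1000 table. The function names pick_freq_l()
and pick_freq_c() are made up for illustration and are not the cpufreq
core's helpers; the tie-break towards the lower entry in the "closest"
lookup is an assumption chosen to match the ">875" figure above.

```c
#include <stddef.h>

/* Example table from this message, in ascending order (units arbitrary). */
static const unsigned int freq_table[] = { 500, 750, 1000 };
#define NR_FREQS (sizeof(freq_table) / sizeof(freq_table[0]))

/* "relation_l" style: lowest table frequency at or above the target. */
static unsigned int pick_freq_l(unsigned int target)
{
	size_t i;

	for (i = 0; i < NR_FREQS; i++)
		if (freq_table[i] >= target)
			return freq_table[i];

	/* Target is above the whole table: clamp to the max entry. */
	return freq_table[NR_FREQS - 1];
}

/*
 * "relation_c" style: table frequency closest to the target.
 * Assumption: an exact tie resolves to the lower entry.
 */
static unsigned int pick_freq_c(unsigned int target)
{
	unsigned int best = freq_table[0];
	unsigned int best_dist = target > best ? target - best : best - target;
	size_t i;

	for (i = 1; i < NR_FREQS; i++) {
		unsigned int f = freq_table[i];
		unsigned int dist = target > f ? target - f : f - target;

		/* Strictly smaller distance only, so ties keep the lower entry. */
		if (dist < best_dist) {
			best = f;
			best_dist = dist;
		}
	}
	return best;
}
```

With this sketch a request of 751 already resolves to 1000 under the
"at or above" lookup, while the "closest" lookup stays at 750 up to and
including 875.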

But given the platform's cpuidle information, maybe coupled with an avg
idle est, we can compute the benefit of race-to-idle and over provision
based on that, right?

> If not, then I think it's reasonable to map the middle of the
> available frequency range to x = 0.5 and then we have b = 0 and a =
> (max_freq + min_freq) / 2.

So I really think that approach falls apart on the low util bits, you
effectively always run above min speed, even if min is already vastly
over provisioned.
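
The low-util behaviour can be sketched by putting the mapping from the
quoted patch next to a purely proportional one. This is only an
illustration: next_freq_affine() restates the quoted next_f expression,
while next_freq_proportional() and its clamp to min_f are hypothetical
and not taken from any posted patch.

```c
/* Mapping from the quoted patch: utilization [0, max] onto [min_f, max_f]. */
static unsigned long next_freq_affine(unsigned long util, unsigned long max,
				      unsigned long min_f, unsigned long max_f)
{
	return util > max ? max_f : min_f + util * (max_f - min_f) / max;
}

/*
 * Hypothetical alternative: request util/max of max_f, then raise the
 * result to min_f if it falls below the lowest available frequency.
 */
static unsigned long next_freq_proportional(unsigned long util, unsigned long max,
					    unsigned long min_f, unsigned long max_f)
{
	unsigned long f = util > max ? max_f : util * max_f / max;

	return f < min_f ? min_f : f;
}
```

For the 500..1000 example with util/max = 0.4, the first function asks
for 700, whereas the second computes 400 and therefore settles on the
minimum of 500. Both agree at util == max, where they return max_f.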