linux-kernel - Re: [RFC PATCH v4 0/6] sched/cpufreq: Make schedutil energy aware


Message-ID: <20200214125249.GL14879@hirez.programming.kicks-ass.net>
Date:   Fri, 14 Feb 2020 13:52:49 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Douglas Raillard <douglas.raillard@....com>
Cc:     linux-kernel@...r.kernel.org, rjw@...ysocki.net,
        viresh.kumar@...aro.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        qperret@...gle.com, linux-pm@...r.kernel.org
Subject: Re: [RFC PATCH v4 0/6] sched/cpufreq: Make schedutil energy aware

On Thu, Feb 13, 2020 at 05:49:48PM +0000, Douglas Raillard wrote:
> On 2/10/20 1:21 PM, Peter Zijlstra wrote:

> > assuming cs[].cost ~ f^3, and given our cost_margin ~ f, that leaves a
> > factor f^2 on the table.
> 
> I'm guessing that you arrived at `cost_margin ~ f` this way:
> 
> cost_margin = util - util_est_enqueued
> cost_margin = util - constant
> 
> # with constant small enough
> cost_margin ~ util
> 
> # with util ~ 1/f
> cost_margin ~ 1/f
> 
> In the case you describe, `constant` is actually almost equal to `util`
> so `cost_margin !~ util`, and that series assumes frequency-invariant
> util_avg so `util !~ 1/f` (I'll probably have to fix that).

Nah, perhaps already clear from the other email; but it goes like:

  boost = util_avg - util_est
  cost_margin = boost * C = C * util_avg - C * util_est

And since u ~ f (per schedutil construction), cost_margin is a function
linear in either u or f.
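
To make the scaling concrete, here is a minimal Python sketch, not code from the series. It assumes frequency and utilization are normalized to [0, 1], takes cost(f) = f^3 as assumed earlier in the thread, and uses hypothetical names (`cost_margin`, `boosted_freq`) and a hypothetical constant `c`.

```python
def cost_margin(util_avg, util_est, c=1.0):
    """Margin linear in the boost: C * util_avg - C * util_est.

    `c` is a hypothetical scaling constant, not a value from the patches.
    """
    boost = util_avg - util_est
    return c * boost


def boosted_freq(base_freq, margin):
    """Highest normalized frequency whose cost fits in base_cost + margin.

    Assumes cost(f) = f**3, so the inverse is a cube root. A real OPP
    table is discrete; this continuous model is only an illustration.
    """
    allowed_cost = base_freq ** 3 + margin
    return allowed_cost ** (1.0 / 3.0)


# Same margin applied at a low and at a high base frequency.
margin = cost_margin(util_avg=0.3, util_est=0.2)
gain_low = boosted_freq(0.2, margin) - 0.2
gain_high = boosted_freq(0.8, margin) - 0.8
```

Under these assumptions the same margin buys a much larger frequency step from a low base frequency than from a high one, which is the effect being discussed.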

> > So the higher the min_freq, the less effective the boost.
> 
> Yes, since the boost is allowing a fixed amount of extra power. Higher
> OPPs are less efficient than lower ones, so if min_freq is high, we
> won't speed up as much as if min_freq was low.
> 
> > Maybe it all works out in practice, but I'm missing a big picture
> 
> Here is a big picture :)
> 
> https://gist.github.com/douglas-raillard-arm/f76586428836ec70c6db372993e0b731#file-ramp_boost-svg
> 
> The board is a Juno R0, with a periodic task pinned on a big CPU
> (capa=1024):
> * phase 1:  5% duty cycle (=51 PELT units)
> * phase 2: 75% duty cycle (=768 PELT units)
> 
> Legend:
> * blue square wave: when the task executes (like in kernelshark)
> * base_cost = cost of frequency as selected by schedutil in normal
> operations
> * allowed_cost = base_cost + cost_margin
> * util = util_avg
> 
> note: the small gaps right after the duty cycle transition between
> t=4.15 and 4.25 are due to sugov task executing, so there is no dequeue
> and no util_est update.

I'm confused by the giant drop in frequency (blue line) around 4.18.

schedutil shouldn't select f < max(util_avg, util_est), which is
violated right about there.
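
A rough way to check that invariant against trace samples, as a Python sketch. It assumes frequency-invariant utilization on the 0..1024 capacity scale from the quoted setup; the helper names and the frequencies used in any call are hypothetical, not taken from the plot.

```python
MAX_CAPACITY = 1024  # capacity of the big CPU in the quoted setup (capa=1024)


def duty_to_util(duty_cycle, capacity=MAX_CAPACITY):
    """Steady-state utilization for a duty cycle, truncated to PELT units."""
    return int(duty_cycle * capacity)


def below_util_floor(freq, max_freq, util_avg, util_est,
                     capacity=MAX_CAPACITY):
    """True if `freq` provides less capacity than max(util_avg, util_est).

    The frequency is scaled linearly to capacity, which is an
    assumption of this sketch (frequency-invariant utilization).
    """
    freq_capacity = freq * capacity / max_freq
    return freq_capacity < max(util_avg, util_est)
```

For example, `below_util_floor(f, f_max, util_avg, util_est)` can be evaluated for each sample in a trace, and any `True` marks a point where the selected frequency dips under the floor.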

I'm also confused by the base_cost line; how can that be flat until
somewhere around 4.16? Sadly there is no line for pure schedutil freq to
compare against.

Other than that, I can see the green line is consistent with
util_avg>util_est, and how it helps grow the frequency (blue).