Date:   Mon, 22 Feb 2021 15:01:51 +0000
From:   Vincent Donnefort <vincent.donnefort@....com>
To:     Quentin Perret <qperret@...gle.com>
Cc:     peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, linux-kernel@...r.kernel.org,
        patrick.bellasi@...bug.net, valentin.schneider@....com
Subject: Re: [PATCH] sched/fair: Fix task utilization accountability in
 cpu_util_next()

On Mon, Feb 22, 2021 at 12:23:04PM +0000, Quentin Perret wrote:
> On Monday 22 Feb 2021 at 11:36:03 (+0000), Vincent Donnefort wrote:
> > Here's with real life numbers.
> > 
> > The task: util_avg=3 (1) util_est=11 (2)
> > 
> > pd0 (CPU-0, CPU-1, CPU-2)
> > 
> >  cpu_util_next(CPU-0, NULL): 7
> >  cpu_util_next(CPU-1, NULL): 3
> >  cpu_util_next(CPU-2, NULL): 0 <- Most capacity, try to place task here.
> > 
> >  cpu_util_next(CPU-2, task): 0 + 11 (2)
> > 
> > 
> > pd1 (CPU-3):
> > 
> >  cpu_util_next(CPU-3, NULL): 77
> > 
> >  cpu_util_next(CPU-3, task): 77 + 3 (1)
> > 
> > 
> > On pd0, the task contribution is 11. On pd1, it is 3.
> 
> Yes but that accurately reflects what the task's impact on the frequency
> selection of those CPUs would be if it was enqueued there, right?
> 
> This is an important property we should aim to keep, the frequency
> prediction needs to be in sync with the actual frequency request, or
> the energy estimate will be off.

You mean that it could lead to a wrong frequency estimation when doing
freq = map_util_freq() in em_cpu_energy()?

But in any case, the computed energy, being the product of sum_util and the
OPP's cost, is directly affected by this util_avg/util_est difference.

In the case where the task placement doesn't change the OPP, which is often the
case, we can simplify the comparison and end up with the following:

  delta_energy(CPU-3): OPP3 cost * (cpu_util_avg + task_util_avg - cpu_util_avg)
  delta_energy(CPU-2): OPP2 cost * (cpu_util_est + task_util_est - cpu_util_est)

  => OPP3 cost * task_util_avg < OPP2 cost * task_util_est

With the same example I described previously, if you apply the scaled OPP costs
of 0.76 for CPU-3 and 0.65 for CPU-2 (real life OPP scaled costs), we have:

  0.76 * 3 = 2.28 (CPU-3) < 0.65 * 11 = 7.15 (CPU-2)

The task is placed on CPU-3, while it would have been much more efficient to use
CPU-2.
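
To make the arithmetic above easier to replay, here is a minimal sketch of the
simplified comparison. It is not kernel code: it assumes the OPP is unchanged
by the placement, so the delta reduces to cost * task contribution, and the
struct and function names (placement, delta_energy, prefers) are made up for
illustration.

```c
/* Simplified model of the comparison, not the kernel implementation. */
struct placement {
	double opp_cost;  /* scaled OPP cost of the candidate CPU */
	double task_util; /* task contribution seen on that CPU */
};

/*
 * With an unchanged OPP, (cpu_util + task_util - cpu_util) * cost
 * reduces to task_util * cost.
 */
static double delta_energy(struct placement p)
{
	return p.opp_cost * p.task_util;
}

/* Returns 1 when candidate a looks cheaper than candidate b. */
static int prefers(struct placement a, struct placement b)
{
	return delta_energy(a) < delta_energy(b);
}
```

With { 0.76, 3 } for CPU-3 and { 0.65, 11 } for CPU-2, prefers() picks CPU-3.
If the same contribution is used on both sides (3 and 3, or 11 and 11), it
picks CPU-2 instead, since only the OPP costs then differ.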

> 
> > When computing the energy
> > deltas, pd0's is likely to be higher than pd1's, only because the task
> > contribution is higher for one comparison than the other.
> 
> You mean the contribution to sum_util right? I think I see what you mean
> but I'm still not sure if this really is an issue. This is how util_est
> works, and the EM stuff is just consistent with that.
> 
> The issue you describe can only happen (I think) when a rq's util_avg is
> larger than its util_est ewma by some margin (that has to do with the
> ewma-util_avg delta for the task?). But that means the ewma is not to be
> trusted to begin with, so ...

cfs_rq->avg.util_est.ewma is not used. cpu_util() only returns the max of
ue.enqueued and util_avg.
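
As a rough illustration of that max, here is a simplified model of how the
estimate behaves when a task is added. It is a sketch under assumptions, not
the kernel's code: the struct and helper names are hypothetical, and capacity
clamping is left out.

```c
/* Simplified model of the estimate discussed here, not the kernel's code. */
struct util_sketch {
	unsigned long util_avg;          /* running average */
	unsigned long util_est_enqueued; /* enqueued estimate (ue.enqueued) */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Utilization of a CPU if the task were added to it. The rq's ewma is
 * never consulted; only util_avg and the enqueued estimate are compared.
 */
static unsigned long cpu_util_next_sketch(struct util_sketch rq,
					  struct util_sketch task)
{
	unsigned long avg = rq.util_avg + task.util_avg;
	unsigned long est = rq.util_est_enqueued + task.util_est_enqueued;

	return max_ul(avg, est);
}
```

With the task from the example (util_avg=3, util_est=11), an idle CPU-2 gives
max(3, 11) = 11. For CPU-3 with util_avg=77, the result is 80 whenever its
enqueued estimate is at most 69. That value is not given in this thread, so
any number plugged in there is an assumption.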
