lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20240619095312.hsessn7layxpeyzj@e127648.arm.com>
Date: Wed, 19 Jun 2024 10:53:12 +0100
From: Christian Loehle <christian.loehle@....com>
To: Ulf Hansson <ulf.hansson@...aro.org>
Cc: Qais Yousef <qyousef@...alina.io>, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org, rafael@...nel.org,
	vincent.guittot@...aro.org, peterz@...radead.org,
	daniel.lezcano@...aro.org, anna-maria@...utronix.de,
	kajetan.puchalski@....com, lukasz.luba@....com,
	dietmar.eggemann@....com
Subject: Re: [PATCH 1/6] cpuidle: teo: Increase util-threshold

On Mon, Jun 10, 2024 at 11:57:55AM +0200, Ulf Hansson wrote:
> On Mon, 10 Jun 2024 at 00:47, Qais Yousef <qyousef@...alina.io> wrote:
> >
> > On 06/06/24 10:00, Christian Loehle wrote:
> > > Increase the util-threshold by a lot as it was low enough for some
> > > minor load to always be active, especially on smaller CPUs.
> > >
> > > For small cap CPUs (Pixel6) the util threshold is as low as 1.
> > > For CPUs of capacity <64 it is 0. So ensure it is at a minimum, too.
> > >
> > > Fixes: 9ce0f7c4bc64 ("cpuidle: teo: Introduce util-awareness")
> > > Reported-by: Qais Yousef <qyousef@...alina.io>
> > > Reported-by: Vincent Guittot <vincent.guittot@...aro.org>
> > > Suggested-by: Kajetan Puchalski <kajetan.puchalski@....com>
> > > Signed-off-by: Christian Loehle <christian.loehle@....com>
> > > ---
> > >  drivers/cpuidle/governors/teo.c | 11 +++++------
> > >  1 file changed, 5 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
> > > index 7244f71c59c5..45f43e2ee02d 100644
> > > --- a/drivers/cpuidle/governors/teo.c
> > > +++ b/drivers/cpuidle/governors/teo.c
> > > @@ -146,13 +146,11 @@
> > >   * The number of bits to shift the CPU's capacity by in order to determine
> > >   * the utilized threshold.
> > >   *
> > > - * 6 was chosen based on testing as the number that achieved the best balance
> > > - * of power and performance on average.
> > > - *
> > >   * The resulting threshold is high enough to not be triggered by background
> > > - * noise and low enough to react quickly when activity starts to ramp up.
> > > + * noise.
> > >   */
> > > -#define UTIL_THRESHOLD_SHIFT 6
> > > +#define UTIL_THRESHOLD_SHIFT 2
> > > +#define UTIL_THRESHOLD_MIN 50
> > >
> > >  /*
> > >   * The PULSE value is added to metrics when they grow and the DECAY_SHIFT value
> > > @@ -671,7 +669,8 @@ static int teo_enable_device(struct cpuidle_driver *drv,
> > >       int i;
> > >
> > >       memset(cpu_data, 0, sizeof(*cpu_data));
> > > -     cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT;
> > > +     cpu_data->util_threshold = max(UTIL_THRESHOLD_MIN,
> > > +                             max_capacity >> UTIL_THRESHOLD_SHIFT);
> >
> > Thanks for trying to fix this. But I am afraid this is not a solution. There's
> > no magic number that can truly work here - we tried. As I tried to explain
> > before, a higher util value doesn't mean long idle time is unlikely. And
> > blocked load can cause problems where a decay can take too long.
> >
> > We are following up with the suggestions I have thrown back then and we'll
> > share results if anything actually works.
> >
> > For now, I think a revert is more appropriate. There was some perf benefit, but
> > the power regressions were bad and there's no threshold value that actually
> > works. The thresholding concept itself is incorrect and flawed - it seemed the
> > correct thing back then, yes. But in a hindsight now it doesn't work.
> >
>
> For the record, I fully agree with the above. A revert seems to be the
> best option in my opinion too.
>
> Besides for the above reasons; when using cpuidle-psci with PSCI OSI
> mode, the approach leads to disabling *all* of cluster's idle-states
> too, as those aren't even visible for the teo governor. I am sure,
> that was not the intent with commit 9ce0f7c4bc64.
>

Just for my understanding, what does OSI mode have to do with this?
My understanding is that util-awareness (and also intercepts) will
disable cluster idle for the entire cluster of the affected CPU.
Why would this not be the commit's intent? The assumption of
util-awareness is that this CPU will a) be woken up early anyway
and b) the exit latency of cluster idle will affect performance (of
the utilized CPU).

We can argue if the assumption of util-awareness is true, but I don't
quite follow your argument with OSI. The behavior is the same with PC
mode, isn't it?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ