lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 28 Oct 2022 17:04:48 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Kajetan Puchalski <kajetan.puchalski@....com>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>, daniel.lezcano@...aro.org,
        lukasz.luba@....com, Dietmar.Eggemann@....com, dsmythies@...us.net,
        yu.chen.surf@...il.com, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional util-awareness

On Fri, Oct 28, 2022 at 5:01 PM Kajetan Puchalski
<kajetan.puchalski@....com> wrote:
>
> On Fri, Oct 28, 2022 at 03:12:43PM +0200, Rafael J. Wysocki wrote:
>
> > > The result being that this util-aware TEO variant while using much less
> > > C1 and decreasing the percentage of too deep sleeps from ~24% to ~3% in
> > > PCMark Web Browsing also uses almost 2% less power. Clearly the power is
> > > being wasted on not hitting C1 residency over and over.
> >
> > Hmm.  The PCMark Web Browsing table in your cover letter doesn't indicate that.
> >
> > The "gmean power usage" there for "teo + util-aware" is 205, whereas
> > for "teo" alone it is 187.8.  This is still arguably balanced by the
> > latency difference (~100 us vs ~185 us, respectively), but this looks
> > like trading energy for performance.
>
> In this case yes, I meant 2% less compared to menu but you're right of
> course.
>
> [...]
>
> > Definitely it should not be changed if the previous state is a polling
> > one which can be checked right away.  That would take care of the
> > "Intel case" automatically.
>
> Makes sense, I already used the polling flag to implement this in this other
> governor I mentioned.
>
> >
> > > Should make it much less intense for Intel systems.
> >
> > So I think that this adjustment only makes sense if the current
> > candidate state is state 1 and state 0 is not polling.  In the other
> > cases the cost of missing an opportunity to save energy would be too
> > high for the observed performance gain.
>
> Interesting, but only applying it to C1 and only when C0 isn't polling would
> make it effectively not do anything on Intel systems, right?

Indeed.

> From what I've seen on Doug's plots even C1 is hardly ever used on his platform, most
> sleeps end up in the deepest possible state.

That depends a lot on the workload.  There are workloads in which C1
is mostly used and the deeper idle states aren't.

> Checking for the polling flag is a good idea regardless so I can send a
> v3 with that. If you'd like me to also restrict the entire mechanism to
> only working on C1 as you suggested then I'm okay with including that in
> the v3 as well. What do you think?

It would be good to do that and see if there are any significant
differences in the results.

> Thanks a lot for all your time & input,

No problem at all.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ