linux-kernel - Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional util-awareness

Message-ID: <Y1vum4BECMf2BXQW@e126311.manchester.arm.com>
Date:   Fri, 28 Oct 2022 16:00:43 +0100
From:   Kajetan Puchalski <kajetan.puchalski@....com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     daniel.lezcano@...aro.org, lukasz.luba@....com,
        Dietmar.Eggemann@....com, dsmythies@...us.net,
        yu.chen.surf@...il.com, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional
 util-awareness

On Fri, Oct 28, 2022 at 03:12:43PM +0200, Rafael J. Wysocki wrote:

> > The result being that this util-aware TEO variant while using much less
> > C1 and decreasing the percentage of too deep sleeps from ~24% to ~3% in
> > PCMark Web Browsing also uses almost 2% less power. Clearly the power is
> > being wasted on not hitting C1 residency over and over.
> 
> Hmm.  The PCMark Web Browsing table in your cover letter doesn't indicate that.
> 
> The "gmean power usage" there for "teo + util-aware" is 205, whereas
> for "teo" alone it is 187.8.  This is still arguably balanced by the
> latency difference (~100 us vs ~185 us, respectively), but this looks
> like trading energy for performance.

Yes, you're right about this case. I meant almost 2% less power compared
to menu, not compared to plain teo.

[...]

> Definitely it should not be changed if the previous state is a polling
> one which can be checked right away.  That would take care of the
> "Intel case" automatically.

Makes sense. I already used the polling flag to implement this in the other
governor I mentioned.

> 
> > Should make it much less intense for Intel systems.
> 
> So I think that this adjustment only makes sense if the current
> candidate state is state 1 and state 0 is not polling.  In the other
> cases the cost of missing an opportunity to save energy would be too
> high for the observed performance gain.

Interesting, but applying it only to C1, and only when C0 isn't polling,
would effectively make it do nothing on Intel systems, right? From what I've
seen on Doug's plots, even C1 is hardly ever used on his platform; most
sleeps end up in the deepest possible state.

Checking for the polling flag is a good idea regardless so I can send a
v3 with that. If you'd like me to also restrict the entire mechanism to
only working on C1 as you suggested then I'm okay with including that in
the v3 as well. What do you think?
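
To make the two options concrete, here is a rough standalone sketch of the
check being discussed. It is not the patch code: every name in it
(sketch_idle_state, SKETCH_FLAG_POLLING, sketch_util_adjust, the c1_only
switch) is made up for illustration, the polling flag only stands in for the
kernel's CPUIDLE_FLAG_POLLING, and it assumes the adjustment is "pick the
next shallower state when the CPU is utilized".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's CPUIDLE_FLAG_POLLING. */
#define SKETCH_FLAG_POLLING (1u << 0)

/* Hypothetical, minimal model of an idle state: only the flags matter here. */
struct sketch_idle_state {
	unsigned int flags;
};

static bool sketch_is_polling(const struct sketch_idle_state *s)
{
	return (s->flags & SKETCH_FLAG_POLLING) != 0;
}

/*
 * Return the state index to enter.
 *
 * candidate    - index already chosen by the governor's usual logic
 * cpu_utilized - result of the utilization check (assumed computed elsewhere)
 * c1_only      - if true, only adjust when the candidate is state 1
 */
static int sketch_util_adjust(const struct sketch_idle_state *states,
			      int count, int candidate,
			      bool cpu_utilized, bool c1_only)
{
	/* Nothing to do if the CPU is not utilized or there is no shallower state. */
	if (!cpu_utilized || candidate < 1 || candidate >= count)
		return candidate;

	/* Never step down into a polling state. */
	if (sketch_is_polling(&states[candidate - 1]))
		return candidate;

	/* Optional restriction: only ever replace state 1 with state 0. */
	if (c1_only && candidate != 1)
		return candidate;

	return candidate - 1;
}
```

With c1_only set and a polling state 0, no candidate passes both checks, so
the sketch never changes the selection. That is the "effectively not do
anything on Intel systems" case mentioned above.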

Thanks a lot for all your time & input,
Kajetan