Date:   Fri, 28 Oct 2022 17:08:26 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Kajetan Puchalski <kajetan.puchalski@....com>
Cc:     daniel.lezcano@...aro.org, lukasz.luba@....com,
        Dietmar.Eggemann@....com, dsmythies@...us.net,
        yu.chen.surf@...il.com, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 0/1] cpuidle: teo: Introduce optional util-awareness

On Fri, Oct 28, 2022 at 5:04 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Fri, Oct 28, 2022 at 5:01 PM Kajetan Puchalski
> <kajetan.puchalski@....com> wrote:
> >
> > On Fri, Oct 28, 2022 at 03:12:43PM +0200, Rafael J. Wysocki wrote:
> >
> > > > The result is that this util-aware TEO variant, while using much less
> > > > C1 and cutting the percentage of too-deep sleeps from ~24% to ~3% in
> > > > PCMark Web Browsing, also uses almost 2% less power. Clearly, power is
> > > > being wasted on repeatedly entering C1 without meeting its residency.
> > >
> > > Hmm.  The PCMark Web Browsing table in your cover letter doesn't indicate that.
> > >
> > > The "gmean power usage" there for "teo + util-aware" is 205, whereas
> > > for "teo" alone it is 187.8.  This is still arguably balanced by the
> > > latency difference (~100 us vs ~185 us, respectively), but this looks
> > > like trading energy for performance.
> >
> > In this case, yes, I meant 2% less compared to menu, but you're right, of
> > course.
> >
> > [...]
> >
> > > Definitely it should not be changed if the previous state is a polling
> > > one, which can be checked right away.  That would take care of the
> > > "Intel case" automatically.
> >
> > Makes sense; I already used the polling flag to implement this in the other
> > governor I mentioned.
> >
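For reference, a rough sketch of the check being discussed, assuming it is
applied in teo_select() right before the shift to the shallower state; "drv"
and "idx" are taken to be the cpuidle driver and the current candidate index,
so treat this as illustrative rather than the actual v3 hunk:

	/* Never shift down onto a polling state. */
	if (idx > 0 &&
	    !(drv->states[idx - 1].flags & CPUIDLE_FLAG_POLLING))
		idx--;
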
> > >
> > > > Should make it much less intense for Intel systems.
> > >
> > > So I think that this adjustment only makes sense if the current
> > > candidate state is state 1 and state 0 is not polling.  In the other
> > > cases the cost of missing an opportunity to save energy would be too
> > > high for the observed performance gain.
> >
> > Interesting, but only applying it to C1 and only when C0 isn't polling would
> > make it effectively not do anything on Intel systems, right?
>
> Indeed.
>
> > From what I've seen on Doug's plots, even C1 is hardly ever used on his
> > platform; most sleeps end up in the deepest possible state.
>
> That depends a lot on the workload.  There are workloads in which C1
> is mostly used and the deeper idle states aren't.
>
> > Checking for the polling flag is a good idea regardless, so I can send a
> > v3 with that. If you'd like me to also restrict the entire mechanism to
> > only working on C1, as you suggested, then I'm okay with including that in
> > the v3 as well. What do you think?
>
> It would be good to do that and see if there are any significant
> differences in the results.
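
To make the restriction above concrete, something along these lines (again a
sketch, assuming the same teo_select() context and a "cpu_data->utilized"
flag set earlier by the utilization check; the names are placeholders, not
the final patch):

	/*
	 * Only shift down when the CPU looks utilized, the candidate is
	 * state 1 and state 0 is not a polling state, so the adjustment
	 * stays inactive on platforms whose first state is POLL.
	 */
	if (cpu_data->utilized && idx == 1 &&
	    !(drv->states[0].flags & CPUIDLE_FLAG_POLLING))
		idx = 0;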

BTW, you may as well drop the extra #ifdeffery from the v3; I don't
think it is particularly useful.
