Message-ID: <CAJZ5v0jyPb7Qy81RKsuCmwQrbdGMCO71OtzxaCSnwY4f9T-SQw@mail.gmail.com>
Date: Tue, 26 Nov 2024 12:45:59 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, x86@...nel.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org, rafael.j.wysocki@...el.com, len.brown@...el.com,
artem.bityutskiy@...ux.intel.com, dave.hansen@...ux.intel.com,
peterz@...radead.org, tglx@...utronix.de, gautham.shenoy@....com
Subject: Re: [RFC PATCH v4 4/8] x86/smp: Allow calling mwait_play_dead with
arbitrary hint
On Tue, Nov 26, 2024 at 12:37 PM Patryk Wlazlyn
<patryk.wlazlyn@...ux.intel.com> wrote:
>
> >> The MWAIT instruction needs different hints on different CPUs to reach
> >> the deepest idle states. The current hint calculation* in
> >> mwait_play_dead() works in practice on current hardware, but it fails
> >> on recent hardware such as Intel's Sierra Forest, and possibly on some
> >> future platforms. Power efficiency on those newer CPUs suffers when a
> >> CPU is put offline.
> >>
> >> * The current calculation is the for loop inspecting edx in
> >> mwait_play_dead()
> >>
> >> The current implementation in mwait_play_dead() looks up the mwait hint
> >> for the deepest cstate by inspecting CPUID leaf 0x5 and deriving the
> >> hint from the number of substates reported for each cstate. This
> >> approach relies on the hints associated with those substates being
> >> contiguous in the range [0, NUM_SUBSTATES-1]. That continuity is not
> >> documented and is not met on recent Intel platforms.
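
For reference, the calculation being described is roughly of the following
shape (a simplified, standalone sketch of the for loop over EDX, not the
exact kernel code; the constants match the kernel's MWAIT_SUBSTATE_* values):

    #define MWAIT_SUBSTATE_SIZE    4
    #define MWAIT_SUBSTATE_MASK    0xf

    /* Derive an MWAIT hint from the per-cstate substate counts in leaf 0x5 EDX. */
    unsigned int hint_from_leaf5_edx(unsigned int edx)
    {
        unsigned int highest_cstate = 0, highest_subcstate = 0;
        int i;

        edx >>= MWAIT_SUBSTATE_SIZE;    /* skip the C0 field */
        for (i = 0; i < 7 && edx; i++, edx >>= MWAIT_SUBSTATE_SIZE) {
            if (edx & MWAIT_SUBSTATE_MASK) {
                highest_cstate = i;
                highest_subcstate = edx & MWAIT_SUBSTATE_MASK;
            }
        }

        /* Assumes substate hints are contiguous, so the deepest one is count - 1. */
        return (highest_cstate << MWAIT_SUBSTATE_SIZE) |
               (highest_subcstate - 1);
    }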
> >>
> >> For example, Intel's Sierra Forest reports two cstates with two
> >> substates each in CPUID leaf 0x5:
> >>
> >> Name* target cstate target subcstate (mwait hint)
> >> ===========================================================
> >> C1 0x00 0x00
> >> C1E 0x00 0x01
> >>
> >> -- 0x10 ----
> >>
> >> C6S 0x20 0x22
> >> C6P 0x20 0x23
> >>
> >> -- 0x30 ----
> >>
> >> /* No more (sub)states all the way down to the end. */
> >> ===========================================================
> >>
> >> * Names of the cstates are not included in CPUID leaf 0x5; they are
> >> taken from the product-specific documentation.
> >>
> >> Notice that hints 0x20 and 0x21 are skipped entirely for target cstate
> >> 0x20 (C6), which is what breaks the current CPUID leaf 0x5 based
> >> algorithm.
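
Plugging the Sierra Forest numbers from the table above into the sketch
earlier in this message shows the failure concretely (the EDX value below is
derived from the table: two substates in the hint-nibble-0 field, two in the
hint-nibble-2 field, nothing else):

    hint_from_leaf5_edx(0x2020):
        highest_cstate    = 2
        highest_subcstate = 2
        returned hint     = (2 << 4) | (2 - 1) = 0x21

    0x21 is one of the skipped hints, while the deepest real state (C6P)
    needs hint 0x23.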
> >>
> >> Allow the cpuidle code to provide the mwait play_dead loop with a known
> >> mwait hint for the deepest idle state on a given platform, skipping the
> >> CPUID-based calculation.
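
Roughly speaking, the resulting interface could look like this (function
names here are illustrative, not necessarily the ones used in the patch):

    /* Bottom half of mwait_play_dead(): monitor/mwait loop on an explicit hint. */
    void mwait_play_dead_with_hint(unsigned int eax_hint);

    /* Existing entry point keeps the CPUID leaf 0x5 based guess as a fallback. */
    void mwait_play_dead(void)
    {
        unsigned int eax_hint = mwait_hint_from_cpuid_leaf5(); /* hypothetical helper */

        mwait_play_dead_with_hint(eax_hint);
    }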
> >>
> >> Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>
> >
> > I'm going to risk saying that the changelog doesn't match the code
> > changes in the patch any more.
> >
> > The code changes are actually relatively straightforward: The bottom
> > half of mwait_play_dead() is split off, so it can be called from
> > multiple places.
> >
> > The other places from which to call it are cpuidle drivers
> > implementing :enter_dead() callbacks that may want to use MWAIT as the
> > idle state entry method. The ACPI processor_idle driver and
> > intel_idle will be updated by subsequent patches to do so.
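
For example, a driver-side :enter_dead() implementation could then boil down
to something like the following (a sketch only: the helper name
mwait_play_dead_with_hint() is illustrative, the exact callback prototype is
whatever cpuidle expects, and flg2MWAIT() is intel_idle's existing
flag-to-hint macro):

    static int intel_idle_enter_dead(struct cpuidle_device *dev, int index)
    {
        struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
        unsigned long eax_hint = flg2MWAIT(drv->states[index].flags);

        mwait_play_dead_with_hint(eax_hint);

        return 0;    /* not reached if the CPU stays in the final idle state */
    }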
> >
> > The reason for it is mostly consistency: If the cpuidle driver uses a
> > specific idle state for things like suspend-to-idle, it is better to
> > let it decide what idle state to use for "play_dead" because it may
> > know better.
> >
> > Another reason is what mwait_play_dead() does to determine the MWAIT
> > argument (referred to as a "hint"), but IMO it belongs in a changelog
> > of a later patch because this one doesn't actually do anything about
> > it. In fact, it is not expected to change the behavior of the code.
>
> The commit message here is meant to justify the change. I thought that
> giving some context was important. Do you suggest moving this to a
> different commit, or not mentioning the SRF and C6 states at all?
I would move it to the patch that actually makes a difference for SRF.
This one doesn't do that.