[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241112145618.GR22801@noisy.programming.kicks-ass.net>
Date: Tue, 12 Nov 2024 15:56:18 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
rafael.j.wysocki@...el.com, len.brown@...el.com,
artem.bityutskiy@...ux.intel.com, dave.hansen@...ux.intel.com,
gautham.shenoy@....com
Subject: Re: [PATCH v3 2/3] x86/smp native_play_dead: Prefer
cpuidle_play_dead() over mwait_play_dead()
On Tue, Nov 12, 2024 at 02:23:14PM +0100, Rafael J. Wysocki wrote:
> On Tue, Nov 12, 2024 at 12:47 PM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Fri, Nov 08, 2024 at 01:29:08PM +0100, Patryk Wlazlyn wrote:
> > > The generic implementation, based on cpuid leaf 0x5, for looking up the
> > > mwait hint for the deepest cstate, depends on them to be continuous in
> > > range [0, NUM_SUBSTATES-1]. While that is correct on most Intel x86
> > > platforms, it is not architectural and may not result in reaching the
> > > most optimized idle state on some of them.
> > >
> > > Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and
> > > fallback to the later in case of missing enter_dead() handler.
> > >
> > > Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>
> > > ---
> > > arch/x86/kernel/smpboot.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > index 44c40781bad6..721bb931181c 100644
> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -1416,9 +1416,9 @@ void native_play_dead(void)
> > > play_dead_common();
> > > tboot_shutdown(TB_SHUTDOWN_WFS);
> > >
> > > - mwait_play_dead();
> > > if (cpuidle_play_dead())
> > > - hlt_play_dead();
> > > + mwait_play_dead();
> > > + hlt_play_dead();
> > > }
> >
> > Yeah, I don't think so. we don't want to accidentally hit
> > acpi_idle_play_dead().
>
> Having inspected the code once again, I'm not sure what your concern is.
>
> :enter.dead() is set to acpi_idle_play_dead() for all states in ACPI
> idle - see acpi_processor_setup_cstates() and the role of the type
> check is to filter out bogus table entries (the "type" must be 1, 2,
> or 3 as per the spec).
>
> Then cpuidle_play_dead() calls drv->states[i].enter_dead() for the
> deepest state where it is set and if this is FFH,
> acpi_idle_play_dead() will return an error. So after the change, the
> code above will fall back to mwait_play_dead() then.
>
> Or am I missing anything?
So it relies on there being a C2/C3 state enumerated and that being FFh.
Otherwise it will find a 'working' state and we're up a creek.
Typically I expect C2/C3 FFh states will be there on Intel stuff, but it
seems awefully random to rely on this hole. AMD might unwittinly change
the ACPI driver (they're the main user) and then we'd be up a creek.
Robustly we'd teach the ACPI driver about FFh and set enter_dead on
every state -- but we'd have to double check that with AMD.
At the same time, intel_idle should then also set enter_dead on all
states.
And then the mwait case is only ever reached if CPUIDLE=n.
Powered by blists - more mailing lists