[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0ihEvOuUWoN++5qYx=KqrRnm4np7_j6hDd=aFnHkkFYZg@mail.gmail.com>
Date: Tue, 12 Nov 2024 16:00:21 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
rafael.j.wysocki@...el.com, len.brown@...el.com,
artem.bityutskiy@...ux.intel.com, dave.hansen@...ux.intel.com,
gautham.shenoy@....com
Subject: Re: [PATCH v3 2/3] x86/smp native_play_dead: Prefer
cpuidle_play_dead() over mwait_play_dead()
On Tue, Nov 12, 2024 at 3:56 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Nov 12, 2024 at 02:23:14PM +0100, Rafael J. Wysocki wrote:
> > On Tue, Nov 12, 2024 at 12:47 PM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Fri, Nov 08, 2024 at 01:29:08PM +0100, Patryk Wlazlyn wrote:
> > > > The generic implementation, based on cpuid leaf 0x5, for looking up the
> > > > mwait hint for the deepest cstate, depends on them to be continuous in
> > > > range [0, NUM_SUBSTATES-1]. While that is correct on most Intel x86
> > > > platforms, it is not architectural and may not result in reaching the
> > > > most optimized idle state on some of them.
> > > >
> > > > Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and
> > > > fallback to the later in case of missing enter_dead() handler.
> > > >
> > > > Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>
> > > > ---
> > > > arch/x86/kernel/smpboot.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > > index 44c40781bad6..721bb931181c 100644
> > > > --- a/arch/x86/kernel/smpboot.c
> > > > +++ b/arch/x86/kernel/smpboot.c
> > > > @@ -1416,9 +1416,9 @@ void native_play_dead(void)
> > > > play_dead_common();
> > > > tboot_shutdown(TB_SHUTDOWN_WFS);
> > > >
> > > > - mwait_play_dead();
> > > > if (cpuidle_play_dead())
> > > > - hlt_play_dead();
> > > > + mwait_play_dead();
> > > > + hlt_play_dead();
> > > > }
> > >
> > > Yeah, I don't think so. we don't want to accidentally hit
> > > acpi_idle_play_dead().
> >
> > Having inspected the code once again, I'm not sure what your concern is.
> >
> > :enter.dead() is set to acpi_idle_play_dead() for all states in ACPI
> > idle - see acpi_processor_setup_cstates() and the role of the type
> > check is to filter out bogus table entries (the "type" must be 1, 2,
> > or 3 as per the spec).
> >
> > Then cpuidle_play_dead() calls drv->states[i].enter_dead() for the
> > deepest state where it is set and if this is FFH,
> > acpi_idle_play_dead() will return an error. So after the change, the
> > code above will fall back to mwait_play_dead() then.
> >
> > Or am I missing anything?
>
> So it relies on there being a C2/C3 state enumerated and that being FFh.
> Otherwise it will find a 'working' state and we're up a creek.
>
> Typically I expect C2/C3 FFh states will be there on Intel stuff, but it
> seems awefully random to rely on this hole. AMD might unwittinly change
> the ACPI driver (they're the main user) and then we'd be up a creek.
>
> Robustly we'd teach the ACPI driver about FFh and set enter_dead on
> every state -- but we'd have to double check that with AMD.
>
> At the same time, intel_idle should then also set enter_dead on all
> states.
>
> And then the mwait case is only ever reached if CPUIDLE=n.
So that's why I would prefer intel_idle, if configured, to give
mwait_play_dead() a hint on the MWAIT hint to use. Otherwise the
latter would just fall back to the current method.
This would not be bullet-proof, but it would take the opportunity to
work better if it could.
Powered by blists - more mailing lists