Message-ID: <CAJZ5v0iYYYpg7MDf8_UmoUuzyiPMoPdjgSJmdBXGYCxVc4icWw@mail.gmail.com>
Date: Tue, 12 Nov 2024 15:56:22 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>, artem.bityutskiy@...ux.intel.com
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Patryk Wlazlyn <patryk.wlazlyn@...ux.intel.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
rafael.j.wysocki@...el.com, len.brown@...el.com, dave.hansen@...ux.intel.com
Subject: Re: [PATCH v3 2/3] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()

On Tue, Nov 12, 2024 at 2:50 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Nov 12, 2024 at 01:30:29PM +0100, Rafael J. Wysocki wrote:
>
> > > > Then we are back to the original approach though:
> > > >
> > > > https://lore.kernel.org/linux-pm/20241029101507.7188-3-patryk.wlazlyn@linux.intel.com/
> > >
> > > Well, that won't be brilliant for hybrid systems where the available
> > > states are different per CPU.
> >
> > But they aren't.
> >
> > At least so far that has not been the case on any platform known to me
> > and I'm not aware of any plans to make that happen (guess what, some
> > other OSes may be unhappy).
>
> Well, that's something at least.
>
> > > Also, all of this is a bit of a trainwreck... AFAICT AMD wants IO based
> > > idle (per the 2018 commit). So they want the ACPI thing.
> >
> > Yes.
> >
> > > But on Intel we really don't want HLT, and had that MWAIT, but that has
> > > real problems with KEXEC. And I don't think we can rely on INTEL_IDLE=y.
> >
> > We could because it handles ACPI now and ACPI idle doesn't add any
> > value on top of it except for the IO-based idle case.
>
> You're saying we can mandate INTEL_IDLE=y? Because currently defconfig
> doesn't even have it on.

It is conceivable.

> > > The ACPI thing doesn't support FFh states for its enter_dead(), should it?
> >
> > It does AFAICS, but the FFH is still MWAIT.
>
> What I'm trying to say is that acpi_idle_play_dead() doesn't seem to
> support FFh and as such won't ever use MWAIT.

Right, but if it finds an FFH state deeper than C1, it will fall back
to the next play_dead method.
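
To make that fallback concrete, here is a rough user-space model in C. It is only a sketch: every name in it (entry_method, model_enter_dead(), model_play_dead()) is made up for illustration and none of it is the kernel code. It assumes the deepest state is tried first and that an entry method other than HLT or I/O-port entry is reported as a failure.

```c
#include <stddef.h>

/* Hypothetical model types; these are not the kernel's definitions. */
enum entry_method { ENTRY_HALT, ENTRY_SYSTEMIO, ENTRY_FFH };

enum dead_path { PATH_CPUIDLE, PATH_NEXT_METHOD };

/*
 * Model of an enter_dead() callback that only knows HLT and I/O-port
 * entry: it reports failure (-1) for an FFH (MWAIT) state.
 */
static int model_enter_dead(enum entry_method m)
{
    return (m == ENTRY_HALT || m == ENTRY_SYSTEMIO) ? 0 : -1;
}

/*
 * Try the deepest state (last array element). If there is no state or
 * its entry method is not supported, tell the caller to move on to the
 * next play_dead method.
 */
static enum dead_path model_play_dead(const enum entry_method *states, size_t n)
{
    if (n == 0 || model_enter_dead(states[n - 1]) != 0)
        return PATH_NEXT_METHOD;
    return PATH_CPUIDLE;
}
```

With a state list ending in ENTRY_SYSTEMIO the model stays on the cpuidle path, while a list ending in ENTRY_FFH yields PATH_NEXT_METHOD, which is the case described above.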
> > > Anyway, ideally x86 would grow a new instruction to offline a CPU, both
> > > MWAIT and HLT have problems vs non-maskable interrupts.
> > >
> > > I really don't know what is best here, maybe moving that whole CPUID
> > > loop to boot, store the value in a per-cpu mwait_play_dead_hint. Have
> > > AMD explicitly clear the value, and avoid mwait when 0 -- hint 0 is
> > > equal to HLT anyway.
> > >
> > > But as said, we need a new instruction.
> >
> > Before that, there is the problem with the MWAIT hint computation in
> > mwait_play_dead() and in fact intel_idle does know what hint to use in
> > there.
>
> But we need to deal with INTEL_IDLE=n.

Then the code would do what it is doing today, as a matter of fallback.

> Also, I don't see any MWAIT_LEAF
> parsing in intel_idle.c. Yes, it requests the information, but then it
> mostly ignores it -- it only consumes two ECX bits or so.
>
> I don't see it finding a max-cstate from mwait_substates anywhere.

No, it gets this either from _CST or from a built-in table for the
given processor model.

> So given we don't have any such code, why can't we simply fix the cstate
> parsing we have in mwait_play_dead() and call it a day?

I'll leave this one to Artem, but there is at least one reason to
avoid doing that I know about: There is no guarantee that whatever has
been found was actually validated.
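
For reference, the kind of hint derivation being discussed can be sketched in user space as below. This is an approximation and not the mwait_play_dead() source: the function name and macro names are made up, the ECX/EDX values are passed in rather than read with CPUID, and the early return when no sub-state is enumerated is an addition of this sketch. It walks the 4-bit sub-state counts in CPUID leaf 5 EDX and builds a hint from the highest populated field, which says nothing about whether that state was validated on the platform.

```c
#include <stdint.h>

#define SUBSTATE_BITS  4u    /* CPUID.05H:EDX holds one 4-bit count per field */
#define SUBSTATE_MASK  0xfu
#define ECX_EXTENSIONS 0x1u  /* CPUID.05H:ECX bit 0: EDX enumeration is valid */

/* Hypothetical helper: derive an MWAIT hint from raw leaf 5 ECX/EDX values. */
static uint32_t mwait_hint_from_leaf5(uint32_t ecx, uint32_t edx)
{
    uint32_t highest_cstate = 0, highest_sub = 0;

    if (!(ecx & ECX_EXTENSIONS))
        return 0;

    edx >>= SUBSTATE_BITS;  /* skip the first (C0) field */
    for (uint32_t i = 0; i < 7 && edx; i++, edx >>= SUBSTATE_BITS) {
        if (edx & SUBSTATE_MASK) {
            highest_cstate = i;
            highest_sub = edx & SUBSTATE_MASK;
        }
    }

    /* Sketch-only guard: nothing enumerated, so fall back to hint 0. */
    if (highest_sub == 0)
        return 0;

    return (highest_cstate << SUBSTATE_BITS) | (highest_sub - 1);
}
```

For example, an EDX value of 0x2220 (two sub-states in each of the three fields after C0) gives hint 0x21 here. Whether such a hint is actually usable on a given part is exactly what the enumeration alone does not tell us.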