Message-ID: <87r1jyaxum.ffs@nanos.tec.linutronix.de>
Date: Mon, 29 Mar 2021 15:33:21 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Len Brown <lenb@...nel.org>
Cc: "Chang S. Bae" <chang.seok.bae@...el.com>,
Borislav Petkov <bp@...e.de>,
Andy Lutomirski <luto@...nel.org>,
Ingo Molnar <mingo@...nel.org>, X86 ML <x86@...nel.org>,
"Brown\, Len" <len.brown@...el.com>,
Dave Hansen <dave.hansen@...el.com>,
"Liu\, Jing2" <jing2.liu@...el.com>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

On Mon, Mar 29 2021 at 09:14, Len Brown wrote:
> On Sat, Mar 20, 2021 at 6:14 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>>
>> On Sun, Feb 21 2021 at 10:56, Chang S. Bae wrote:
>> > +
>> > +/* Update MSR IA32_XFD with xfirstuse_not_detected() if needed. */
>> > +static inline void xdisable_switch(struct fpu *prev, struct fpu *next)
>> > +{
>> > +	if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled())
>> > +		return;
>> > +
>> > +	if (unlikely(prev->state_mask != next->state_mask))
>> > +		xdisable_setbits(xfirstuse_not_detected(next));
>> > +}
>>
>> So this is invoked on context switch. Toggling bit 18 of MSR_IA32_XFD
>> when it does not match. The spec document says:
>>
>> "System software may disable use of Intel AMX by clearing XCR0[18:17], by
>> clearing CR4.OSXSAVE, or by setting IA32_XFD[18]. It is recommended that
>> system software initialize AMX state (e.g., by executing TILERELEASE)
>> before doing so. This is because maintaining AMX state in a
>> non-initialized state may have negative power and performance
>> implications."
>>
>> I'm not seeing anything related to this. Is this a recommendation
>> which can be ignored or is that going to be duct taped into the code
>> base once the first user complains about slowdowns of their non AMX
>> workloads on that machine?
>
> I found the author of this passage, and he agreed to revise it to say this
> was targeted primarily at VMMs.

Why would this only be a problem for VMMs?

> "negative power and performance implications" refers to the fact that
> the processor will not enter C6 when AMX INIT=0, instead it will demote
> to the next shallower C-state, e.g. C1E.
>
> (this is because the C6 flow doesn't save the AMX registers)
>
> For customers that have C6 enabled, the inability of a core to enter C6
> may impact the maximum turbo frequency of other cores.

That's the same on bare metal, right?

Thanks,

        tglx
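
For readers following the thread, the behaviour under discussion can be sketched as a small user-space model in C. It shows the quoted context-switch check (touch IA32_XFD only when the two tasks' state masks differ), the spec's recommendation to initialize AMX state before setting IA32_XFD[18], and the C6 consequence Len describes. Everything named `sim_*` and `xfd_for_task()` is hypothetical and exists only for illustration; none of it is a kernel interface, and the model assumes the outgoing task's tile data has already been saved before it is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 18 is the AMX bit of IA32_XFD named in the quoted spec text. */
#define XFD_AMX_BIT18 (UINT64_C(1) << 18)

/* Hypothetical stand-in for per-CPU hardware state (not a kernel type). */
struct sim_cpu {
	uint64_t xfd;        /* simulated IA32_XFD value */
	bool amx_in_init;    /* true while AMX state is in its initial state */
	unsigned msr_writes; /* count of simulated MSR writes */
};

/* Hypothetical stand-in for the per-task fpu state_mask in the patch. */
struct sim_task {
	uint64_t state_mask; /* dynamic features the task has already used */
};

/* XFD bits to keep armed for a task: features it has not used yet. */
static uint64_t xfd_for_task(const struct sim_task *t)
{
	return XFD_AMX_BIT18 & ~t->state_mask;
}

/* Models the effect of TILERELEASE: AMX state returns to init. */
static void sim_tilerelease(struct sim_cpu *cpu)
{
	cpu->amx_in_init = true;
}

/*
 * Models the quoted xdisable_switch() check, plus the spec's recommendation.
 * Assumption: prev's tile data was saved elsewhere before this is called.
 */
static void sim_switch_to(struct sim_cpu *cpu, const struct sim_task *prev,
			  const struct sim_task *next)
{
	uint64_t want;

	assert(cpu && prev && next);

	/* Same mask on both sides: skip the MSR write entirely. */
	if (prev->state_mask == next->state_mask)
		return;

	want = xfd_for_task(next);

	/* About to set IA32_XFD[18]: initialize AMX state first. */
	if ((want & XFD_AMX_BIT18) && !cpu->amx_in_init)
		sim_tilerelease(cpu);

	cpu->xfd = want;
	cpu->msr_writes++;
}

/* Per the thread: the core does not enter C6 while AMX state is not init. */
static bool sim_can_enter_c6(const struct sim_cpu *cpu)
{
	return cpu->amx_in_init;
}
```

In this model, switching from a task that used AMX to one that has not releases the tile state before re-arming bit 18, so `sim_can_enter_c6()` holds afterwards. Nothing in it distinguishes a VMM from bare metal, which is the point of the question above.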