Message-ID: <cc87a7ae-4022-45fb-9ec9-c75c65d886c1@intel.com>
Date: Fri, 22 Nov 2024 16:10:40 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Aruna Ramakrishna <aruna.ramakrishna@...cle.com>,
linux-kernel@...r.kernel.org
Cc: x86@...nel.org, dave.hansen@...ux.intel.com, tglx@...utronix.de,
mingo@...nel.org, rudi.horn@...cle.com, joe.jin@...cle.com
Subject: Re: [PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU
is XRSTOR'd correctly
On 11/19/24 09:45, Aruna Ramakrishna wrote:
> PKRU value is not XRSTOR'd from the XSAVE area if the corresponding
> XSTATE_BV[i] bit is 0. A wrpkru(0) sets XSTATE_BV[PKRU] to 0 on AMD
> systems, which means the PKRU value updated on the sigframe later on,
> in update_pkru_in_sigframe(), is ignored.
>
> To make this behavior consistent across Intel and AMD systems, and to
> ensure that the PKRU value updated on the sigframe is always restored
> correctly, explicitly set XSTATE_BV[PKRU] to 1.
>
> Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE")
>
> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@...cle.com>
> Suggested-by: Rudi Horn <rudi.horn@...cle.com>
I still think this changelog needs quite a bit of work for someone to
make sense of this if they read it in a year. Perhaps:
--
When XSTATE_BV[i] is 0 and XRSTOR attempts to restore state component
'i', it ignores any value in the XSAVE buffer and instead restores the
state component's init value.
This means that if XSAVE writes XSTATE_BV[PKRU]=0 then XRSTOR will
ignore the value that update_pkru_in_sigframe() writes to the XSAVE buffer.
XSTATE_BV[PKRU] only gets written as 0 if PKRU is in its init state. On
Intel CPUs, this basically never happens because the kernel usually
overwrites the init value (aside: this is why we didn't notice this bug
until now). But on AMD, the init tracker is more aggressive and will
track PKRU as being in its init state upon any wrpkru(0x0).
Unfortunately, sig_prepare_pkru() does just that: wrpkru(0x0).
To fix this, always overwrite the sigframe XSTATE_BV with a value that
has XSTATE_BV[PKRU]==1. This ensures that XRSTOR will not ignore what
update_pkru_in_sigframe() wrote.
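
As a rough illustration of what "overwrite XSTATE_BV with PKRU set" means at the buffer level, here is a minimal userspace sketch. It is not the kernel patch: force_pkru_in_xstate_bv() and read_xstate_bv() are hypothetical helper names. It only relies on the architectural layout, where the XSAVE header starts at byte 512 of the XSAVE area, XSTATE_BV is its first 8 bytes, and PKRU is state component 9.

```c
#include <stdint.h>
#include <string.h>

#define XSAVE_HDR_OFFSET  512u  /* XSAVE header follows the 512-byte legacy region */
#define XFEATURE_PKRU_BIT 9u    /* PKRU is XSAVE state component 9 */

/* Hypothetical helper: read XSTATE_BV from a raw XSAVE area. */
static uint64_t read_xstate_bv(const unsigned char *xsave_area)
{
    uint64_t xstate_bv;

    memcpy(&xstate_bv, xsave_area + XSAVE_HDR_OFFSET, sizeof(xstate_bv));
    return xstate_bv;
}

/* Hypothetical helper: set XSTATE_BV[PKRU]=1, leaving all other bits alone. */
static void force_pkru_in_xstate_bv(unsigned char *xsave_area)
{
    uint64_t xstate_bv = read_xstate_bv(xsave_area);

    xstate_bv |= UINT64_C(1) << XFEATURE_PKRU_BIT;
    memcpy(xsave_area + XSAVE_HDR_OFFSET, &xstate_bv, sizeof(xstate_bv));
}
```

The kernel has to do the equivalent on the user sigframe with its own user-access helpers; the sketch only shows which bit ends up set.
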
The problematic sequence of events is something like this:

Userspace does:
	* wrpkru(0xffff0000) (or whatever)
	* Hardware sets: XINUSE[PKRU]=1
Signal happens, kernel is entered:
	* sig_prepare_pkru() => wrpkru(0x00000000)
	* Hardware sets: XINUSE[PKRU]=0 (aggressive AMD init tracker)
	* XSAVE writes most of XSAVE buffer, including
	  XSTATE_BV[PKRU]=XINUSE[PKRU]=0
	* update_pkru_in_sigframe() overwrites PKRU in XSAVE buffer
... signal handling
	* XRSTOR sees XSTATE_BV[PKRU]==0, ignores just-written value
	  from update_pkru_in_sigframe()
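
The sequence above can be replayed with a toy model in plain C. This is only a sketch under stated assumptions: struct toy_cpu, struct toy_xsave and every toy_* function are made up for illustration, only PKRU is modeled, and the "aggressive" flag stands in for the AMD init tracker behavior described above. It does not execute real XSAVE/XRSTOR instructions.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKRU_MASK (UINT64_C(1) << 9)  /* PKRU is XSAVE state component 9 */
#define PKRU_INIT 0u                  /* PKRU init value */

struct toy_cpu {
    uint32_t pkru;
    uint64_t xinuse;
    bool aggressive;     /* assumption: models the aggressive AMD init tracker */
};

struct toy_xsave {
    uint64_t xstate_bv;
    uint32_t pkru;
};

static void toy_wrpkru(struct toy_cpu *cpu, uint32_t val)
{
    cpu->pkru = val;
    if (val != PKRU_INIT)
        cpu->xinuse |= PKRU_MASK;
    else if (cpu->aggressive)
        cpu->xinuse &= ~PKRU_MASK;   /* wrpkru(0) is tracked as init state */
}

static void toy_xsave(const struct toy_cpu *cpu, struct toy_xsave *buf)
{
    buf->xstate_bv = cpu->xinuse & PKRU_MASK;
    if (buf->xstate_bv & PKRU_MASK)
        buf->pkru = cpu->pkru;       /* component is written only if in use */
}

static void toy_xrstor(struct toy_cpu *cpu, const struct toy_xsave *buf)
{
    if (buf->xstate_bv & PKRU_MASK) {
        cpu->pkru = buf->pkru;
        cpu->xinuse |= PKRU_MASK;
    } else {
        cpu->pkru = PKRU_INIT;       /* buffer value is ignored */
        cpu->xinuse &= ~PKRU_MASK;
    }
}

/* Returns the PKRU value userspace sees after the signal returns. */
static uint32_t toy_signal_roundtrip(bool aggressive, bool apply_fix)
{
    struct toy_cpu cpu = { .pkru = PKRU_INIT, .xinuse = 0, .aggressive = aggressive };
    struct toy_xsave frame = { 0 };
    uint32_t user_pkru;

    toy_wrpkru(&cpu, 0xffff0000u);       /* userspace */
    user_pkru = cpu.pkru;

    toy_wrpkru(&cpu, 0x00000000u);       /* sig_prepare_pkru() */
    toy_xsave(&cpu, &frame);             /* XSAVE to the sigframe */
    frame.pkru = user_pkru;              /* update_pkru_in_sigframe() */
    if (apply_fix)
        frame.xstate_bv |= PKRU_MASK;    /* force XSTATE_BV[PKRU]=1 */

    toy_xrstor(&cpu, &frame);            /* sigreturn */
    return cpu.pkru;
}
```

In this model, toy_signal_roundtrip(true, false) comes back with PKRU reset to its init value, while toy_signal_roundtrip(true, true) and the non-aggressive case both return the original 0xffff0000.
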
But otherwise, I think the code is fine:
Acked-by: Dave Hansen <dave.hansen@...ux.intel.com>
I can fix up the changelog at application time if everyone is OK with it.