[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4225E088-6D34-421A-91AA-E3C4A6517EB7@oracle.com>
Date: Wed, 6 Nov 2024 18:33:53 +0000
From: Aruna Ramakrishna <aruna.ramakrishna@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>,
"mingo@...hat.com"
<mingo@...hat.com>,
"dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: Rudi Horn <rudi.horn@...cle.com>, Joe Jin <joe.jin@...cle.com>,
Jeff Xu
<jeffxu@...omium.org>
Subject: [RFC] Restore PKRU to user-defined value after signal handling
Hello,
The following commit, which is part of 6.12 rc, does not work consistently on systems with AMD processors vs. Intel:
70044df250d0: x86/pkeys: Update PKRU to enable all pkeys before XSAVE
This zeroes out the pkeys in handle_signal() by calling sig_prepare_pkru():
/*
* Enable all pkeys temporarily, so as to ensure that both the current
* execution stack as well as the alternate signal stack are writeable.
* The application can use any of the available pkeys to protect the
* alternate signal stack, and we don't know which one it is, so enable
* all. The PKRU register will be reset to init_pkru later in the flow,
* in fpu__clear_user_states(), and it is the application's responsibility
* to enable the appropriate pkey as the first step in the signal handler
* so that the handler does not segfault.
*/
static inline u32 sig_prepare_pkru(void)
{
u32 orig_pkru = read_pkru();
write_pkru(0);
return orig_pkru;
}
The write_pkru(0) call seems to set xinuse[9] to 0 on systems with AMD CPUs (but not Intel), which means the user-defined PKRU value overwritten in the sigframe (in update_pkru_in_sigframe()) is not restored by XRSTOR and the PKRU value stays at 0 when it returns back to userspace. Which is unexpected.
AMD:
$ ./handler-pkru
startup pkru = 0x55555554
changed in main thread pkru = 0xfffffff0
received signal 10
in signal handler pkru = 0x55555554
after usr1 signal pkru = 0x00000000
…
xcr0 207
xcr0 AND xinuse 202
writing pkru 0
xcr0 207
xcr0 AND xinuse 2
Intel:
$ ./handler-pkru
startup pkru = 0x55555554
changed in main thread pkru = 0xfffffff0
received signal 10
in signal handler pkru = 0x55555554
after usr1 signal pkru = 0xfffffff0
…
xcr0 2E7
xcr0 AND xinuse 2A2
writing pkru 0
xcr0 2E7
xcr0 AND xinuse 2A2
From the Intel manual:
“
13.6 PROCESSOR TRACKING OF XSAVE-MANAGED STATE
The following notation describes the state of the init and modified optimizations:
• XINUSE denotes the state-component bitmap corresponding to the init optimization. If XINUSE[i] = 0, state component i is known to be in its initial configuration; otherwise XINUSE[i] = 1. It is possible for XINUSE[i] to be 1 even when state component i is in its initial configuration. On a processor that does not support the init optimization, XINUSE[i] is always 1 for every value of i.
...
• PKRU state. PKRU state is in its initial configuration if the value of the PKRU is 0.
...
13.8.1 Standard Form of XRSTOR
XRSTOR updates state component i based on the value of bit i in the XSTATE_BV field of the XSAVE header:
• If XSTATE_BV[i] = 0, the state component is set to its initial configuration. Section 13.6 specifies the initial configuration of each state component.
The initial configuration of state component 1 pertains only to the XMM registers and not to MXCSR. See below for the treatment of MXCSR
• If XSTATE_BV[i] = 1, the state component is loaded with data from the XSAVE area. See Section 13.5 for specifics for each state component and for details regarding mode-specific operation and operation determined by instruction prefixes. See Section 13.13 for details regarding faults caused by memory accesses.
“
The line “PKRU state is in its initial configuration if the value of the PKRU is 0” seems to imply that when the PKRU register is set to 0, xinuse[9] is also automatically set to 0 and that is expected behavior, which causes XRSTOR to not load the register value from XSAVE area. But we do not want xinuse[9] to be set to 0 here, as we want the PKRU value to be correctly restored from the sigframe - otherwise it becomes a security issue.
I’m not really sure of the correct way to reset xinuse[9] to 1 (after wrpkru(0)) - but something like this seems to work (thanks to Rudi Horn for both finding the issue and suggesting this patch):
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 1065ab995305..701a163f0ac5 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -68,9 +68,35 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
*/
static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
{
+ int err = 0;
+
if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
return 0;
- return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU));
+
+ if (pkru != 0) {
+ err = __put_user(pkru,
+ (unsigned int __user *)get_xsave_addr_user(
+ buf, XFEATURE_PKRU));
+ u64 xfeatures;
+ u64 __user *xfeaturesp = &buf->header.xfeatures;
+
+ err |= __get_user(xfeatures, xfeaturesp);
+
+ /*
+ * On some systems, when PKRU is set to 0, the corresponding
+ * XINUSE bit is also zeroed out, which causes XRSTOR to not
+ * load the register value from XSAVE area. Which means the
+ * PKRU value that was updated on the sigframe will be
+ * effectively discarded.
+ *
+ * Mark PKRU as in use so that it is restored correctly.
+ */
+ if (!err & !(xfeatures & XFEATURE_MASK_PKRU)) {
+ xfeatures |= XFEATURE_MASK_PKRU;
+ err |= __put_user(xfeatures, xfeaturesp);
+ }
+ }
}
I’ve tested a version of this patch on both AMD and Intel systems and it works.
Please let me know if this is acceptable, or if there’s a better way to do this.
Thanks,
Aruna
Powered by blists - more mailing lists