Message-ID: <3b6f8899-2a11-31da-f67c-57e786661ef3@intel.com>
Date: Tue, 8 Jun 2021 07:55:29 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: liangjs <liangjs@....edu.cn>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>
Subject: Re: arch_set_user_pkey_access only works on the current task_struct
On 6/7/21 8:16 PM, liangjs wrote:
> On Mon, 2021-06-07 at 10:52 -0700, Dave Hansen wrote:
>> On 6/5/21 6:10 AM, Jiashuo Liang wrote:
>>> I am learning the kernel implementation of the x86 PKU feature. I find the
>>> arch_set_user_pkey_access function in arch/x86/kernel/fpu/xstate.c does not
>>> use its first parameter. So it is perhaps a bug?
>> I wouldn't really call it a bug. But, yes, it is something we should
>> clean up.
> Should we remove the tsk parameter, or allow it to change the PKRU of tsk?
Probably just remove the parameter.
By the way, there's a big PKRU rework in progress. It might be best to
wait until the dust settles to poke at this.
> By the way, we are calling write_pkru, which changes both the CPU's PKRU
> and the xsave one. Why is this necessary?
PKRU affects kernel accesses to user memory. That means that you can't
run the *kernel* with an out-of-date PKRU, thus the write_pkru().
Returning to userspace blindly restores the *WHOLE* XSAVE buffer to the
registers. If you don't update the XSAVE buffer, the write_pkru() will
be overwritten before returning to userspace.
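
To make the two-copy point concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: struct model_task and the model_* functions are hypothetical names invented for illustration, and plain struct fields stand in for the live PKRU register and for the PKRU slot in the task's XSAVE buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names are kernel APIs. */
struct model_task {
    uint32_t cpu_pkru;   /* stands in for the live PKRU register */
    uint32_t xsave_pkru; /* stands in for PKRU in the XSAVE buffer */
};

/* Update only the live register; the saved copy goes stale. */
void model_write_register_only(struct model_task *t, uint32_t val)
{
    assert(t != NULL);
    t->cpu_pkru = val;
}

/* Update both copies, which is the behaviour described above. */
void model_write_both(struct model_task *t, uint32_t val)
{
    assert(t != NULL);
    t->cpu_pkru = val;
    t->xsave_pkru = val;
}

/* Return to userspace: the saved state is restored to the registers. */
void model_return_to_user(struct model_task *t)
{
    assert(t != NULL);
    t->cpu_pkru = t->xsave_pkru;
}
```

In this model, if both copies start at 0x30 and you write 0x0 with model_write_register_only(), the register is back at 0x30 after model_return_to_user(). With model_write_both() the 0x0 survives the return.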
> If I want to change PKRU of a task_struct other than current, do I still
> need to call __write_pkru?
No. You can't do that. Seriously.
The protection keys architecture really doesn't support off-thread
manipulation of PKRU. Imagine you want to mask some bits out of PKRU. You
do the following to make key 2 memory accessible and writable:
	reg = read_pkru();
	reg &= ~0x30;
	write_pkru(reg);
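
For reference, the 0x30 constant is just the pair of PKRU bits that belong to key 2: each key owns two adjacent bits, access-disable at bit 2*key and write-disable at bit 2*key+1. A small sketch of that arithmetic follows. The helper functions are hypothetical, written only for this example; the macro names are chosen to resemble the kernel's, but treat them as assumptions here.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_AD_BIT        0x1u /* access disable */
#define PKRU_WD_BIT        0x2u /* write disable */
#define PKRU_BITS_PER_PKEY 2

/* Hypothetical helper: both PKRU bits belonging to one key (0..15). */
uint32_t pkey_mask(int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return (PKRU_AD_BIT | PKRU_WD_BIT) << (pkey * PKRU_BITS_PER_PKEY);
}

/* Clear both bits: the key's memory becomes accessible and writable. */
uint32_t pkru_allow_all(uint32_t pkru, int pkey)
{
    return pkru & ~pkey_mask(pkey);
}

/* Set both bits: the key's memory becomes inaccessible. */
uint32_t pkru_deny_all(uint32_t pkru, int pkey)
{
    return pkru | pkey_mask(pkey);
}
```

Here pkey_mask(2) is 0x30 and pkey_mask(4) is 0x300, so opening up a key means ANDing with the inverted mask and closing it means ORing the mask in.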
Now, imagine that you tried to interrupt this poor task in the middle of
that operation. Let's say you try to *set* the bits for key 4, effectively:
pkru |= 0x300;
Now you try to do that key-4 business with an IPI.
	reg = read_pkru();              // reg=0x30, PKRU=0x30
	reg &= ~0x30;                   // reg=0x0, PKRU still 0x30
	-> IPI
		ipireg = read_pkru();   // ipireg=0x30
		ipireg |= 0x300;        // ipireg=0x330
		write_pkru(ipireg);     // PKRU=0x330
	write_pkru(reg);                // PKRU=0x0
You *LOST* the update from the IPI.
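
The same interleaving can be replayed in a small userspace simulation. This is only a sketch: a plain variable stands in for the PKRU register, the "IPI" is an ordinary function call placed between the task's read and its write, and every model_* name and the ipi_point enum are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t model_pkru; /* stands in for the PKRU register */

uint32_t model_read_pkru(void) { return model_pkru; }
void model_write_pkru(uint32_t val) { model_pkru = val; }

/* The "IPI": a complete read-modify-write that sets the key 4 bits. */
void model_ipi_set_key4(void)
{
    uint32_t ipireg = model_read_pkru();
    ipireg |= 0x300u;
    model_write_pkru(ipireg);
}

enum ipi_point { IPI_MID_RMW, IPI_AFTER_RMW };

/*
 * The task clears the key 2 bits. The IPI lands either between the
 * task's read and write, or after the write. Returns the final PKRU.
 */
uint32_t model_task_clear_key2(uint32_t initial, enum ipi_point point)
{
    assert(point == IPI_MID_RMW || point == IPI_AFTER_RMW);

    model_write_pkru(initial);

    uint32_t reg = model_read_pkru();
    reg &= ~0x30u;
    if (point == IPI_MID_RMW)
        model_ipi_set_key4();   /* its write is about to be clobbered */
    model_write_pkru(reg);
    if (point == IPI_AFTER_RMW)
        model_ipi_set_key4();   /* this ordering keeps both updates */

    return model_read_pkru();
}
```

Starting from 0x30, model_task_clear_key2(0x30, IPI_AFTER_RMW) ends at 0x300, while model_task_clear_key2(0x30, IPI_MID_RMW) ends at 0x0: the key 4 bits set by the "IPI" are gone.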