Message-ID: <c793e1d0-e508-4cf5-a18b-29d30d5e401f@intel.com>
Date: Fri, 21 Feb 2025 12:50:56 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Dmitry Vyukov <dvyukov@...gle.com>, peterz@...radead.org,
boqun.feng@...il.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, aruna.ramakrishna@...cle.com,
elver@...gle.com
Cc: "Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] rseq: Make rseq work with protection keys

On 2/21/25 12:05, Mathieu Desnoyers wrote:
> On 2025-02-21 14:48, Dave Hansen wrote:
>> On 2/21/25 11:38, Mathieu Desnoyers wrote:
>>> I agree that switching to permissive key in the fast path would be
>>> simpler. AFAIU, the switch_to_permissive_pkey_reg() is only a pkey
>>> read when the key is already permissive.
>>
>> Unfortunately, on x86, PKRU is almost never in its permissive state. We
>> chose a policy (stored in the global init_pkru_value variable) that
>> allows R/W access to pkey 0, but disables access to everything else.
>> It's 0xfffffff5, IIRC.
>>
>> This ensures deny-by-default behavior and ensures that threads cloned
>> off long ago don't have a dangerous PKRU value for newly-allocated and
>> pkey-protected memory.
>>
>> If I had a time machine, it'd be interesting to go back and try to make
>> PKRU's default value be all 0's and also represent the logically most
>> restrictive value.
>
> Can we assume (or require) that struct rseq and struct rseq_cs reside in
> pkey-0 memory ?

Maybe. Signal stacks are _practically_ only able to use pkey-0. You can
technically protect them with anything you want and then WRPKRU as the
first instruction once you hop into the signal handler (since
instruction fetches aren't affected by x86 pkeys), but I seriously doubt
anybody would go to the trouble.

> In that case, we could add something to the pkey API that switches to a
> permissive state only if pkey 0 cannot be accessed.
>
> Therefore it would only trigger a pkey read in the common case, and
> issue a pkey write only if pkey 0 is not accessible.

I think that's a sane policy. An rseq access can happen at any time
(from the app's perspective) so the access would theoretically be done
with a random PKRU value from a random point in the thread's lifetime.
But it is a different policy than the one we've chosen for signals and
"remote" accesses, which is to just ignore pkeys entirely.
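
For concreteness, the "read first, write only if pkey 0 is inaccessible"
idea could be sketched roughly as below. This is a user-space model, not
kernel code: the register is a plain variable standing in for
RDPKRU/WRPKRU, and every model_* / pkru_* name is hypothetical and does
not correspond to an existing kernel helper. The deny-by-default value is
assumed to set only the access-disable bit for pkeys 1 through 15.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PKRU layout: 16 pkeys, two bits each, in one 32-bit register. */
#define PKRU_BITS_PER_PKEY 2u
#define PKRU_AD_BIT        0x1u  /* access disable */
#define PKRU_WD_BIT        0x2u  /* write disable  */
#define NR_PKEYS           16u

/* Hypothetical stand-in for the per-thread PKRU register. */
static uint32_t model_pkru;
/* Counts simulated register writes, the expensive operation. */
static unsigned int model_pkru_writes;

static uint32_t model_read_pkru(void)
{
	return model_pkru;
}

static void model_write_pkru(uint32_t val)
{
	model_pkru = val;
	model_pkru_writes++;
}

/* Both permission bits belonging to one pkey. */
static uint32_t pkru_key_mask(unsigned int pkey)
{
	assert(pkey < NR_PKEYS);
	return (PKRU_AD_BIT | PKRU_WD_BIT) << (pkey * PKRU_BITS_PER_PKEY);
}

/* True if neither AD nor WD is set for this pkey. */
static bool pkru_allows_rw(uint32_t pkru, unsigned int pkey)
{
	return (pkru & pkru_key_mask(pkey)) == 0;
}

/* Assumed deny-by-default policy: pkey 0 stays R/W, pkeys 1..15 get AD. */
static uint32_t model_default_pkru(void)
{
	uint32_t val = 0;

	for (unsigned int k = 1; k < NR_PKEYS; k++)
		val |= PKRU_AD_BIT << (k * PKRU_BITS_PER_PKEY);
	return val;
}

/*
 * Hypothetical helper: always reads the register, but writes the fully
 * permissive value (all bits clear) only when pkey 0 is not R/W.
 * Returns the previous value so the caller can restore it.
 */
static uint32_t model_enable_pkey0_access(void)
{
	uint32_t old = model_read_pkru();

	if (!pkru_allows_rw(old, 0))
		model_write_pkru(0);
	return old;
}

/* Undo the switch; skips the write when nothing was changed. */
static void model_restore_pkru(uint32_t old)
{
	if (model_read_pkru() != old)
		model_write_pkru(old);
}
```

With the assumed default value, model_enable_pkey0_access() and
model_restore_pkru() cost one read each and no writes. Only a thread that
has revoked its own pkey 0 access pays for the two register writes.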

I don't have a strong opinion. It's hard to balance performance and
consistency with the other ABI here.