Message-ID: <b18e6478-ef4b-42b3-8cc4-42467b3a0a7f@efficios.com>
Date: Mon, 24 Feb 2025 14:18:05 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Dmitry Vyukov <dvyukov@...gle.com>, peterz@...radead.org,
boqun.feng@...il.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, aruna.ramakrishna@...cle.com,
elver@...gle.com
Cc: "Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 3/4] rseq: Make rseq work with protection keys
On 2025-02-24 08:20, Dmitry Vyukov wrote:
> If an application registers rseq, and ever switches to another pkey
> protection (such that the rseq becomes inaccessible), then any
> context switch will cause failure in __rseq_handle_notify_resume()
> attempting to read/write struct rseq and/or rseq_cs. Since context
> switches are asynchronous and are outside of the application control
> (not part of the restricted code scope), temporarily switch to
> pkey value that allows access to the 0 (default) PKEY.
This is a good start, but the plan Dave and I discussed went further
than this. The following additions are needed:
1) Add validation at rseq registration that the struct rseq is indeed
pkey-0 memory (return failure if not).
2) The pkey-0 requirement applies only to struct rseq, which we can
check at rseq registration and which happens to be the fast path. For struct
rseq_cs, this is not the same tradeoff: we cannot easily check its
associated pkey because the rseq_cs pointer is updated by userspace
when entering a critical section. But the good news is that reading
the content of struct rseq_cs is *not* a fast-path: it's only done
when preempting/delivering a signal over a thread which has a
non-NULL rseq_cs pointer.
Therefore reading the struct rseq_cs content should be done with
write_permissive_pkey_val(), giving access to all pkeys.
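
To illustrate the two access modes described above, here is a minimal
userspace sketch that models the x86 PKRU register as a plain integer
(two bits per pkey: access-disable and write-disable). It is not kernel
code. The names pkru_model_t, model_enable_zero_pkey(),
model_permissive_pkey() and model_notify_resume() are hypothetical
stand-ins for illustration only, and unlike the real helpers they do not
touch any hardware register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the x86 PKRU layout: two bits per pkey, 16 pkeys. */
#define PKRU_AD_BIT		0x1u	/* access disable */
#define PKRU_WD_BIT		0x2u	/* write disable */
#define PKRU_BITS_PER_PKEY	2
#define MODEL_NR_PKEYS		16

typedef uint32_t pkru_model_t;	/* hypothetical type, illustration only */

static bool pkru_allows_read(pkru_model_t pkru, unsigned int pkey)
{
	assert(pkey < MODEL_NR_PKEYS);
	return !(pkru & (PKRU_AD_BIT << (pkey * PKRU_BITS_PER_PKEY)));
}

static bool pkru_allows_write(pkru_model_t pkru, unsigned int pkey)
{
	uint32_t mask = (PKRU_AD_BIT | PKRU_WD_BIT) << (pkey * PKRU_BITS_PER_PKEY);

	assert(pkey < MODEL_NR_PKEYS);
	return !(pkru & mask);
}

/* Assumed behavior: clear both restriction bits of pkey 0, keep the rest. */
static pkru_model_t model_enable_zero_pkey(pkru_model_t pkru)
{
	return pkru & ~(pkru_model_t)(PKRU_AD_BIT | PKRU_WD_BIT);
}

/* Assumed behavior: a value that grants access to every pkey. */
static pkru_model_t model_permissive_pkey(void)
{
	return 0;
}

struct model_access {
	bool rseq_ok;		/* struct rseq (pkey 0) writable */
	bool rseq_cs_ok;	/* struct rseq_cs (any pkey) readable */
};

/*
 * Model of the proposed flow: pkey-0 access for struct rseq, fully
 * permissive access only while reading struct rseq_cs, then restore
 * the value the thread had on entry.
 */
static struct model_access model_notify_resume(pkru_model_t *pkru,
					       unsigned int rseq_cs_pkey)
{
	pkru_model_t saved = *pkru;
	struct model_access res;

	*pkru = model_enable_zero_pkey(saved);
	res.rseq_ok = pkru_allows_write(*pkru, 0);

	*pkru = model_permissive_pkey();
	res.rseq_cs_ok = pkru_allows_read(*pkru, rseq_cs_pkey);

	*pkru = saved;
	return res;
}
```

In this model, a thread that has disabled access to every pkey can
still have both structures accessed on its behalf, and its original
value is back in place once the function returns.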
Thanks,
Mathieu
>
> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: "Paul E. McKenney" <paulmck@...nel.org>
> Cc: Boqun Feng <boqun.feng@...il.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Cc: Aruna Ramakrishna <aruna.ramakrishna@...cle.com>
> Cc: x86@...nel.org
> Cc: linux-kernel@...r.kernel.org
> Fixes: d7822b1e24f2 ("rseq: Introduce restartable sequences system call")
>
> ---
> Changes in v4:
> - Added Fixes tag
>
> Changes in v3:
> - simplify control flow to always enable access to 0 pkey
>
> Changes in v2:
> - fixed typos and reworded the comment
> ---
> kernel/rseq.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 2cb16091ec0ae..9d9c976d3b78c 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -10,6 +10,7 @@
>
> #include <linux/sched.h>
> #include <linux/uaccess.h>
> +#include <linux/pkeys.h>
> #include <linux/syscalls.h>
> #include <linux/rseq.h>
> #include <linux/types.h>
> @@ -402,11 +403,19 @@ static int rseq_ip_fixup(struct pt_regs *regs)
> void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
> {
> struct task_struct *t = current;
> + pkey_reg_t saved_pkey;
> int ret, sig;
>
> if (unlikely(t->flags & PF_EXITING))
> return;
>
> + /*
> + * Enable access to the default (0) pkey in case the thread has
> + * currently disabled access to it and struct rseq/rseq_cs has
> + * 0 pkey assigned (the only supported value for now).
> + */
> + saved_pkey = enable_zero_pkey_val();
> +
> /*
> * regs is NULL if and only if the caller is in a syscall path. Skip
> * fixup and leave rseq_cs as is so that rseq_sycall() will detect and
> @@ -419,9 +428,11 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
> }
> if (unlikely(rseq_update_cpu_node_id(t)))
> goto error;
> + write_pkey_val(saved_pkey);
> return;
>
> error:
> + write_pkey_val(saved_pkey);
> sig = ksig ? ksig->sig : 0;
> force_sigsegv(sig);
> }
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com