lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8079f564-cec0-45e4-857b-74b2e630a9d5@arm.com>
Date: Mon, 20 Oct 2025 15:46:52 +0200
From: Kevin Brodsky <kevin.brodsky@....com>
To: Dmitry Vyukov <dvyukov@...gle.com>, mathieu.desnoyers@...icios.com,
 peterz@...radead.org, boqun.feng@...il.com, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
 aruna.ramakrishna@...cle.com, elver@...gle.com
Cc: "Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org,
 linux-kernel@...r.kernel.org, Florian Weimer <fweimer@...hat.com>
Subject: Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

+Florian Weimer

On 21/05/2025 10:47, Dmitry Vyukov wrote:
> If an application registers rseq, and ever switches to another pkey
> protection (such that the rseq becomes inaccessible), then any
> context switch will cause failure in __rseq_handle_notify_resume()
> attempting to read/write struct rseq and/or rseq_cs. Since context
> switches are asynchronous and are outside of the application control
> (not part of the restricted code scope), temporarily switch to
> pkey value that allows access to the 0 (default) PKEY.
>
> Signed-off-by: Dmitry Vyukov <dvyukov@...gle.com>
> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: "Paul E. McKenney" <paulmck@...nel.org>
> Cc: Boqun Feng <boqun.feng@...il.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Cc: Aruna Ramakrishna <aruna.ramakrishna@...cle.com>
> Cc: x86@...nel.org
> Cc: linux-kernel@...r.kernel.org
> Fixes: d7822b1e24f2 ("rseq: Introduce restartable sequences system call")
>
> ---
> Changes in v7:
>  - Added Mathieu's Reviewed-by
>
> Changes in v6:
>  - Added a comment to struct rseq with MPK rules
>
> Changes in v4:
>  - Added Fixes tag
>
> Changes in v3:
>  - simplify control flow to always enable access to 0 pkey
>
> Changes in v2:
>  - fixed typos and reworded the comment
> ---
>  include/uapi/linux/rseq.h |  4 ++++
>  kernel/rseq.c             | 11 +++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
> index c233aae5eac90..019fd248cf749 100644
> --- a/include/uapi/linux/rseq.h
> +++ b/include/uapi/linux/rseq.h
> @@ -58,6 +58,10 @@ struct rseq_cs {
>   * contained within a single cache-line.
>   *
>   * A single struct rseq per thread is allowed.
> + *
> + * If struct rseq or struct rseq_cs is used with Memory Protection Keys,
> + * then the assigned pkey should either be accessible whenever these structs
> + * are registered/installed, or they should be protected with pkey 0.

I think it's worth pointing out that every case of async uaccess seems
to be handled differently w.r.t. pkeys. Some interesting cases:

* During signal delivery, the pkey register is reset to permissive
before writing to the signal stack (this is the logic that patch 2
touches in fact). This is handled in the same way on x86 [1] and arm64 [2].

* AFAICT io_uring worker threads inherit the user thread's pkey register
on x86, since they are forked without setting PF_KTHREAD. OTOH on arm64
the pkey register is reset to the init value (RW access to pkey 0 only)
for all non-user threads [3].

* Now with this patch accesses to struct rseq are made with whatever the
pkey register is set to when the thread is interrupted + RW access for
pkey 0.

This is clearly a difficult problem, since no approach will satisfy all
users. I believe a new API would be desirable to allow users to
configure this; see the thread at [4] regarding signal handlers.

In any case, it seems best to ignore the value of the pkey register at
the point where the thread is interrupted, because userspace has no
control on that value and it could lead to spurious faults (depending on
when the interruption happens). For rseq, maybe the most logical option
would be to set the pkey register to permissive in
__rseq_handle_notify_resume(), because there is nothing sensible to
check at that point. Checking could be done at the point of registration
instead.

- Kevin

[1]
https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
[2]
https://lore.kernel.org/all/20241023150511.3923558-1-kevin.brodsky@arm.com/
[3] https://lore.kernel.org/all/20241001133618.1547996-2-joey.gouly@arm.com/
[4]
https://inbox.sourceware.org/libc-alpha/87jza5pxmx.fsf@oldenburg.str.redhat.com/

>   */
>  struct rseq {
>  	/*
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index b7a1ec327e811..88fc8cb789b3b 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -10,6 +10,7 @@
>  
>  #include <linux/sched.h>
>  #include <linux/uaccess.h>
> +#include <linux/pkeys.h>
>  #include <linux/syscalls.h>
>  #include <linux/rseq.h>
>  #include <linux/types.h>
> @@ -424,11 +425,19 @@ static int rseq_ip_fixup(struct pt_regs *regs)
>  void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>  {
>  	struct task_struct *t = current;
> +	pkey_reg_t saved_pkey;
>  	int ret, sig;
>  
>  	if (unlikely(t->flags & PF_EXITING))
>  		return;
>  
> +	/*
> +	 * Enable access to the default (0) pkey in case the thread has
> +	 * currently disabled access to it and struct rseq/rseq_cs has
> +	 * 0 pkey assigned (the only supported value for now).
> +	 */
> +	saved_pkey = enable_zero_pkey_val();
> +
>  	/*
>  	 * regs is NULL if and only if the caller is in a syscall path.  Skip
>  	 * fixup and leave rseq_cs as is so that rseq_sycall() will detect and
> @@ -441,9 +450,11 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>  	}
>  	if (unlikely(rseq_update_cpu_node_id(t)))
>  		goto error;
> +	write_pkey_val(saved_pkey);
>  	return;
>  
>  error:
> +	write_pkey_val(saved_pkey);
>  	sig = ksig ? ksig->sig : 0;
>  	force_sigsegv(sig);
>  }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ