lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACT4Y+Z=q8CX5x01o2vZA3n+VjuQD2uUKsd-osCEf8LziDUZ1Q@mail.gmail.com>
Date: Mon, 24 Feb 2025 14:35:24 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, peterz@...radead.org, 
	boqun.feng@...il.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, 
	dave.hansen@...ux.intel.com, hpa@...or.com, aruna.ramakrishna@...cle.com, 
	elver@...gle.com, "Paul E. McKenney" <paulmck@...nel.org>, x86@...nel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] rseq: Make rseq work with protection keys

On Fri, 21 Feb 2025 at 22:45, Dave Hansen <dave.hansen@...el.com> wrote:
>
> On 2/21/25 13:36, Mathieu Desnoyers wrote:
> >>>
> >>
> >> Because the rseq return to userspace handler is called on every return
> >> to userspace after a task is scheduled back after preemption, I am
> >> concerned about the overhead that would be added by a WRPKRU on the
> >> fast-path, given that it acts as as barrier against speculation. Issuing
> >> WRPKRU only after checking that pkey-0 is not accessible appears to be
> >> moving the overhead to a much less common case.
> >
> > Actually, we should distinguish between two accesses here:
> >
> > A) loads/stores from/to struct rseq
> >
> > B) loads from struct rseq_cs (only happens on rseq abort)
> >
> > (A) is a fast-path executed on return to userspace after a preemption.
> > In order to make it fast, we could require that struct rseq is pkey-0
> > and typically skip any WRPKRU for this access when pkey-0 is already
> > accessible. We can add a check on rseq registration to make sure that
> > struct rseq is indeed pkey-0, and reject it with an error if not. This
> > should help make the ABI robust and less error-prone.
> >
> > Now for (B), it's a slow path. When we observe that rseq->rseq_cs is
> > not NULL, we can simply override with a permissive pkey to make sure
> > the rseq_cs access will work.
> >
> > Thoughts ?
> I think this will be the first ABI which is explicitly pkey-0-only. I
> suspect there are a few more of these that are implicit but we just
> haven't found them yet.
>
> I wouldn't have any objections about doing this, especially given
> sanity checking at rseq registration.


Thanks for the thoughtful review.

I've tried to incorporate all suggestions in v4:
https://lore.kernel.org/all/68427864e0ca38af06482c96728216c3e0973418.1740403209.git.dvyukov@google.com/T/#m431c13b6140e447d41d228fb942d9e4b8a89874a

Yes, the intention for doing the switch only on the error path was to
avoid WRPKRU on the hot path.

For my current use case (at least how I currently have it implemented)
struct rseq and altstack are indeed protected with pkey 0. So I added
a new enable_zero_pkey_val() function that lazily enables only pkey 0.

Also added Fixes tag (for everything except for signal.c refactoring).
The "fixed" commit seems to be the original rseq patch d7822b1e24f2
("rseq: Introduce restartable sequences system call") b/c PKEYs
introduction is older.

I also had to significantly rework the test to make it work with rseq
protected by pkey 0 (which means we need to revoke access to pkey 0,
which means stack/tls/errno also become inaccessible).

Please take another look at v4.

I don't have preference as to how this should get into the Linus tree.
Hopefully the changes are not too pervasive for all subsystems to
cause massive conflicts when merging.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ