linux-kernel - Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

Message-ID: <lhuv7ix5ec4.fsf@oldenburg.str.redhat.com>
Date: Wed, 26 Nov 2025 10:32:27 +0100
From: Florian Weimer <fweimer@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Kevin Brodsky <kevin.brodsky@....com>,  Dmitry Vyukov
 <dvyukov@...gle.com>,  mathieu.desnoyers@...icios.com,
  peterz@...radead.org,  boqun.feng@...il.com,  mingo@...hat.com,
  bp@...en8.de,  dave.hansen@...ux.intel.com,  hpa@...or.com,
  aruna.ramakrishna@...cle.com,  elver@...gle.com,  "Paul E. McKenney"
 <paulmck@...nel.org>,  x86@...nel.org,  linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

* Thomas Gleixner:

> That's all broken. Assume:
>
>   1) process starts with pkey 0 (default)
>   2) glibc creates TLS (protected by pkey 0)
>   3) main() switches to protection pkey 1
>
> If the switch to pkey 1 does not ensure that TLS (where RSEQ sits) is
> accessible by pkey 1, then how is userspace able to survive?
>
> You then do not even need the help of the kernel to die. If the process
> accesses TLS it dies on its own.

Signals have the same problem.  With the x86 approach of disabling all
access, protection keys are not really usable without tight control over
all code in the process.  This behavior breaks encapsulation.
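
To make the quoted scenario concrete, here is a minimal sketch that
models the x86 PKRU register as a plain integer (two bits per key:
access-disable and write-disable).  It assumes that "switching to
pkey 1" means loading a PKRU value that grants access to key 1 only.
The helper names pkru_only() and pkru_can_read() are hypothetical and
not part of any kernel or glibc interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-key bits in the x86 PKRU register: two bits per key, 16 keys. */
#define PKRU_AD   0x1u   /* access disable */
#define PKRU_WD   0x2u   /* write disable */
#define NR_PKEYS  16u

/* Hypothetical helper: deny every key, then grant full access to one. */
uint32_t pkru_only(unsigned key)
{
    uint32_t pkru = 0;

    for (unsigned k = 0; k < NR_PKEYS; k++)
        pkru |= PKRU_AD << (2 * k);

    /* Clear both the access-disable and write-disable bits for `key`. */
    pkru &= ~((PKRU_AD | PKRU_WD) << (2 * key));
    return pkru;
}

/* Hypothetical helper: would a data read of a page tagged `key` succeed? */
bool pkru_can_read(uint32_t pkru, unsigned key)
{
    return !(pkru & (PKRU_AD << (2 * key)));
}
```

In this model, pkru_can_read(pkru_only(1), 0) is false: memory still
tagged with key 0, such as the TLS block holding the rseq area, is no
longer readable by the thread itself.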

I'm less concerned about the impact on restart of restartable sequences
because by design, it's a non-modular feature: syscalls and function
calls are already banned.  If the code wants to restart, it has to make
sure that the access rights at the restart point are correct.  But
that's like any other register contents, I think.

In the other direction, code that sets a restrictive access mask is
already not allowed to call into arbitrary code.  For example, we could
use protection keys internally within glibc in the dynamic linker and
require that a key that we allocated retains read access.

Unfortunately, there's a use case for singleton access rights that does
not include key 0: validate that a pointer points to memory colored in a
specific way (e.g., for vtables, or for bytecode).

If the kernel/scheduler cannot bypass restrictions on access key 0, then
supporting this kind of memory color check is rather difficult because
userspace would always have to put key 0 into the accessible set.
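
The difficulty can be sketched with a small model.  It assumes the
color check is implemented as a probe load performed while only the
expected key is accessible, so that the load faults for any other
color.  The type pkey_set_t and the function probe_passes() are
hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit k set means pages tagged with key k are readable. */
typedef uint16_t pkey_set_t;

/*
 * Hypothetical color check: the probe load succeeds only if the key of
 * the page behind the pointer is in the accessible set.
 */
bool probe_passes(pkey_set_t accessible, unsigned page_key)
{
    return (accessible >> page_key) & 1u;
}
```

With a singleton set for key 3, probe_passes(1u << 3, 0) is false, so
a pointer into ordinary key 0 memory is rejected.  If key 0 has to be
added to the set, probe_passes((1u << 3) | 1u, 0) becomes true and the
check can no longer tell key 0 memory apart from the intended color.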

Would it help to allocate a dedicated key for rseq and specify that
userspace must always include this key in the accessible set?

In glibc, we cannot easily set a different key for the TLS area today
because it's not necessarily on an isolated page on which we could call
pkey_mprotect.  We plan to fix this next year, but it's not a trivial
change.
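
As an illustration of the page isolation requirement: pkey_mprotect,
like mprotect, operates on whole pages, so a key can be applied to a
TLS block without affecting its neighbors only if the block covers
whole pages by itself.  The sketch below checks that condition for an
arbitrary address range.  The function name range_is_page_isolated()
is hypothetical, and nothing here reflects the actual glibc TLS layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: does [start, start + len) begin and end on page
 * boundaries, so that re-keying it touches no unrelated data?
 */
bool range_is_page_isolated(uintptr_t start, size_t len, size_t page_size)
{
    return len != 0 && start % page_size == 0 && len % page_size == 0;
}
```

A range that starts in the middle of a page, or ends before the page
does, fails the check: changing the key of that page would also change
it for whatever else shares the page.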

On the other hand, I get the idea that protection keys are pretty dead.
So far, I couldn't get the x86 signal issue fixed in the kernel, so we
can't use them for glibc hardening.  AArch64 duplicated the x86
behavior, too.  And POWER removed protection key support with the switch
to the radix MMU.

Thanks,
Florian