linux-kernel - Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

Message-ID: <lhuecpk30ub.fsf@oldenburg.str.redhat.com>
Date: Wed, 26 Nov 2025 23:06:52 +0100
From: Florian Weimer <fweimer@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Kevin Brodsky <kevin.brodsky@....com>,  Dmitry Vyukov
 <dvyukov@...gle.com>,  mathieu.desnoyers@...icios.com,
  peterz@...radead.org,  boqun.feng@...il.com,  mingo@...hat.com,
  bp@...en8.de,  dave.hansen@...ux.intel.com,  hpa@...or.com,
  aruna.ramakrishna@...cle.com,  elver@...gle.com,  "Paul E. McKenney"
 <paulmck@...nel.org>,  x86@...nel.org,  linux-kernel@...r.kernel.org,
  Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

* Thomas Gleixner:

>>> What do we have to take into account:
>>>
>>>    1) signals
>>>
>>>       Broken as we know already.
>>>
>>>       IMO, the proper solution is to provide a mechanism to register a
>>>       set of permissions which are used for signal delivery. The
>>>       resulting hardware value should expand the permission, but keep
>>>       the current active ones enabled.
>>>
>>>       That can be kinda kept backwards compatible as the signal perms
>>>       would default to PKEY0.
>>
>> I had validated at one point that this works (although the patch that
>> enables internal pkeys usage in glibc did not exist back then).
>>
>>   pkeys: Support setting access rights for signal handlers
>>   <https://lore.kernel.org/linux-mm/5fee976a-42d4-d469-7058-b78ad8897219@redhat.com/>
>
> That looks about right and what I had in mind. Seems I missed that back
> in the days and that discussion unfortunately ran into a dead end :(

There was a follow-up where I tried to incorporate the feedback
(PKEY_ALLOC_SIGNALINHERIT), but based on more recent discussions (here
and before that), the original approach referenced above seems
preferable.

>>>    2) rseq
>>>
>>>       The option of having a separate key which needs to be always
>>>       enabled is definitely simple, but it wastes a key just for
>>>       that. There are only 16 of them :(
>>>
>>>       If we solve the signal case with an explicit permission set, we
>>>       can just reuse those signal permissions. They are maybe wider than
>>>       what's required to access RSEQ, but the signal permissions have to
>>>       include the TLS/RSEQ area to actually work.
>>
>> Would it address the use case for single-colored memory access?  Or
>> would that still crash if the process gets descheduled while the access
>> rights register is set to the restricted value?
>
> It would just work the same way as signals. Assume
>
>          signal_perms = [PK0=RW, PK1=R, PK2=RW]
>
>          set_pkey(PK0..6=NONE, PK7=R)
>
>          access()              <- can fault
>                                <- or interrupt can happen
>
>          set_pkey(normal)
>
> So when the fault or interrupt results in a signal and/or the return to
> user space needs to access RSEQ we have in signal delivery:
>
>          cur = pkey_extend(signal_perms);
>
> --> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]         
>
>          access_user_stack();
>          ....
>          // Return with the extended permissions to deliver the signal
>          // Will be restored on sigreturn
>
> and in rseq:
>
>          cur = pkey_extend(signal_perms);
>
> --> Perms are now [PK0=RW, PK1=R, PK2=RW, PK7=R]         
>
>          access_user_rseq();
>          pkey_set(cur);
>
> If the RSEQ access is nested in the signal delivery return then nothing
> happens as the permissions are not changing because they are already
> extended: A | A = A :).

Agreed.  And the pkey_extend/pkey_set sequence doesn't have a
prohibitive cost, I assume.  I got the impression you were trying to
avoid that sequence, but I think it's more about defining how
pkey_extend works.
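
To make the question of how pkey_extend works concrete, here is a
minimal user-space sketch of one possible definition.  It assumes the
x86 PKRU layout (two bits per key: bit 2k is access-disable, bit 2k+1
is write-disable); other architectures encode permissions differently.
The names pkru_with, pkru_perm, pkru_normalize, and pkru_reg are
hypothetical helpers for illustration, and pkey_extend/pkey_set only
mirror the pseudocode quoted above, not any existing kernel interface.
Because the bits are disable bits, the union of permissions is a
bitwise AND, after normalizing the "AD set, WD clear" encoding so that
a no-access key does not accidentally contribute write permission.

```c
#include <assert.h>
#include <stdint.h>

enum perm { PERM_NONE, PERM_R, PERM_RW };

/* Access-disable bits, one per key (assumed x86 PKRU layout). */
#define PKRU_AD_MASK UINT32_C(0x55555555)

/* Return pkru with the two bits for `key` set to encode permission `p`. */
static uint32_t pkru_with(uint32_t pkru, int key, enum perm p)
{
    assert(key >= 0 && key < 16);
    pkru &= ~(UINT32_C(3) << (2 * key));
    if (p == PERM_NONE)
        pkru |= UINT32_C(3) << (2 * key);   /* AD and WD set */
    else if (p == PERM_R)
        pkru |= UINT32_C(2) << (2 * key);   /* WD set only */
    return pkru;
}

/* Decode the effective permission for `key`. */
static enum perm pkru_perm(uint32_t pkru, int key)
{
    uint32_t bits = (pkru >> (2 * key)) & 3;

    if (bits & 1)
        return PERM_NONE;                   /* AD set: WD is irrelevant */
    return (bits & 2) ? PERM_R : PERM_RW;
}

/* Force WD wherever AD is set, so that AND acts as a permission union. */
static uint32_t pkru_normalize(uint32_t pkru)
{
    return pkru | ((pkru & PKRU_AD_MASK) << 1);
}

/* Stand-in for the per-thread access rights register. */
static uint32_t pkru_reg;

/* Widen the current permissions by signal_perms; return the old value. */
static uint32_t pkey_extend(uint32_t signal_perms)
{
    uint32_t old = pkru_reg;

    pkru_reg = pkru_normalize(old) & pkru_normalize(signal_perms);
    return old;
}

/* Restore a value previously returned by pkey_extend(). */
static void pkey_set(uint32_t val)
{
    pkru_reg = val;
}
```

With signal_perms = [PK0=RW, PK1=R, PK2=RW] and the register set to
[PK7=R], a call to pkey_extend in this sketch yields
[PK0=RW, PK1=R, PK2=RW, PK7=R], and a nested second call leaves the
register unchanged, matching the A | A = A observation above.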

There's an unmerged glibc patch that allocates a protection key for the
dynamic linker, so we might end up with every process using rseq
(without critical sections) and protection keys.

Thanks,
Florian