linux-kernel - Re: [patch V6 01/11] rseq: Add fields and constants for time slice extension


Message-ID: <87y0lzhn39.ffs@tglx>
Date: Wed, 14 Jan 2026 22:59:54 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Florian Weimer <fweimer@...hat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, LKML
 <linux-kernel@...r.kernel.org>, "Paul E. McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>,
 Prakash Sangappa <prakash.sangappa@...cle.com>, Madadi Vineeth Reddy
 <vineethr@...ux.ibm.com>, K Prateek Nayak <kprateek.nayak@....com>, Steven
 Rostedt <rostedt@...dmis.org>, Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>, Arnd Bergmann <arnd@...db.de>,
 linux-arch@...r.kernel.org, Randy Dunlap <rdunlap@...radead.org>, Peter
 Zijlstra <peterz@...radead.org>, Ron Geva <rongevarg@...il.com>, Waiman
 Long <longman@...hat.com>, "carlos@...hat.com" <carlos@...hat.com>, Michael
 Jeanson <mjeanson@...icios.com>
Subject: Re: [patch V6 01/11] rseq: Add fields and constants for time slice
 extension

On Wed, Jan 14 2026 at 00:45, Florian Weimer wrote:
> * Thomas Gleixner:
>> I'm not completely opposed to making it process wide. For threads created
>> after enablement, that's trivial because that can be done when the per
>> thread RSEQ is registered. But when it gets enabled _after_ threads have
>> been created already then we need code to chase the threads and enable
>> it after the fact because we are not going to query the enablement in
>> curr->mm::whatever just to have another conditional and another
>> cacheline to access.
>
> In glibc, we make sure that the registration for restartable sequences
> happens before any user code (with the exception of IFUNC resolvers) can
> run.  This includes code from signal handlers.  We started masking
> signals on newly created threads for this reason, to make these
> partially initialized states unobservable.
>
> It's not clear to me what the expected outcome is.  If we ever want to
> offer deadline extension as a mutex attribute (for example), then we
> have to switch this on at process start unconditionally because we don't
> know if this new API will be used by the new process (potentially after
> dlopen, so we can't even use things like analyzing the symbol
> footprint ahead of time).

Sure, but then you can enable it at each thread start, no?

>> The only option is to reject enablement when there is already more than
>> one thread in the process, but there is a reasonable argument that a
>> process might only enable it for a subset of threads, which have actual
>> lock interaction and not bother with it for other things. I'm not seeing
>> a reason to restrict the flexibility of configuration just because you
>> envision magic use cases all over the place.
>
> Sure, but it looks like this needs a custom/minimal libc.  It's like
> repurposing set_robust_list for something else.  It can be done, but it
> has a significant cost in terms of compatibility because some
> functionality (that other libraries in the process depend on) will stop
> working.

The kernel is not there to cater to magic user space expectations. It
provides interfaces and the minimal amount of policy.

If glibc wants to use it for mutexes (for all the wrong reasons) then
glibc needs to take care of enabling it like it does for registering
RSEQ for each newly created thread.

If glibc does not and the application does care for their particular
concurrency control, then it is the application's problem to ensure that
it is enabled for the threads it cares about, right?
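
As a rough illustration of the "enable it at each thread start" approach,
here is a minimal sketch of a thread-start trampoline. Everything named
here is an assumption for illustration: enable_slice_ext_for_current_thread()
is a hypothetical placeholder that only sets a thread-local flag where a
real implementation would issue the prctl from this series, and
create_thread_with_slice_ext() is a hypothetical wrapper, not a glibc API.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread marker standing in for the kernel-side state. */
static _Thread_local int slice_ext_enabled;

/*
 * Hypothetical placeholder. A real implementation would issue the
 * enablement prctl discussed in this thread and return its result.
 */
static int enable_slice_ext_for_current_thread(void)
{
    slice_ext_enabled = 1;
    return 0;
}

struct start_args {
    void *(*fn)(void *);
    void *arg;
};

/* Runs on the new thread before any application code does. */
static void *thread_trampoline(void *p)
{
    struct start_args args = *(struct start_args *)p;

    free(p);
    /* Best effort: the thread still runs if enablement fails. */
    (void)enable_slice_ext_for_current_thread();
    return args.fn(args.arg);
}

/* Hypothetical wrapper around pthread_create() for threads that opt in. */
static int create_thread_with_slice_ext(pthread_t *t, void *(*fn)(void *),
                                        void *arg)
{
    struct start_args *args = malloc(sizeof(*args));
    int ret;

    if (!args)
        return ENOMEM;
    args->fn = fn;
    args->arg = arg;
    ret = pthread_create(t, NULL, thread_trampoline, args);
    if (ret)
        free(args);
    return ret;
}

/* Example thread body: reports whether the flag is set on this thread. */
static void *report_enabled(void *unused)
{
    (void)unused;
    return (void *)(intptr_t)slice_ext_enabled;
}
```

Threads created through the wrapper see the flag set, while threads
created with plain pthread_create() (and the main thread) do not, which
matches the "subset of threads" configuration described earlier.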

>> On the other hand there is no guarantee that libc registers RSEQ when a
>> thread is started as it can be disabled or not supported, so you have
>> exactly the same problem there that the code which wants to use it needs
>> to ensure that a RSEQ area is registered, no?
>
> With glibc, if RSEQ is registered on the main thread, it will be
> registered on all other threads, too.  Technically, it's possible to
> unregister RSEQ with the kernel, of course, but that's totally
> undefined, like unmapping memory originally returned from malloc.

This is again user land policy. glibc decides to register RSEQ for each
new thread, but the kernel does not care whether it does or not.
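
For code that wants to check rather than assume, a minimal sketch of
asking glibc whether it registered an RSEQ area. This assumes glibc 2.35
or later, which declares __rseq_size in <sys/rseq.h> and leaves it at 0
when registration failed or was disabled. The helper name
rseq_registered_by_libc() is made up for this example, and other C
libraries may offer nothing equivalent.

```c
/* Only pull in <sys/rseq.h> where the C library ships it. */
#if defined(__has_include)
# if __has_include(<sys/rseq.h>)
#  include <sys/rseq.h>
#  define HAVE_SYS_RSEQ_H 1
# endif
#endif

/*
 * Hypothetical helper: returns 1 if libc registered an rseq area,
 * 0 if it did not, and -1 if this libc does not expose the information.
 */
static int rseq_registered_by_libc(void)
{
#ifdef HAVE_SYS_RSEQ_H
    return __rseq_size != 0;
#else
    return -1;
#endif
}
```

A caller that gets 0 or -1 has to handle registration itself or do
without, which is the coordination problem described above.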

>>>> The prctl allows you to query the state, so all parties can make
>>>> informed decisions. It's not any different from other mechanisms, which
>>>> require coordination between different parts.
>>>
>>> I'm fine with having prctl enable the feature (for the whole process)
>>> and query its state.
>>>
>>> The part I'm concerned with is the prctl disabling the feature, as
>>> we're losing the availability invariant after setup.
>>
>>   close(0);
>>
>> has the same problem. How many instances of bugs in that area have you
>> seen so far?
>
> We've had significant issues due to incorrect close calls (maybe not
> close(0) in particular, but definitely with double-closes removing
> descriptors created by other threads).

That's again not a kernel problem. The primary UNIX design principle is
to allow user space to shoot itself in the foot. There is zero reason
to change that unless it's a justified security issue.

   Time slice extension best effort magic definitely does not qualify for
   that. It's harmless as the only side effect is that user space wastes
   cycles...
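
For reference, the double-close hazard mentioned in the quoted text can
be reproduced in a few lines. This is a single-threaded sketch in which
the second open() stands in for another thread opening a file between
the two close() calls. The function name demo_double_close() is made up
for this example, and it assumes /dev/null can be opened.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if the stale second close() ended up closing an unrelated,
 * newly opened descriptor, 0 if it did not, -1 on setup failure.
 */
static int demo_double_close(void)
{
    int a, b, reused, b_dead;

    a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;
    close(a);                       /* first, legitimate close */

    /* Stands in for another thread opening a file in the meantime. */
    b = open("/dev/null", O_RDONLY);
    if (b < 0)
        return -1;

    /* POSIX hands out the lowest free number, so b reuses a's slot. */
    reused = (b == a);

    close(a);                       /* stale double-close */

    /* If the number was reused, b is no longer a valid descriptor. */
    b_dead = (fcntl(b, F_GETFD) == -1);
    if (!b_dead)
        close(b);

    return reused && b_dead;
}
```

The kernel does exactly what it was asked to do in both close() calls.
The bookkeeping of who owns which descriptor lives entirely in user
space.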

Thanks,

        tglx