linux-kernel - Re: [patch 00/12] rseq: Implement time slice extension mechanism


Message-ID: <3d16490f-e4d3-4e91-af17-62018e789da9@efficios.com>
Date: Fri, 12 Sep 2025 08:33:42 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>, LKML <linux-kernel@...r.kernel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
 "Paul E. McKenney" <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
 Jonathan Corbet <corbet@....net>,
 Prakash Sangappa <prakash.sangappa@...cle.com>,
 Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
 K Prateek Nayak <kprateek.nayak@....com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Arnd Bergmann <arnd@...db.de>, linux-arch@...r.kernel.org
Subject: Re: [patch 00/12] rseq: Implement time slice extension mechanism

On 2025-09-11 16:18, Thomas Gleixner wrote:
> On Thu, Sep 11 2025 at 11:27, Mathieu Desnoyers wrote:
>> On 2025-09-08 18:59, Thomas Gleixner wrote:
[...]
>> Does it "have" to ? What is the consequence of misbehaving ?
> 
> It receives SIGSEGV because that means that it did not follow the rules
> and stuck an arbitrary syscall into the critical section.

Not following the rules could also be done by just looping for a long
time in userspace within or after the critical section, in which case
the timer should catch it.

> 
>> I wonder if we could achieve this without the cpu-local atomic, and
>> just rely on simple relaxed-atomic or volatile loads/stores and compiler
>> barriers in userspace. Let's say we have:
>>
>> union {
>> 	u16 slice_ctrl;
>> 	struct {
>> 		u8 rseq->slice_request;
>> 		u8 rseq->slice_grant;
> 
> Interesting way to define a struct member :)

This goes with the usual warning "this code has never even been
remotely close to a compiler, so handle with care" ;-)

> 
>> 	};
>> };
>>
>> With userspace doing:
>>
>> rseq->slice_request = true;  /* WRITE_ONCE() */
>> barrier();
>> critical_section();
>> barrier();
>> rseq->slice_request = false; /* WRITE_ONCE() */
>> if (rseq->slice_grant)       /* READ_ONCE() */
>>     rseq_slice_yield();
> 
> That should work as it's strictly CPU local. Good point, now that you
> said it it's obvious :)
> 
> Let me rework it accordingly.

I have three points wrt ABI here:

1) Do we expect the slice requests to be done from C and higher level
    languages or only from assembly ?

2) Slice requests are a good fit for locking. Locking typically
    has nesting ability.

    We should consider making the slice request ABI an 8-bit
    or 16-bit nesting counter to allow nesting of its users.

3) Slice requests are also a good fit for rseq critical sections.
    Of course someone could explicitly increment/decrement the
    slice request counter before/after the rseq critical sections, but
    I think we could do better there and integrate this directly within
    the struct rseq_cs as a new critical section flag. Basically, a
    critical section with this new RSEQ_CS_SLICE_REQUEST flag (or
    better name) set within its descriptor flags would behave as if
    the slice request counter is non-zero when preempted without
    requiring any extra instruction on the fast path. The only
    added overhead would be a check of the rseq->slice_grant flag
    when exiting the critical section to conditionally issue
    rseq_slice_yield().

    This point (3) is an optimization that could come as a future step
    if the overhead of incrementing the slice_request proves to be a
    bottleneck for rseq critical sections.
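
To make point (2) a bit more concrete, here is a rough userspace sketch of
what a nesting counter could look like. Everything in it is an assumption
made for illustration: the struct layout, the field widths, the helper
names (slice_enter, slice_exit) and the stubbed-out yield are hypothetical
and not part of any proposed ABI. It also assumes the kernel leaves the
counter untouched when it grants an extension, which is still an open
question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: names and widths are assumptions, not the ABI. */
struct slice_ctrl {
	volatile uint8_t request;	/* nesting counter, written by userspace */
	volatile uint8_t grant;		/* set by the kernel on grant */
};

/* Compiler barrier only; sufficient because the data is CPU-local. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Counts how often the stub below ran, so the sketch can be checked. */
static int yield_calls;

/* Stand-in for rseq_slice_yield(); a real one would enter the kernel. */
static void slice_yield_stub(struct slice_ctrl *c)
{
	c->grant = 0;
	yield_calls++;
}

/* Enter a (possibly nested) section that wants a slice extension. */
static void slice_enter(struct slice_ctrl *c)
{
	assert(c->request != UINT8_MAX);	/* 8-bit counter must not wrap */
	c->request = (uint8_t)(c->request + 1);
	barrier();
}

/* Leave the section; only the outermost exit checks for a grant. */
static void slice_exit(struct slice_ctrl *c)
{
	uint8_t depth;

	barrier();
	assert(c->request > 0);			/* unbalanced exit */
	depth = (uint8_t)(c->request - 1);
	c->request = depth;
	if (depth == 0 && c->grant)
		slice_yield_stub(c);
}
```

With this shape, an inner lock release never yields while an outer
section is still active; the yield only happens once the counter drops
back to zero.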

> 
>> In the kernel interrupt return path, if the kernel observes
>> "rseq->slice_request" set and "rseq->slice_grant" cleared,
>> it grants the extension and sets "rseq->slice_grant".
> 
> They can't be both set. If they are then user space fiddled with the
> bits.

Ah, yes, that's true if the kernel clears the slice_request when setting
the slice_grant.
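
For what it's worth, here is how I picture that rule, written as a plain C
model rather than kernel code. The struct and the function names
(slice_state, slice_try_grant, slice_state_valid) are made up for this
sketch, and the real interrupt return path would of course access the
user memory through the proper accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two CPU-local bytes discussed above. */
struct slice_state {
	uint8_t slice_request;
	uint8_t slice_grant;
};

/* Both set at once means userspace fiddled with the bits. */
static bool slice_state_valid(const struct slice_state *s)
{
	return !(s->slice_request && s->slice_grant);
}

/*
 * Model of the decision on the interrupt return path: grant only if a
 * request is pending and no grant is outstanding, and clear the request
 * while setting the grant so the two are never set together.
 * Returns true if an extension was granted.
 */
static bool slice_try_grant(struct slice_state *s)
{
	if (!s->slice_request || s->slice_grant)
		return false;

	s->slice_request = 0;
	s->slice_grant = 1;
	return true;
}
```

Since the grant path clears the request, a second interrupt before
userspace yields finds no pending request and grants nothing further.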

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com