Message-ID: <87plbwrbef.ffs@tglx>
Date: Thu, 11 Sep 2025 22:18:16 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, LKML
 <linux-kernel@...r.kernel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, "Paul E. McKenney"
 <paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet
 <corbet@....net>, Prakash Sangappa <prakash.sangappa@...cle.com>, Madadi
 Vineeth Reddy <vineethr@...ux.ibm.com>, K Prateek Nayak
 <kprateek.nayak@....com>, Steven Rostedt <rostedt@...dmis.org>, Sebastian
 Andrzej Siewior <bigeasy@...utronix.de>, Arnd Bergmann <arnd@...db.de>,
 linux-arch@...r.kernel.org
Subject: Re: [patch 00/12] rseq: Implement time slice extension mechanism

On Thu, Sep 11 2025 at 11:27, Mathieu Desnoyers wrote:
> On 2025-09-08 18:59, Thomas Gleixner wrote:
>> If it is interrupted and the interrupt return path in the kernel observes a
>> rescheduling request, then the kernel can grant a time slice extension. The
>> kernel clears the REQUEST bit and sets the GRANTED bit with a simple
>> non-atomic store operation. If it does not grant the extension only the
>> REQUEST bit is cleared.
>> 
>> If user space observes the REQUEST bit cleared, when it finished the
>> critical section, then it has to check the GRANTED bit. If that is set,
>> then it has to invoke the rseq_slice_yield() syscall to terminate the
>
> Does it "have" to ? What is the consequence of misbehaving ?

It receives SIGSEGV because that means that it did not follow the rules
and stuck an arbitrary syscall into the critical section.

> I wonder if we could achieve this without the cpu-local atomic, and
> just rely on simple relaxed-atomic or volatile loads/stores and compiler
> barriers in userspace. Let's say we have:
>
> union {
> 	u16 slice_ctrl;
> 	struct {
> 		u8 rseq->slice_request;
> 		u8 rseq->slice_grant;

Interesting way to define a struct member :)

> 	};
> };
>
> With userspace doing:
>
> rseq->slice_request = true;  /* WRITE_ONCE() */
> barrier();
> critical_section();
> barrier();
> rseq->slice_request = false; /* WRITE_ONCE() */
> if (rseq->slice_grant)       /* READ_ONCE() */
>    rseq_slice_yield();

That should work as it's strictly CPU local. Good point, now that you
said it, it's obvious :)

Let me rework it accordingly.
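
For illustration only, here is a minimal user-space sketch of the
byte-based scheme quoted above. It is not the actual rseq ABI: the union
layout, the helper names (run_critical, slice_yield_stub,
fake_interrupt_grant) and the assumption that the yield clears the grant
byte are all hypothetical. The interrupt is simulated by a plain function
call, and the macros rely on the GCC/Clang __typeof__ extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout modelled on the quoted union; not the real rseq ABI. */
union slice_ctrl {
	uint16_t all;
	struct {
		uint8_t request;
		uint8_t granted;
	};
};

/* Compiler-only barrier and volatile accessors (GCC/Clang extensions). */
#define barrier()        __asm__ __volatile__("" ::: "memory")
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))

static int yield_calls;

/*
 * Stand-in for the rseq_slice_yield() syscall. That the kernel clears
 * the grant when the task yields is an assumption of this sketch.
 */
static void slice_yield_stub(union slice_ctrl *sc)
{
	WRITE_ONCE(sc->granted, 0);
	yield_calls++;
}

/* Simulates the interrupt return path granting an extension. */
static void fake_interrupt_grant(union slice_ctrl *sc)
{
	if (READ_ONCE(sc->request) && !READ_ONCE(sc->granted)) {
		WRITE_ONCE(sc->request, 0);
		WRITE_ONCE(sc->granted, 1);
	}
}

/* The user-space sequence from the quoted mail. */
static void run_critical(union slice_ctrl *sc, int interrupted)
{
	/* A stale grant on entry would be a protocol error in this sketch. */
	assert(!READ_ONCE(sc->granted));

	WRITE_ONCE(sc->request, 1);
	barrier();
	/* Critical section work would go here. */
	if (interrupted)
		fake_interrupt_grant(sc);
	barrier();
	WRITE_ONCE(sc->request, 0);
	if (READ_ONCE(sc->granted))
		slice_yield_stub(sc);
}
```

Calling run_critical(&sc, 0) leaves yield_calls untouched, while
run_critical(&sc, 1) ends with exactly one yield. Plain volatile accesses
and compiler barriers are enough only because both sides run on the same
CPU.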

> In the kernel interrupt return path, if the kernel observes
> "rseq->slice_request" set and "rseq->slice_grant" cleared,
> it grants the extension and sets "rseq->slice_grant".

They can't both be set. If they are, then user space fiddled with the
bits.
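
As a rough sketch of that decision, assuming the rules from the cover
letter text quoted at the top (REQUEST is cleared in either case, GRANTED
is set only when the extension is granted), the logic could look like the
following. The struct, the enum and slice_on_resched() are hypothetical
names, not kernel code, and how the kernel reacts to the invalid state is
not specified here, so the sketch merely reports it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the two user-visible bytes. */
struct slice_state {
	uint8_t request;
	uint8_t granted;
};

enum slice_action {
	SLICE_NONE,	/* no request pending, nothing to do */
	SLICE_GRANT,	/* request cleared, grant set */
	SLICE_DENY,	/* request cleared, no grant */
	SLICE_INVALID,	/* both set: user space fiddled with the bits */
};

/*
 * Called when a rescheduling request is observed on return to user
 * space. can_grant stands for whatever policy the kernel applies
 * (assumption: a simple boolean for the purpose of this sketch).
 */
static enum slice_action slice_on_resched(struct slice_state *s, bool can_grant)
{
	assert(s);

	if (s->request && s->granted)
		return SLICE_INVALID;
	if (!s->request)
		return SLICE_NONE;

	/* Plain stores suffice, the data is strictly CPU local. */
	s->request = 0;
	if (!can_grant)
		return SLICE_DENY;

	s->granted = 1;
	return SLICE_GRANT;
}
```

With request set and granted clear, the function returns SLICE_GRANT or
SLICE_DENY depending on can_grant. With both set it returns SLICE_INVALID
and leaves the bytes untouched.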

>>      - A futile attempt to make this "work" on the PREEMPT_LAZY preemption
>>        model which is utilized by PREEMPT_RT.
>
> Can you clarify why this attempt is "futile" ?

Because on RT, interrupts usually end up with TIF_PREEMPT set, either due
to softirqs or interrupt threads. And no, we don't want to
overcomplicate things right now to make it "work" for real-time tasks in
the first place, as that's just going to result in either endless
discussions or subtle latency problems, or both. For now, allowing it for
the 'LAZY' case is good enough.

With the non-RT LAZY model that's not really a good idea either, because
when TIF_PREEMPT is set, either the preempting task is in an RT class or
the to-be-preempted task has already overrun the LAZY-granted
computation time and the scheduler sets TIF_PREEMPT to whack it over the
head.

Thanks,

        tglx
