Message-ID: <878qe4ifas.ffs@tglx>
Date: Sun, 11 Jan 2026 12:01:31 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>
Cc: LKML <linux-kernel@...r.kernel.org>, "Paul E. McKenney"
<paulmck@...nel.org>, Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet
<corbet@....net>, Prakash Sangappa <prakash.sangappa@...cle.com>, Madadi
Vineeth Reddy <vineethr@...ux.ibm.com>, K Prateek Nayak
<kprateek.nayak@....com>, Steven Rostedt <rostedt@...dmis.org>, Sebastian
Andrzej Siewior <bigeasy@...utronix.de>, Arnd Bergmann <arnd@...db.de>,
linux-arch@...r.kernel.org, Randy Dunlap <rdunlap@...radead.org>, Ron Geva
<rongevarg@...il.com>, Waiman Long <longman@...hat.com>
Subject: Re: [patch V6 10/11] entry: Hook up rseq time slice extension

On Fri, Dec 19 2025 at 12:07, Peter Zijlstra wrote:
> On Tue, Dec 16, 2025 at 10:37:24AM -0500, Mathieu Desnoyers wrote:
>> On 2025-12-15 13:24, Thomas Gleixner wrote:
>> > Wire the grant decision function up in exit_to_user_mode_loop()
>> >
>> [...]
>> > +/* TIF bits, which prevent a time slice extension. */
>> > +#ifdef CONFIG_PREEMPT_RT
>> > +# define TIF_SLICE_EXT_SCHED (_TIF_NEED_RESCHED_LAZY)
>> > +#else
>> > +# define TIF_SLICE_EXT_SCHED (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)
>>
>> It would be relevant to explain the difference between RT and non-RT
>> in the commit message.
>
> So if you include TIF_NEED_RESCHED the extension period directly affects
> the minimum scheduler delay like:
>
> min(extension_period, min_sched_delay)
>
> because this is strictly a from-userspace thing. That is, it is
> equivalent to the in-kernel preemption/IRQ disabled regions -- with
> exception of the scheduler critical sections itself.
>
> As I've argued many times -- I don't see a fundamental reason to not do
> this for RT -- but perhaps further reduce the magic number such that its
> impact cannot be observed on a 'good' machine.
>
> But yes, if/when we do this on RT it needs the promise to aggressively
> decrease the magic number any time it can actually be measured to impact
> performance.
>
> cyclictest should probably get a mode where it (ab)uses the feature to
> failure before we do this.
>
> Anyway, I don't mind excluding RT for now, but it *does* deserve a
> comment.

I know you argued about this many times, but I still maintain my point
of view that TIF_PREEMPT and TIF_PREEMPT_LAZY are fundamentally different:

   TIF_PREEMPT_LAZY allows a non-RT task to keep running until it
   reaches return to user

   TIF_PREEMPT enforces preemption at the next possible preemption
   point
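
To make the RT vs. non-RT distinction concrete, here is a minimal
userspace sketch of the grant decision. It is not the kernel code: the
MODEL_* bit values and both function names are hypothetical, and it
assumes that the mask lists the reschedule bits for which an extension
may be granted, with any other pending work bit denying it.

```c
#include <stdbool.h>

/* Hypothetical bit values for this model; not the kernel's TIF layout. */
#define MODEL_NEED_RESCHED      (1UL << 0)
#define MODEL_NEED_RESCHED_LAZY (1UL << 1)

/* Runtime stand-in for the CONFIG_PREEMPT_RT #ifdef in the quoted hunk. */
static unsigned long model_slice_ext_sched(bool preempt_rt)
{
	return preempt_rt ? MODEL_NEED_RESCHED_LAZY
			  : (MODEL_NEED_RESCHED | MODEL_NEED_RESCHED_LAZY);
}

/*
 * Assumption: an extension is only considered when a reschedule is
 * pending, and it is denied if any pending bit lies outside the mask.
 */
static bool model_may_grant(unsigned long ti_work, bool preempt_rt)
{
	unsigned long eligible = model_slice_ext_sched(preempt_rt);

	if (!(ti_work & eligible))
		return false;	/* no deferrable reschedule pending */
	return !(ti_work & ~eligible);
}
```

In this model a pending MODEL_NEED_RESCHED can be deferred on non-RT
but never on RT, while MODEL_NEED_RESCHED_LAZY can be deferred on both.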

My main concern is this scenario:

    sched_other_task()
      request_slice_extension()
    ---> interrupt
           RT task is woken up
      return_to_user()
        grant_extension()
      ...

which means the RT task is delayed until the OTHER task relinquishes the
CPU voluntarily or via timeout.
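
The cost of that scenario can be put into a small model. This is only a
sketch under stated assumptions: the function name and its parameters
are hypothetical, and it assumes that the woken RT task starts running
as soon as the OTHER task either yields or hits the extension timeout,
whichever comes first.

```c
#include <stdbool.h>

/*
 * Extra delay in nanoseconds seen by the woken RT task.
 *
 * granted:        the OTHER task got its extension on return to user
 * yield_after_ns: time until the OTHER task relinquishes the CPU itself
 * timeout_ns:     enforced upper bound of the extension
 */
static unsigned long long model_rt_delay_ns(bool granted,
					     unsigned long long yield_after_ns,
					     unsigned long long timeout_ns)
{
	if (!granted)
		return 0;	/* preempted right at return to user */

	return yield_after_ns < timeout_ns ? yield_after_ns : timeout_ns;
}
```

Without a grant the model adds no delay; with a grant the RT task pays
up to the full timeout, no matter whether the two tasks share a lock.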

That might be desired _if_ both tasks are using the same lock, but in
case of fully independent tasks it's not necessarily a good idea. If an
RT application uses locks in the RT tasks, then obviously latency is not
so much of a concern, but for optimized RT applications the side effect
of other processes getting a free pass to increase latency is troublesome.

So I prefer to keep the current semantics for RT. This can be revisited
of course when a proper evaluation has been done, but IMO there are too
many moving parts in an RT system to make this actually work correctly
under all circumstances.

I'll add proper comments to that effect.

Thanks,

        tglx