Message-ID: <87cy68wbt6.ffs@tglx>
Date: Mon, 27 Oct 2025 17:26:29 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...nel.org>, Boqun Feng
<boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>, Prakash Sangappa
<prakash.sangappa@...cle.com>, Madadi Vineeth Reddy
<vineethr@...ux.ibm.com>, K Prateek Nayak <kprateek.nayak@....com>, Steven
Rostedt <rostedt@...dmis.org>, Arnd Bergmann <arnd@...db.de>,
linux-arch@...r.kernel.org
Subject: Re: [patch V2 08/12] rseq: Implement time slice extension enforcement timer
On Mon, Oct 27 2025 at 12:38, Sebastian Andrzej Siewior wrote:
> On 2025-10-22 14:57:38 [+0200], Thomas Gleixner wrote:
>> +static enum hrtimer_restart rseq_slice_expired(struct hrtimer *tmr)
>> +{
>> + struct slice_timer *st = container_of(tmr, struct slice_timer, timer);
>> +
>> + if (st->cookie == current && current->rseq.slice.state.granted) {
>> + rseq_stat_inc(rseq_stats.s_expired);
>> + set_need_resched_current();
>> + }
>
> You arm the timer while leaving to userland. Once in userland the task
> can be migrated to another CPU. Once migrated, this CPU can host another
> task while the timer fires and does nothing.
That's inevitable. If the scheduler decides to do that then there is
nothing which can be done about it and that's why the cookie pointer
exists.
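
The cookie check can be modelled in plain user-space C. This is only a sketch under assumptions: struct task, struct slice_timer_model, model_arm() and model_expired() are hypothetical stand-ins and not kernel interfaces. It shows just the decision the callback makes, not the hrtimer machinery or the interrupt context it runs in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task; only the fields the model needs. */
struct task {
    bool slice_granted;  /* models current->rseq.slice.state.granted */
    bool need_resched;   /* models the effect of set_need_resched_current() */
};

/* Hypothetical model of the per-CPU slice timer: the cookie records the armer. */
struct slice_timer_model {
    const struct task *cookie;
};

/* Arming on the way out to user space: remember which task the timer is for. */
static void model_arm(struct slice_timer_model *st, const struct task *armer)
{
    assert(armer != NULL);
    st->cookie = armer;
}

/*
 * Expiry on this CPU: act only if the task running here now is the one
 * that armed the timer and it still holds a grant. Otherwise the expiry
 * is a no-op, which is the migration case discussed above.
 */
static bool model_expired(const struct slice_timer_model *st, struct task *curr)
{
    if (st->cookie == curr && curr->slice_granted) {
        curr->need_resched = true;
        return true;
    }
    return false;
}
```

If task A arms the timer and is then migrated, the expiry finds some other task B running on this CPU, the cookie does not match, and B is left alone.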
>> + return HRTIMER_NORESTART;
>> +}
>> +
> …
>> +static void rseq_cancel_slice_extension_timer(void)
>> +{
>> + struct slice_timer *st = this_cpu_ptr(&slice_timer);
>> +
>> + /*
>> + * st->cookie can be safely read as preemption is disabled and the
>> + * timer is CPU local. The active check can obviously race with the
>> + * hrtimer interrupt, but that's better than disabling interrupts
>> + * unconditionally right away.
>> + *
>> + * As this is most probably the first expiring timer, the cancel is
>> + * expensive as it has to reprogram the hardware, but that's less
>> + * expensive than going through a full hrtimer_interrupt() cycle
>> + * for nothing.
>> + *
>> + * hrtimer_try_to_cancel() is sufficient here as with interrupts
>> + * disabled the timer callback cannot be running and the timer base
>> + * is well determined as the timer is pinned on the local CPU.
>> + */
>> + if (st->cookie == current && hrtimer_active(&st->timer)) {
>> + scoped_guard(irq)
>> + hrtimer_try_to_cancel(&st->timer);
>
> I don't see why hrtimer_active() and IRQ-disable is a benefit here.
> Unless you want to avoid a branch to hrtimer_try_to_cancel().
>
> The function has its own hrtimer_active() check and disables interrupts
> while accessing the hrtimer_base lock. Since preemption is disabled,
> st->cookie remains stable.
> It can fire right after the hrtimer_active() here. You could just
>
> if (st->cookie == current)
> hrtimer_try_to_cancel(&st->timer);
>
> at the expense of a branch to hrtimer_try_to_cancel() if the timer
> already expired (no interrupts off/on).
That's not equivalent. As this is CPU local the interrupt disable
ensures that the timer is not running on this CPU. Otherwise you need
hrtimer_cancel(). Read the comment. :)

If it fired already, then the task is reaching this code too
late. Nothing to see there.
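
For reference, the difference comes down to the three outcomes of hrtimer_try_to_cancel(): 1 if an active timer was cancelled, 0 if the timer was not active, and -1 if the callback is currently executing and cannot be stopped. The sketch below is a user-space model of that contract only. enum model_state and model_try_to_cancel() are hypothetical names, and the state is set by hand instead of by a real timer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical timer states for the model, not kernel definitions. */
enum model_state {
    MODEL_INACTIVE,          /* not queued, callback not running */
    MODEL_ENQUEUED,          /* queued, waiting to expire */
    MODEL_CALLBACK_RUNNING   /* callback executing right now */
};

/*
 * Models the documented return values of hrtimer_try_to_cancel():
 *    1  timer was active and has been cancelled
 *    0  timer was not active
 *   -1  callback is running, the timer cannot be stopped
 */
static int model_try_to_cancel(enum model_state *state)
{
    assert(state != NULL);

    switch (*state) {
    case MODEL_CALLBACK_RUNNING:
        return -1;
    case MODEL_ENQUEUED:
        *state = MODEL_INACTIVE;
        return 1;
    default:
        return 0;
    }
}
```

With the timer pinned to the local CPU and interrupts disabled there, the MODEL_CALLBACK_RUNNING case cannot be observed, so a single try is enough. Without that guarantee the -1 case has to be handled, which is what hrtimer_cancel() does by waiting for the callback to finish.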
>> + .extra1 = (unsigned int *)&rseq_slice_ext_nsecs_min,
>> + .extra2 = (unsigned int *)&rseq_slice_ext_nsecs_max,
> …
>
> maybe +
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index f3ee807b5d8b3..ed34d21ed94e4 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1228,6 +1228,12 @@ reboot-cmd (SPARC only)
> ROM/Flash boot loader. Maybe to tell it what to do after
> rebooting. ???
>
> +rseq_slice_extension_nsec
> +=========================
> +
> +A task may ask to delay its scheduling if it is in a critical section via the
> +prctl(PR_RSEQ_SLICE_EXTENSION_SET) mechanism. This sets the maximum allowed
> +extension in nanoseconds before a mandatory scheduling of the task is forced.
Yes. Forgot about it as I already documented it in the time slice
extension docs. Let me add that.
Thanks,
tglx