linux-kernel - Re: [patch V3 08/12] rseq: Implement time slice extension enforcement timer

Message-ID: <87ms59tmnm.ffs@tglx>
Date: Wed, 29 Oct 2025 22:37:17 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra
 <peterz@...radead.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, "Paul E. McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>,
 Prakash Sangappa <prakash.sangappa@...cle.com>, Madadi Vineeth Reddy
 <vineethr@...ux.ibm.com>, K Prateek Nayak <kprateek.nayak@....com>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Arnd Bergmann
 <arnd@...db.de>, linux-arch@...r.kernel.org
Subject: Re: [patch V3 08/12] rseq: Implement time slice extension
 enforcement timer

On Wed, Oct 29 2025 at 14:45, Steven Rostedt wrote:
> On Wed, 29 Oct 2025 14:22:26 +0100 (CET)
> Thomas Gleixner <tglx@...utronix.de> wrote:
>>  rseq_exit_to_user_mode_restart(struct pt_regs *regs, unsigned long ti_work)
>>  {
>> -	if (likely(!test_tif_rseq(ti_work)))
>> -		return false;
>> -
>> -	if (unlikely(__rseq_exit_to_user_mode_restart(regs))) {
>> -		current->rseq.event.slowpath = true;
>> -		set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
>> -		return true;
>> +	if (unlikely(test_tif_rseq(ti_work))) {
>> +		if (unlikely(__rseq_exit_to_user_mode_restart(regs))) {
>> +			current->rseq.event.slowpath = true;
>> +			set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
>> +			return true;
>
> Just to make sure I understand this. By setting TIF_NOTIFY_RESUME and
> returning true it can still comeback to set the timer?

No. TIF_NOTIFY_RESUME is only set when the access faults or when the user
space memory is corrupted, and the grant is moot in that case.

But if TIF_RSEQ is set then a previously granted extension is revoked
anyway because that means:

        granted();
        ---> preemption (possibly migration): sets TIF_RSEQ
           schedule()
        rseq_exit_to_user_mode_restart()
           if (TIF_RSEQ is set)
              handle_rseq()
                 revoke_grant()
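
To make the ordering concrete, here is a minimal user-space sketch of that
sequence. It is a toy model and not the kernel code: struct toy_task and the
toy_*() helpers are made-up names, and the only thing it assumes is what is
stated above, namely that pending rseq work on the way out revokes a
previously granted extension, so no timer is armed for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names exist in the kernel. */
struct toy_task {
	bool tif_rseq;		/* stands in for TIF_RSEQ */
	bool granted;		/* a slice extension is currently granted */
	bool timer_armed;	/* the enforcement timer was armed */
};

/* Preemption (possibly with migration) raises the rseq flag. */
static void toy_preempt(struct toy_task *t)
{
	assert(t != NULL);
	t->tif_rseq = true;
}

/*
 * Models the exit to user space: pending rseq work revokes the grant,
 * and the timer is only armed for a grant which survived that step.
 */
static void toy_exit_to_user(struct toy_task *t)
{
	assert(t != NULL);
	if (t->tif_rseq) {
		t->tif_rseq = false;
		t->granted = false;	/* revoke_grant() in the flow above */
	}
	if (t->granted)
		t->timer_armed = true;
}
```

A task which was granted an extension and then gets preempted leaves
toy_exit_to_user() with granted == false and timer_armed == false. Without
the preemption the grant survives and the timer is armed.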
                 
> I guess this also begs the question of if user space can use both the
> restartable sequences at the same time as requesting an extended time slice?

It can and that actually makes sense.

       enter_cs()
         request_grant()
         set_cs()
         ...

interrupt
        set_need_resched()
        exit_to_user_mode()
           if (need_resched())
              grant_extension() // clears NEED_RESCHED
           ...
        rseq_exit_to_user_mode_restart()
           if (TIF_RSEQ is set)  // Branch not taken
              ...
           arm_timer()
        return_to_user()

        leave_cs()
          if (granted)
             sys_rseq_sched_yield()

which means the extension grant prevented the critical section from being
aborted. If the extension is not granted, or is revoked, then this behaves
like a regular RSEQ CS abort.
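
The two outcomes can be played through with a small user-space simulation.
This is a sketch under stated assumptions, not the real ABI: struct toy_state
and the toy_*() functions are hypothetical, the interrupt and the exit path
are folded into one function, and the yield is just a flag instead of a
syscall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. None of these names exist in the kernel or in the rseq ABI. */
struct toy_state {
	bool requested;		/* user space asked for an extension */
	bool granted;		/* the request was granted */
	bool need_resched;	/* stands in for NEED_RESCHED */
	bool tif_rseq;		/* stands in for TIF_RSEQ */
	bool in_cs;		/* inside an rseq critical section */
	bool cs_aborted;	/* the critical section was aborted */
	bool yielded;		/* user space yielded after a grant */
};

static void toy_enter_cs(struct toy_state *s, bool want_extension)
{
	assert(s != NULL);
	s->requested = want_extension;
	s->in_cs = true;
}

/* The interrupt sets need_resched, then the exit to user space runs. */
static void toy_interrupt_and_exit(struct toy_state *s)
{
	s->need_resched = true;

	/* A pending request is granted, which clears need_resched. */
	if (s->need_resched && s->requested) {
		s->requested = false;
		s->granted = true;
		s->need_resched = false;
	}

	/* No grant: the task is scheduled out, which raises the rseq flag. */
	if (s->need_resched) {
		s->need_resched = false;
		s->tif_rseq = true;
	}

	/* Pending rseq work revokes any grant and aborts the section. */
	if (s->tif_rseq) {
		s->tif_rseq = false;
		s->granted = false;
		if (s->in_cs) {
			s->cs_aborted = true;
			s->in_cs = false;
		}
	}
}

static void toy_leave_cs(struct toy_state *s)
{
	s->in_cs = false;
	if (s->granted) {
		s->granted = false;
		s->yielded = true;	/* the yield syscall in the flow above */
	}
}
```

With want_extension == true the section completes and the task yields at
the end. With want_extension == false the same interrupt results in a plain
critical section abort and no yield.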

>> +	 * This check prevents that a granted time slice extension exceeds
>
>	   This check prevents a granted time slice ...
>
>> +	 * the maximum scheduling latency when the grant expired before

I'm not a native speaker, but your suggested edit is bogus. Let me
put it into the full sentence:

	   This check prevents a granted time slice extension exceeds
           the maximum ....

Can you spot the fail?

>> +	/*
>> +	 * Store the task pointer as a cookie for comparison in the timer
>> +	 * function. This is safe as the timer is CPU local and cannot be
>> +	 * in the expiry function at this point.
>> +	 */
>
> I'm just curious in this scenario:
>
>   1) Task A requests an extension and is granted.
>       st->cookie = Task A
>       hrtimer_start();
>
>   2) Before getting back to user space, a RT kernel thread wakes up and
>      preempts Task A. Does this clear the timer?

No.

>   3) RT kernel thread finishes but then schedules Task B within the expiry.
>
>   4) Task B requests an extension (assuming it had a short time slice that
>      allowed it to end before the expiry of the original timer).
>
> I guess it doesn't matter that st->cookie = Task B, as Task A was already
> scheduled out. But would calling hrtimer_start() on an existing timer cause
> any issue?

No. The timer is canceled and reprogrammed.

> I guess it doesn't matter as it looks like the code in hrtimer_start() does
> indeed remove an existing timer.

You guessed right :)
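
For illustration, the Task A / Task B scenario as a toy model of a per-CPU
timer with a cookie. This is only a sketch: struct toy_timer and the
toy_timer_*() helpers are invented for this mail and merely mimic the two
properties discussed here, i.e. starting an armed timer replaces the pending
expiry, and only the task recorded in the cookie cancels it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. This is not the hrtimer API. */
struct toy_timer {
	bool armed;
	long expires;
	const void *cookie;	/* task which armed the timer last */
	int reprogrammed;	/* how often an armed timer was replaced */
};

/* Starting an armed timer drops the old expiry and queues the new one. */
static void toy_timer_start(struct toy_timer *tm, const void *task, long expires)
{
	assert(tm != NULL);
	if (tm->armed) {
		tm->armed = false;
		tm->reprogrammed++;
	}
	tm->cookie = task;
	tm->expires = expires;
	tm->armed = true;
}

/* Cancel only when the caller is the task stored in the cookie. */
static bool toy_timer_cancel(struct toy_timer *tm, const void *task)
{
	bool was_armed;

	assert(tm != NULL);
	if (tm->cookie != task)
		return false;

	was_armed = tm->armed;
	tm->armed = false;
	return was_armed;
}
```

If Task A arms the timer and Task B arms it again before the expiry, the
timer ends up with Task B's expiry and cookie. A later cancel attempt by
Task A is a no-op, so the timer stays armed until Task B cancels it or it
fires.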

>> +	st->cookie = curr;
>> +	hrtimer_start(&st->timer, curr->rseq.slice.expires, HRTIMER_MODE_ABS_PINNED_HARD);
>> +	/* Arm the syscall entry work */
>> +	set_task_syscall_work(curr, SYSCALL_RSEQ_SLICE);
>> +	return false;
>> +}
>> +
>> +static void rseq_cancel_slice_extension_timer(void)
>> +{
>> +	struct slice_timer *st = this_cpu_ptr(&slice_timer);
>> +
>> +	/*
>> +	 * st->cookie can be safely read as preemption is disabled and the
>> +	 * timer is CPU local.
>> +	 *
>> +	 * As this is most probably the first expiring timer, the cancel is
>
>            As this is probably the first ...
>
>> +	 * expensive as it has to reprogram the hardware, but that's less
>> +	 * expensive than going through a full hrtimer_interrupt() cycle
>> +	 * for nothing.
>> +	 *
>> +	 * hrtimer_try_to_cancel() is sufficient here as the timer is CPU
>> +	 * local and once the hrtimer code disabled interrupts the timer
>> +	 * callback cannot be running.
>> +	 */
>> +	if (st->cookie == current)
>> +		hrtimer_try_to_cancel(&st->timer);
>
> If the above scenario did happen, the timer will go off as
> st->cookie == current would likely be false?
>
> Hmm, if it does go off and the task did schedule back in, would it get its
> need_resched set? This is a very unlikely scenario thus I guess it doesn't
> really matter.

Correct.

> I'm just thinking about corner cases and how it could affect this code and
> possibly cause noticeable issues.

Right. That corner case exists and there is not much to be done about it
unless you inflict the timer cancellation on schedule(), which is not
an option at all.

> -- Steve

/me trims 50+ lines of pointless quotation.

Thanks,

        tglx