Message-ID: <87wm4guqnb.ffs@tglx>
Date: Mon, 27 Oct 2025 19:48:56 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra
 <peterz@...radead.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, "Paul E. McKenney" <paulmck@...nel.org>,
 Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>,
 Prakash Sangappa <prakash.sangappa@...cle.com>, Madadi Vineeth Reddy
 <vineethr@...ux.ibm.com>, K Prateek Nayak <kprateek.nayak@....com>, Steven
 Rostedt <rostedt@...dmis.org>, Arnd Bergmann <arnd@...db.de>,
 linux-arch@...r.kernel.org
Subject: Re: [patch V2 00/12] rseq: Implement time slice extension mechanism

On Mon, Oct 27 2025 at 18:30, Sebastian Andrzej Siewior wrote:

> |       slice_test-2903    [001] d..2.  2313.285484: hrtimer_cancel: hrtimer=0000000030a688cc
> extension granted, timer started and revoked and set need resched.
>
> |       slice_test-2903    [001] dN.2.  2313.285487: sched_stat_runtime: comm=slice_test pid=2903 runtime=36886 [ns]
> This is coming from schedule() already. It took me a while since I was
> hunting a missing clear of need-resched.
>
> |       slice_test-2903    [001] d..2.  2313.285489: sched_switch: prev_comm=slice_test prev_pid=2903 prev_prio=120 prev_state=R+ ==> next_comm=ksoftirqd/1 next_pid=32 next_prio=120
> |      ksoftirqd/1-32      [001] ..s.1  2313.285490: softirq_entry: vec=7 [action=SCHED]
> |      ksoftirqd/1-32      [001] ..s.1  2313.285501: softirq_exit: vec=7 [action=SCHED]
> |      ksoftirqd/1-32      [001] d..2.  2313.285502: sched_stat_runtime: comm=ksoftirqd/1 pid=32 runtime=16438 [ns]
> |      ksoftirqd/1-32      [001] d..2.  2313.285503: sched_switch: prev_comm=ksoftirqd/1 prev_pid=32 prev_prio=120 prev_state=S ==> next_comm=slice_test next_pid=2904 next_prio=120
> |       slice_test-2904    [001] .....  2313.285507: sys_enter: NR 230 (1, 0, 7f4692c7baa0, 0, 0, 0)
> |       slice_test-2904    [001] .....  2313.285507: hrtimer_setup: hrtimer=00000000f2d53899 clockid=CLOCK_MONOTONIC mode=REL
> |       slice_test-2904    [001] d..1.  2313.285507: hrtimer_start: hrtimer=00000000f2d53899 function=hrtimer_wakeup expires=2313208168792 softexpires=2313208118792 mode=REL
> |       slice_test-2904    [001] d..2.  2313.285508: sched_stat_runtime: comm=slice_test pid=2904 runtime=6149 [ns]
> |       slice_test-2904    [001] d..2.  2313.285510: sched_switch: prev_comm=slice_test prev_pid=2904 prev_prio=120 prev_state=S ==> next_comm=slice_test next_pid=2903 next_prio=120
> |       slice_test-2903    [001] .....  2313.285510: sys_enter: NR 470 (7fffc04f1ff0, c350, 11a0e0, 0, 7f4692e99000, 0)
>
> slice_test-2903 enters _now_ rseq_slice_yield() so it must have been in
> userland during the suppressed wake up at 2313.285457.
> But a few iterations later it turns out this trace event is recorded
> _after_ the rseq magic happens at sys_enter time. We entered
> rseq_slice_yield() a few cycles after the extension was granted. Buh.
> So it seems to work as intended but it is not obvious to tell from tracing
> why it does not work.

The syscall tracepoint fires _after_ syscall_trace_enter() has invoked
rseq_syscall_enter_work(), which cancels the timer and sets
NEED_RESCHED. The reschedule then happens immediately, right _after_
the preempt enable:

  syscall()
    do_syscall_64()
      syscall_enter_from_user_mode()
        syscall_enter_from_user_mode_work()
          syscall_trace_enter()
            rseq_syscall_enter_work()
              preempt_disable()
              hrtimer_try_to_cancel()
                remove_hrtimer()                <- tracepoint
              set_need_resched()
              preempt_enable()
                schedule()
            ...
            trace_sys_enter()                   <- tracepoint

Even if it did not reschedule immediately, the ordering would still be
reversed.
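
To make the ordering concrete, here is a small user-space toy model of
the call chain above. It is only a sketch, not kernel code: the
event_log structure, record(), index_of() and the model_*() functions
are hypothetical stand-ins, and the event names are simply copied from
the trace. It assumes that set_need_resched() emits no trace event of
its own and that the reschedule, when it happens, shows up as a single
sched_switch event.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8

/* Hypothetical stand-in for the trace buffer: events in emission order. */
struct event_log {
    const char *names[MAX_EVENTS];
    int count;
};

/* Append one event name to the log. */
void record(struct event_log *log, const char *name)
{
    assert(log->count < MAX_EVENTS);
    log->names[log->count++] = name;
}

/* Return the position of the first event with this name, or -1. */
int index_of(const struct event_log *log, const char *name)
{
    for (int i = 0; i < log->count; i++)
        if (strcmp(log->names[i], name) == 0)
            return i;
    return -1;
}

/*
 * Models rseq_syscall_enter_work(): the timer is cancelled (which hits
 * the hrtimer_cancel tracepoint), need-resched is set, and the preempt
 * enable may schedule right away.
 */
void model_rseq_syscall_enter_work(struct event_log *log, int resched_now)
{
    record(log, "hrtimer_cancel");   /* from remove_hrtimer() */
    /* set_need_resched(): assumed to emit no event in this model */
    if (resched_now)
        record(log, "sched_switch"); /* preempt_enable() -> schedule() */
}

/* Models syscall_trace_enter(): rseq work first, sys_enter tracepoint last. */
void model_syscall_trace_enter(struct event_log *log, int resched_now)
{
    model_rseq_syscall_enter_work(log, resched_now);
    record(log, "sys_enter");        /* trace_sys_enter() */
}
```

Calling model_syscall_trace_enter() with resched_now set to 1 logs
hrtimer_cancel, sched_switch, sys_enter, which matches the order seen in
the trace. With resched_now set to 0 the log is hrtimer_cancel,
sys_enter, so the timer cancellation still precedes the syscall entry
event.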

Thanks,

        tglx
