Message-ID: <6f33e6b7ad296f4fd0e9c089ac92e53c08cfd850.camel@redhat.com>
Date: Tue, 27 May 2025 16:35:04 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: Nam Cao <namcao@...utronix.de>
Cc: linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
linux-trace-kernel@...r.kernel.org, linux-doc@...r.kernel.org, Ingo Molnar
<mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Tomas Glozar
<tglozar@...hat.com>, Juri Lelli <jlelli@...hat.com>
Subject: Re: [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
On Tue, 2025-05-27 at 15:37 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:14AM +0200, Gabriele Monaco wrote:
> > Add a per-cpu monitor as part of the sched model:
> > * opid: operations with preemption and irq disabled
> > Monitor to ensure wakeup and need_resched occur with irq and
> > preemption disabled or in irq handlers.
>
> This monitor reports some warnings:
>
> $ perf record -e rv:error_opid --call-graph dwarf -a -- ./stress-epoll
> (stress-epoll program from
> https://github.com/rouming/test-tools/blob/master/stress-epoll.c)
>
Thanks for trying it out, and good to know about this stressor.
Unfortunately the cause is hard to tell from these stack traces alone,
but it's very likely a problem in the model.
I have a few ideas about where it could be, but I believe it's something
visible only on a physical machine (I haven't tested much on x86 bare
metal, only in VMs).
You're running on bare metal, right?
> $ perf script
> stress-epoll 315 [003] 527.674724: rv:error_opid: event
> preempt_disable not expected in the state preempt_disabled
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfba0d handle_preempt_disable+0x3d
> ([kernel.kallsyms])
> ffffffff9fdd32d0 __traceiter_preempt_disable+0x30
> ([kernel.kallsyms])
> ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
> ffffffff9fee6c1c vfs_write+0x12c ([kernel.kallsyms])
> ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
> ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
> ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77
> ([kernel.kallsyms])
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> 1937 thread_work+0x47 (/root/test-tools/stress-
> epoll)
> 891f4 start_thread+0x304 (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
> 10989b clone3+0x2b (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
>
> stress-epoll 318 [002] 527.674759: rv:error_opid: event
> preempt_disable not expected in the state disabled
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfba0d handle_preempt_disable+0x3d
> ([kernel.kallsyms])
> ffffffff9fdd32d0 __traceiter_preempt_disable+0x30
> ([kernel.kallsyms])
> ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
> ffffffffa0bec1aa _raw_spin_lock_irq+0x1a ([kernel.kallsyms])
> ffffffff9ff4fe73 eventfd_write+0x63 ([kernel.kallsyms])
> ffffffff9fee6be5 vfs_write+0xf5 ([kernel.kallsyms])
> ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
> ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
> ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77
> ([kernel.kallsyms])
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> 1937 thread_work+0x47 (/root/test-tools/stress-
> epoll)
> 891f4 start_thread+0x304 (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
> 10989b clone3+0x2b (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
>
> I'm not sure what I'm looking at here. Do you think these are kernel
> bugs,
> or the monitor is missing some corner cases?
>
As said, this is likely a missing corner case. I believe it has to do
with IRQs (which is what makes this monitor more complex than it could
be).
Thanks for the pointers, I'll try to reproduce it this way.
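
For readers unfamiliar with how such a report arises, here is a minimal
sketch of a deterministic-automaton monitor. It is an illustration
only: the transition table is hypothetical and much simpler than the
real opid model, and run_monitor() is a made-up helper, not part of
the rv code. It only shows that an event with no transition from the
current state produces a message in the same format as the traces
quoted above.

```python
# Toy deterministic-automaton monitor.
# ASSUMPTION: the states and transitions below are hypothetical and
# deliberately simplified; they are NOT the actual opid model.
TRANSITIONS = {
    ("enabled", "preempt_disable"): "preempt_disabled",
    ("preempt_disabled", "preempt_enable"): "enabled",
    ("enabled", "irq_disable"): "irq_disabled",
    ("irq_disabled", "irq_enable"): "enabled",
    ("preempt_disabled", "irq_disable"): "disabled",
    ("disabled", "irq_enable"): "preempt_disabled",
}


def run_monitor(events, initial="enabled"):
    """Feed events to the automaton; return one message per unexpected event."""
    state = initial
    errors = []
    for event in events:
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            # No transition is defined: either the system misbehaved or the
            # model lacks this corner case. This toy keeps the current state;
            # a real monitor may handle the error differently.
            errors.append(f"event {event} not expected in the state {state}")
            continue
        state = next_state
    return errors


if __name__ == "__main__":
    # A second preempt_disable arrives while the toy model is already in
    # preempt_disabled, so it is reported as unexpected.
    for message in run_monitor(["preempt_disable", "preempt_disable"]):
        print(message)
```

In this toy table, a sequence the model did not anticipate is
indistinguishable from a real violation, which is why a report like
the ones above can point at either the kernel or the model.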
Gabriele