Message-ID: <348528a9-7e1a-4aa7-8219-5cad81969137@paulmck-laptop>
Date: Fri, 14 Nov 2025 09:00:21 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Next Mailing List <linux-next@...r.kernel.org>,
yonghong.song@...ux.dev
Subject: Re: linux-next: manual merge of the rcu tree with the ftrace tree
On Fri, Nov 14, 2025 at 05:33:30PM +0100, Sebastian Andrzej Siewior wrote:
> On 2025-11-14 11:22:02 [-0500], Steven Rostedt wrote:
> > It's to match this code:
> >
> > --- a/include/linux/tracepoint.h
> > +++ b/include/linux/tracepoint.h
> > @@ -100,6 +100,25 @@ void for_each_tracepoint_in_module(struct module *mod,
> > }
> > #endif /* CONFIG_MODULES */
> >
> > +/*
> > + * BPF programs can attach to the tracepoint callbacks. But if the
> > + * callbacks are called with preemption disabled, the BPF programs
> > + * can cause quite a bit of latency. When PREEMPT_RT is enabled,
> > + * instead of disabling preemption, use srcu_fast_notrace() for
> > + * synchronization. As BPF programs that are attached to tracepoints
> > + * expect to stay on the same CPU, also disable migration.
> > + */
> > +#ifdef CONFIG_PREEMPT_RT
> > +extern struct srcu_struct tracepoint_srcu;
> > +# define tracepoint_sync() synchronize_srcu(&tracepoint_srcu);
> > +# define tracepoint_guard() \
> > + guard(srcu_fast_notrace)(&tracepoint_srcu); \
> > + guard(migrate)()
> > +#else
> > +# define tracepoint_sync() synchronize_rcu();
> > +# define tracepoint_guard() guard(preempt_notrace)()
> > +#endif
> > +
> >
> > Where in PREEMPT_RT we do not disable preemption around the tracepoint
> > callback, but in non-RT we do. Instead, PREEMPT_RT uses SRCU and migrate-disable.
>
> I appreciate the effort. I really do. But why can't we have SRCU on both
> configs?
Due to performance concerns for non-RT kernels and workloads, where we
really need preemption disabled.
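To make that concrete, here is a rough usage sketch of the two helpers
(invented call-site names, not the actual tracepoint.h code):

	/*
	 * Sketch only: example_do_trace() and example_release_probes()
	 * are hypothetical.  On RT, tracepoint_guard() takes
	 * tracepoint_srcu in the fast-notrace flavor and disables
	 * migration; on !RT it disables preemption (notrace).  Both
	 * are scope-based and released at the end of the function.
	 */
	static void example_do_trace(struct tracepoint_func *it)
	{
		tracepoint_guard();
		for (; it->func; it++)
			((void (*)(void *))it->func)(it->data);
	}

	static void example_release_probes(struct tracepoint_func *old)
	{
		tracepoint_sync();	/* wait for all current readers */
		kfree(old);
	}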
> Also why does tracepoint_guard() need to disable migration? The BPF
> program already disables migrations (see for instance
> bpf_prog_run_array()).
> This is true for RT and !RT. So there is no need to do it here.
The addition of migration disabling was in response to failures, which
it fixed, or at least greatly reduced the probability of. Let's see...
That migrate_disable() has been there since 2022, so the failures were
happening despite it. Adding Yonghong on CC for his perspective.
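For reference, the pattern Sebastian is pointing at looks roughly like
this (paraphrased from memory; see bpf_prog_run_array() in
include/linux/bpf.h for the real thing):

	/* Simplified: the real function also manages the bpf_run_ctx. */
	migrate_disable();
	item = &array->items[0];
	while ((prog = READ_ONCE(item->prog))) {
		ret &= run_prog(prog, ctx);
		item++;
	}
	migrate_enable();

So on RT, tracepoint_guard() would nest a second migrate_disable()
around this one, which is legal (the count nests) but redundant if the
BPF path were the only reason for the outer one.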
> > The migrate_disable in the syscall tracepoint (which gets called by the
> > system call version that doesn't disable migration, even in RT), needs to
> > disable migration so that the accounting that happens in:
> >
> > trace_event_buffer_reserve()
> >
> > matches what happens when that function gets called by a normal tracepoint
> > callback.
>
> buh. But this is something. If we know that the call chain does not
> disable migration, couldn't we just use a different function? I mean, we
> have tracing_gen_ctx_dec() and tracing_gen_ctx(). Wouldn't this work
> for migrate_disable(), too?
> Just in case we need it and cannot avoid it, see above.
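If I understand the suggestion, the precedent is tracing_gen_ctx_dec(),
which is roughly this (paraphrased from include/linux/trace_events.h):

	/*
	 * Compensate for a preempt_disable() that the caller has
	 * already done, so that the recorded preempt count matches
	 * what a normal tracepoint callback would see.
	 */
	static inline unsigned int tracing_gen_ctx_dec(void)
	{
		unsigned int trace_ctx;

		trace_ctx = tracing_gen_ctx();
		if (IS_ENABLED(CONFIG_PREEMPTION))
			trace_ctx--;
		return trace_ctx;
	}

and the idea would be a hypothetical analog for the migration count, so
that the syscall tracepoint would not need its own migrate_disable().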
On this, I must defer to the tracing experts. ;-)
Thanx, Paul