[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250921190056.2a17d4cc@batman.local.home>
Date: Sun, 21 Sep 2025 19:00:56 -0400
From: Steven Rostedt <rostedt@...nel.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Menglong Dong <menglong.dong@...ux.dev>, Peter Zijlstra
<peterz@...radead.org>, Menglong Dong <menglong8.dong@...il.com>,
jolsa@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
kees@...nel.org, samitolvanen@...gle.com, rppt@...nel.org, luto@...nel.org,
ast@...nel.org, andrii@...nel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org
Subject: Re: [PATCH] tracing: fgraph: Protect return handler from recursion
loop
On Sun, 21 Sep 2025 13:06:47 +0900
Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:
> >
> > Hi, the logic seems right, but the warning is triggered when
> > I try to run the bpf bench testing:
>
> Hmm, this is strange. Let me check why this happens.
>
> Thank you,
>
> >
> > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> > [ 139.509036] ------------[ cut here ]------------
> > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> > [ 139.509411] Modules linked in: virtio_net
> > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
> > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> > [ 139.512038] PKRU: 55555554
> > [ 139.512106] Call Trace:
> > [ 139.512177] <IRQ>
> > [ 139.512232] ? irq_exit_rcu+0x4/0xb0
> > [ 139.512322] return_to_handler+0x1e/0x50
> > [ 139.512422] ? idle_cpu+0x9/0x50
> > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
> > [ 139.512638] ? idle_cpu+0x9/0x50
> > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0
> > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
> > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
> > [ 139.513101] </IRQ>
> > [ 139.513168] <TASK>
> >
> > > +
> > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > trace.retval = ftrace_regs_get_return_value(fregs);
> > > #endif
> > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > }
> > > }
> > >
> > > + ftrace_test_recursion_unlock(bit);
> > > +out:
> > > /*
> > > * The ftrace_graph_return() may still access the current
> > > * ret_stack structure, we need to make sure the update of
Hmm, I wonder if this has to do with the "TRANSITION BIT". The
ftrace_test_recursion_trylock() allows one level of recursion. This is
to handle the case of an interrupt happening after the recursion bit is
set and traces something before it updates the context in the preempt
count. This would cause a false positive of the recursion test. To
handle this case, it allows a single level of recursion.
I originally was going to mention this, but I still can't see how this
would affect it. Because if the entry were to allow one level of
recursion, so would the exit. I still see the entry preventing the exit
to be called. But perhaps there's an combination that I missed?
-- Steve
Powered by blists - more mailing lists