Message-Id: <20250921130647.9bd0cba7d49b15d0b0ebe6f7@kernel.org>
Date: Sun, 21 Sep 2025 13:06:47 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Menglong Dong <menglong.dong@...ux.dev>
Cc: Peter Zijlstra <peterz@...radead.org>, Steven Rostedt
 <rostedt@...nel.org>, Menglong Dong <menglong8.dong@...il.com>,
 jolsa@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 kees@...nel.org, samitolvanen@...gle.com, rppt@...nel.org, luto@...nel.org,
 ast@...nel.org, andrii@...nel.org, linux-kernel@...r.kernel.org,
 bpf@...r.kernel.org
Subject: Re: [PATCH] tracing: fgraph: Protect return handler from recursion
 loop

On Sat, 20 Sep 2025 21:39:25 +0800
Menglong Dong <menglong.dong@...ux.dev> wrote:

> On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > 
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead to an unexpected recursion bug, as reported by Menglong.
> > 
> >  is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> >   -> kprobe_multi_link_exit_handler -> is_endbr.
> > 
> > To fix this issue, acquire ftrace_test_recursion_trylock() in
> > __ftrace_return_to_handler() after unwinding the shadow stack, marking
> > this section as one that must prevent recursive calls into fgraph from
> > the user-defined fgraph_ops::retfunc().
> > 
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. After that commit, fprobe
> > allows users to run arbitrary callbacks from fgraph.
> > 
> > Reported-by: Menglong Dong <menglong8.dong@...il.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@...r.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > ---
> >  kernel/trace/fgraph.c |   12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  	unsigned long bitmap;
> >  	unsigned long ret;
> >  	int offset;
> > +	int bit;
> >  	int i;
> >  
> >  	ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  	if (fregs)
> >  		ftrace_regs_set_instruction_pointer(fregs, ret);
> >  
> > +	bit = ftrace_test_recursion_trylock(trace.func, ret);
> > +	/*
> > +	 * This must be succeeded because the entry handler returns before
> > +	 * modifying the return address if it is nested. Anyway, we need to
> > +	 * avoid calling user callbacks if it is nested.
> > +	 */
> > +	if (WARN_ON_ONCE(bit < 0))
> > +		goto out;
> 
> Hi, the logic seems right, but the warning is triggered when
> I try to run the bpf bench testing:

Hmm, this is strange. Let me check why this happens.

Thank you,

> 
> $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> [   20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> [  139.509036] ------------[ cut here ]------------
> [  139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> [  139.509411] Modules linked in: virtio_net
> [  139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE 
> [  139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> [  139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> [  139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> [  139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> [  139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> [  139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> [  139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> [  139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> [  139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [  139.511532] FS:  00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> [  139.511724] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> [  139.512038] PKRU: 55555554
> [  139.512106] Call Trace:
> [  139.512177]  <IRQ>
> [  139.512232]  ? irq_exit_rcu+0x4/0xb0
> [  139.512322]  return_to_handler+0x1e/0x50
> [  139.512422]  ? idle_cpu+0x9/0x50
> [  139.512506]  ? sysvec_apic_timer_interrupt+0x69/0x80
> [  139.512638]  ? idle_cpu+0x9/0x50
> [  139.512731]  ? irq_exit_rcu+0x3a/0xb0
> [  139.512833]  ? ftrace_stub_direct_tramp+0x10/0x10
> [  139.512961]  ? sysvec_apic_timer_interrupt+0x69/0x80
> [  139.513101]  </IRQ>
> [  139.513168]  <TASK>
> 
> > +
> >  #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> >  	trace.retval = ftrace_regs_get_return_value(fregs);
> >  #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  		}
> >  	}
> >  
> > +	ftrace_test_recursion_unlock(bit);
> > +out:
> >  	/*
> >  	 * The ftrace_graph_return() may still access the current
> >  	 * ret_stack structure, we need to make sure the update of
> > 
> > 
> > 
> 
> 
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>
