linux-kernel - Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20250925075932.e42ead81f0c7d0ba6eb31511@kernel.org>
Date: Thu, 25 Sep 2025 07:59:32 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Steven Rostedt <rostedt@...nel.org>
Cc: Menglong Dong <menglong.dong@...ux.dev>, Peter Zijlstra
 <peterz@...radead.org>, Menglong Dong <menglong8.dong@...il.com>,
 jolsa@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 kees@...nel.org, samitolvanen@...gle.com, rppt@...nel.org, luto@...nel.org,
 ast@...nel.org, andrii@...nel.org, linux-kernel@...r.kernel.org,
 bpf@...r.kernel.org
Subject: Re: [PATCH] tracing: fgraph: Protect return handler from recursion
 loop

On Sun, 21 Sep 2025 19:00:56 -0400
Steven Rostedt <rostedt@...nel.org> wrote:

> On Sun, 21 Sep 2025 13:06:47 +0900
> Masami Hiramatsu (Google) <mhiramat@...nel.org> wrote:
> > > 
> > > Hi, the logic seems right, but the warning is triggered when
> > > I try to run the bpf bench testing:  
> > 
> > Hmm, this is strange. Let me check why this happens.
> > 
> > Thank you,
> > 
> > > 
> > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> > > [   20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> > > [  139.509036] ------------[ cut here ]------------
> > > [  139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> > > [  139.509411] Modules linked in: virtio_net
> > > [  139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE 
> > > [  139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> > > [  139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> > > [  139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> > > [  139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> > > [  139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> > > [  139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> > > [  139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> > > [  139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> > > [  139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > > [  139.511532] FS:  00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> > > [  139.511724] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> > > [  139.512038] PKRU: 55555554
> > > [  139.512106] Call Trace:
> > > [  139.512177]  <IRQ>
> > > [  139.512232]  ? irq_exit_rcu+0x4/0xb0
> > > [  139.512322]  return_to_handler+0x1e/0x50
> > > [  139.512422]  ? idle_cpu+0x9/0x50
> > > [  139.512506]  ? sysvec_apic_timer_interrupt+0x69/0x80
> > > [  139.512638]  ? idle_cpu+0x9/0x50
> > > [  139.512731]  ? irq_exit_rcu+0x3a/0xb0
> > > [  139.512833]  ? ftrace_stub_direct_tramp+0x10/0x10
> > > [  139.512961]  ? sysvec_apic_timer_interrupt+0x69/0x80
> > > [  139.513101]  </IRQ>
> > > [  139.513168]  <TASK>
> > >   
> > > > +
> > > >  #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > >  	trace.retval = ftrace_regs_get_return_value(fregs);
> > > >  #endif
> > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > >  		}
> > > >  	}
> > > >  
> > > > +	ftrace_test_recursion_unlock(bit);
> > > > +out:
> > > >  	/*
> > > >  	 * The ftrace_graph_return() may still access the current
> > > >  	 * ret_stack structure, we need to make sure the update of
> 
> 
> Hmm, I wonder if this has to do with the "TRANSITION BIT". The
> ftrace_test_recursion_trylock() allows one level of recursion. This is
> to handle the case of an interrupt happening after the recursion bit is
> set and traces something before it updates the context in the preempt
> count. This would cause a false positive of the recursion test. To
> handle this case, it allows a single level of recursion.
> 
> I originally was going to mention this, but I still can't see how this
> would affect it. Because if the entry were to allow one level of
> recursion, so would the exit. I still see the entry preventing the exit
> to be called. But perhaps there's an combination that I missed?

If the fgraph allows one level of recursion, fprobe should work around
functions called from fprobe's callback, such as is_endbr(), as follows:

/* probe on enter */
call target_func()
  => function_graph_enter_regs() /* 1st lock */
    -> fprobe_entry()
      -> user_callback()
        -> is_endbr()
          => function_graph_enter_regs() /* 2nd lock */
            -> fprobe_entry()
              -> user_callback()
                -> is_endbr()
                  => function_graph_enter_regs() /* lock failed */

/* probe on exit */
call target_func()
  => function_graph_enter_regs()
     -> ftrace_push_return_trace()
/* run target_func() */
=> __ftrace_return_to_handler() /* 1st lock */
  -> fprobe_exit()
    -> user_callback()
      -> is_endbr()
        => function_graph_enter_regs() /* 2nd lock */
          -> ftrace_push_return_trace()
         /* unlock 2nd */
      /* run is_endbr() */
      => __ftrace_return_to_handler() /* 2nd lock again */
        -> fprobe_exit()
          -> user_callback()
            -> is_endbr()
              => function_graph_enter_regs() /* lock failed */
            /* run is_endbr() */

So it can detect the recursion.

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>