Message-ID: <aNGnaylt_WNL6bZr@krava>
Date: Mon, 22 Sep 2025 21:45:47 +0200
From: Jiri Olsa <olsajiri@...il.com>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...nel.org>,
Menglong Dong <menglong8.dong@...il.com>, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, kees@...nel.org,
samitolvanen@...gle.com, rppt@...nel.org, luto@...nel.org,
ast@...nel.org, andrii@...nel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org
Subject: Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop

On Mon, Sep 22, 2025 at 03:38:13PM +0200, Jiri Olsa wrote:
> On Mon, Sep 22, 2025 at 03:16:55PM +0900, Masami Hiramatsu wrote:
> > On Sat, 20 Sep 2025 09:45:15 +0200
> > Jiri Olsa <olsajiri@...il.com> wrote:
> >
> > > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote:
> > > > On Fri, 19 Sep 2025 20:57:36 +0900
> > > > "Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:
> > > >
> > > > > From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> > > > >
> > > > > function_graph_enter_regs() prevents itself from recursion by
> > > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > > > > which is called at the exit, does not prevent such recursion.
> > > > > Therefore, while it can prevent recursive calls from
> > > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > > > > > This can lead to an unexpected recursion bug, as reported by Menglong.
> > > > >
> > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > > > > -> kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > So basically, if the handler for the return part calls something that it
> > > > is tracing, it can trigger the recursion?
> > > >
> > > > >
> > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in
> > > > > > __ftrace_return_to_handler() after unwinding the shadow stack, to mark
> > > > > > that this section must prevent recursive calls of fgraph inside the
> > > > > > user-defined fgraph_ops::retfunc().
> > > > >
> > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > > > > fprobe on function-graph tracer"), because before that fgraph was
> > > > > > only used from the function graph tracer. Fprobe allowed users to run
> > > > > > arbitrary callbacks from fgraph after that commit.
> > > >
> > > > I would actually say it's because before this commit, the return handler
> > > > callers never called anything that the entry handlers didn't already call.
> > > > If there was recursion, the entry handler would catch it (and the entry
> > > > tells fgraph if the exit handler should be called).
> > > >
> > > > The difference here is with fprobes, you can have the exit handler calling
> > > > functions that the entry handler does not, which exposes more cases where
> > > > recursion could happen.
> > >
> > > so IIUC we have return kprobe multi probe on is_endbr and now we do:
> > >
> > > is_endbr()
> > > { -> function_graph_enter_regs installs return probe
> > >   ...
> > > } -> __ftrace_return_to_handler
> > >        fprobe_return
> > >          kprobe_multi_link_exit_handler
> > >            is_endbr
> > >            { -> function_graph_enter_regs installs return probe
> > >              ...
> > >            } -> __ftrace_return_to_handler
> > >                   fprobe_return
> > >                     kprobe_multi_link_exit_handler
> > >                       is_endbr
> > >                       { -> function_graph_enter_regs installs return probe
> > >                         ...
> > >                       } -> __ftrace_return_to_handler
> > >                       ... recursion
> > >
> > >
> > > with the fix:
> > >
> > > is_endbr()
> > > { -> function_graph_enter_regs installs return probe
> > >   ...
> > > } -> __ftrace_return_to_handler
> > >        fprobe_return
> > >          kprobe_multi_link_exit_handler
> > >            ...
> > >            is_endbr
> > >            { -> function_graph_enter_regs
> > >                 ftrace_test_recursion_trylock fails and we do NOT install return probe
> > >              ...
> > >            }
> > >
> > >
> > > there's is_endbr call also in kprobe_multi_link_handler, but it won't
> > > trigger recursion, because function_graph_enter_regs already uses
> > > ftrace_test_recursion_trylock
> > >
> > >
> > > if above is correct then the fix looks good to me
> > >
> > > Acked-by: Jiri Olsa <jolsa@...nel.org>
> >
> > Hi Jiri,
> >
> > I found ftrace_test_recursion_trylock() allows one nest level, can you
> > make sure it is OK?

we have a nesting check in the kprobe multi layer, making sure
the bpf program will not nest into itself:

  kprobe_multi_link_prog_run
    bpf_prog_active check
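
To illustrate the idea only, here is a small user-space model of that kind
of nesting check. It is a sketch, not the kernel code: a plain global
counter stands in for the per-CPU bpf_prog_active, and every name in it
(prog_active, prog_body, prog_run_guarded, probed_function) is made up for
this example. The assumption it models is that the program runs only when
the counter goes from 0 to 1, and that a nested attempt is counted as a
miss and skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the per-CPU bpf_prog_active counter (one "CPU" only). */
static int prog_active;
static int prog_runs;   /* how often the program body really ran */
static int prog_misses; /* how often a nested attempt was skipped */

static void probed_function(void);

/* Hypothetical program body: it calls a function that is itself probed. */
static void prog_body(void)
{
	prog_runs++;
	probed_function();
}

/* Run the program only if we are the outermost entry on this "CPU". */
static bool prog_run_guarded(void)
{
	bool ran = false;

	if (++prog_active == 1) {
		prog_body();
		ran = true;
	} else {
		prog_misses++; /* nested entry: skip the program */
	}

	prog_active--;
	assert(prog_active >= 0); /* increments and decrements stay balanced */
	return ran;
}

/* Models a probed function whose exit handler would run the program. */
static void probed_function(void)
{
	prog_run_guarded();
}
```

Calling probed_function() once in this model runs the body once; the
attempt made from inside the body is skipped, and the counter is back at
zero afterwards, so the program never nests into itself.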
jirka