[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5bc5b9b07d997e299b048005826bc6d592265c67@linux.dev>
Date: Tue, 08 Oct 2024 02:17:57 +0000
From: jeff.xie@...ux.dev
To: "Steven Rostedt" <rostedt@...dmis.org>
Cc: mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
 linux-trace-kernel@...r.kernel.org, linux-kernel@...r.kernel.org,
 xiehuan09@...il.com, dolinux.peng@...il.com, chensong_2000@....cn
Subject: Re: [PATCH v3] ftrace: Get the true parent ip for function tracer
October 8, 2024 at 5:10 AM, "Steven Rostedt" <rostedt@...dmis.org> wrote:
> 
> On Sat, 5 Oct 2024 10:13:20 -0400
> 
> Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> > 
> > The crash happened here:
> > 
> 
> This has been bothering me all weekend so I dug more into it.
> 
> The reason it was bothering me is because we use current later on, and it
> 
> has no issue. But then I noticed the real bug!
> 
> I was confused because the crashed instruction pointer was in the
> 
> get_current() code. But that's not where the crash actually happened.
> 
> > 
> > static __always_inline unsigned long
> > 
> >  function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
> > 
> >  {
> > 
> >  unsigned long true_parent_ip;
> > 
> >  int idx = 0;
> > 
> >  
> > 
> >  true_parent_ip = parent_ip;
> > 
> >  if (unlikely(parent_ip == (unsigned long)&return_to_handler))
> > 
> >  true_parent_ip = ftrace_graph_ret_addr(current, &idx, parent_ip, <<<----- CRASH
> > 
> 
> That's not the crash
> 
> > 
> > (unsigned long *)fregs->regs.sp);
> > 
> 
> The above is!
> 
> fregs should *NEVER* be used directly. OK, I need to make:
> 
> struct ftrace_regs {
> 
>  void *nothing_here[];
> 
> };
> 
> Now, to stop bugs like this.
> 
> fregs is unique by architecture, and may not even be defined! That is, it
> 
> can pass NULL for fregs. And only x86 has fregs->regs available. Other
> 
> archs do not.
Thanks, I just saw this comment from include/linux/ftrace.h
 * NOTE: user *must not* access regs directly, only do it via APIs, because
 * the member can be changed according to the architecture.
 */
struct ftrace_regs {
        struct pt_regs          regs;
};
> 
> You must use the fregs helper functions, thus the above can be:
> 
> static __always_inline unsigned long
> 
> function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
> 
> {
> 
>  unsigned long true_parent_ip;
> 
>  int idx = 0;
> 
>  true_parent_ip = parent_ip;
> 
>  if (unlikely(parent_ip == (unsigned long)&return_to_handler) && fregs)
> 
>  true_parent_ip = ftrace_graph_ret_addr(current, &idx, parent_ip,
> 
>  (unsigned long *)ftrace_regs_get_stack_pointer(fregs));
> 
>  return true_parent_ip;
> 
> }
> 
> So you can make a v5 with this update. And I'll go and make the empty
> 
> ftrace_regs structure.
Thanks, I will update the patch.
> 
> Thanks!
> 
> -- Steve
> 
> > 
> > return true_parent_ip;
> > 
> >  }
> > 
> >  
> > 
> >  It appears that on some archs (x86 32 bit) the function tracer can be
> > 
> >  called when "current" is not set up yet, and can crash when accessing it.
> > 
> >  
> > 
> >  So perhaps we need to add:
> > 
> >  
> > 
> >  #ifdef CONFIG_ARCH_WANTS_NO_INSTR
> > 
> >  static __always_inline unsigned long
> > 
> >  function_get_true_parent_ip(unsigned long parent_ip, struct ftrace_regs *fregs)
> > 
> >  {
> > 
> >  unsigned long true_parent_ip;
> > 
> >  int idx = 0;
> > 
> >  
> > 
> >  true_parent_ip = parent_ip;
> > 
> >  if (unlikely(parent_ip == (unsigned long)&return_to_handler))
> > 
> >  true_parent_ip = ftrace_graph_ret_addr(current, &idx, parent_ip, <<<----- CRASH
> > 
> >  (unsigned long *)fregs->regs.sp);
> > 
> >  return true_parent_ip;
> > 
> >  }
> > 
> >  #else
> > 
> >  # define function_get_true_parent_ip(parent_ip, fregs) parent_ip
> > 
> >  #endif
> > 
> >  
> > 
> >  That is, if the arch has noinstr implemented, it should always be safe
> > 
> >  to access current, but if not, then there's no guarantee.
> > 
> >  
> > 
> >  ?
> >
>
Powered by blists - more mailing lists
 
