Message-ID: <CAEf4BzbeE6n7E6K8_dhZ26ZHoVsz8V9mUSxm3CYzz2npmdpbiQ@mail.gmail.com>
Date: Sun, 27 Oct 2024 18:23:02 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, 
	linux-kernel@...r.kernel.org, Michael Jeanson <mjeanson@...icios.com>, 
	Peter Zijlstra <peterz@...radead.org>, Alexei Starovoitov <ast@...nel.org>, Yonghong Song <yhs@...com>, 
	"Paul E . McKenney" <paulmck@...nel.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Namhyung Kim <namhyung@...nel.org>, 
	bpf@...r.kernel.org, Joel Fernandes <joel@...lfernandes.org>, 
	Jordan Rife <jrife@...gle.com>
Subject: Re: [RFC PATCH v3 2/3] tracing: Introduce tracepoint_is_syscall()

On Sun, Oct 27, 2024 at 7:19 AM Masami Hiramatsu <mhiramat@...nel.org> wrote:
>
> On Sat, 26 Oct 2024 20:08:40 -0400
> Steven Rostedt <rostedt@...dmis.org> wrote:
>
> > On Sat, 26 Oct 2024 11:46:28 -0400
> > Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> >
> > > Introduce a "syscall" flag within the extended structure to know whether
> > > a tracepoint needs rcu tasks trace grace period before reclaim.
> > > This can be queried using tracepoint_is_syscall().
> > >
> > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> > > Cc: Michael Jeanson <mjeanson@...icios.com>
> > > Cc: Steven Rostedt <rostedt@...dmis.org>
> > > Cc: Masami Hiramatsu <mhiramat@...nel.org>
> > > Cc: Peter Zijlstra <peterz@...radead.org>
> > > Cc: Alexei Starovoitov <ast@...nel.org>
> > > Cc: Yonghong Song <yhs@...com>
> > > Cc: Paul E. McKenney <paulmck@...nel.org>
> > > Cc: Ingo Molnar <mingo@...hat.com>
> > > Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
> > > Cc: Mark Rutland <mark.rutland@....com>
> > > Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
> > > Cc: Namhyung Kim <namhyung@...nel.org>
> > > Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>
> > > Cc: bpf@...r.kernel.org
> > > Cc: Joel Fernandes <joel@...lfernandes.org>
> > > Cc: Jordan Rife <jrife@...gle.com>
> > > ---
> > >  include/linux/tracepoint-defs.h |  2 ++
> > >  include/linux/tracepoint.h      | 24 ++++++++++++++++++++++++
> > >  include/trace/define_trace.h    |  2 +-
> > >  3 files changed, 27 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h
> > > index 967c08d9da84..53119e074c87 100644
> > > --- a/include/linux/tracepoint-defs.h
> > > +++ b/include/linux/tracepoint-defs.h
> > > @@ -32,6 +32,8 @@ struct tracepoint_func {
> > >  struct tracepoint_ext {
> > >     int (*regfunc)(void);
> > >     void (*unregfunc)(void);
> > > +   /* Flags. */
> > > +   unsigned int syscall:1;
> >
> > I wonder if we should call it "sleepable" instead? For this patch set
> > do we really care if it's a system call or not? It's really if the
> > tracepoint is sleepable or not that's the issue. System calls are just
> > one user of it, there may be more in the future, and the changes to BPF
> > will still be needed.
>
> I agree with this. Even if we currently allow only syscall events to
> sleep, "tracepoint_is_syscall()" requires adding a comment explaining
> why at every call site, e.g.
>

+1 to naming this "sleepable" (or at least "faultable"). The BPF world
uses "sleepable BPF" terminology for BPF programs and attachment hooks
that can take page faults (and sleep while waiting for those to be
handled), so this would be consistent with that. Also, from the BPF
standpoint this will be advertised as attaching to sleepable
tracepoints regardless, so "syscall" terminology is too specific and
misleading: while the current set of such tracepoints is
syscall-specific, the important part is taking page faults, not
tracing syscalls.
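
As a rough illustration of the naming discussed here, the following is a
minimal userspace sketch, not the actual patch. It assumes the flag is
renamed from "syscall" to "sleepable" and the helper becomes
tracepoint_is_sleepable(), as suggested in this thread. The structures
are simplified stand-ins for the kernel ones, and the reclaim_kind enum
and tracepoint_reclaim_kind() helper are hypothetical names that only
mirror the call_rcu() vs. call_rcu_tasks_trace() choice described in the
patch comment.

```c
/* Userspace sketch: simplified stand-ins for the kernel structures. */
#include <stdbool.h>
#include <stddef.h>

struct tracepoint_ext {
	int (*regfunc)(void);
	void (*unregfunc)(void);
	/* Flags. "sleepable" is the rename proposed in this thread. */
	unsigned int sleepable:1;
};

/* Simplified: the real struct tracepoint has more members. */
struct tracepoint {
	const char *name;
	struct tracepoint_ext *ext;	/* may be NULL */
};

/* True if probes attached to this tracepoint may take page faults. */
static inline bool tracepoint_is_sleepable(const struct tracepoint *tp)
{
	return tp->ext && tp->ext->sleepable;
}

/* Hypothetical helper: which grace period batch reclaim would need. */
enum reclaim_kind {
	RECLAIM_RCU,			/* would use call_rcu() */
	RECLAIM_RCU_TASKS_TRACE,	/* would use call_rcu_tasks_trace() */
};

static inline enum reclaim_kind
tracepoint_reclaim_kind(const struct tracepoint *tp)
{
	return tracepoint_is_sleepable(tp) ? RECLAIM_RCU_TASKS_TRACE
					   : RECLAIM_RCU;
}
```

With this naming, a call site reads as a question about sleepability
and needs no comment tying the check back to syscalls.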


>  /*
>   * Syscall events are the only sleepable events, so we check whether
>   * this is a syscall event to decide whether it is sleepable.
>   */
>
> If it were called tracepoint_is_sleepable(), we would not need such a comment.
>
> Thank you,
>
> >
> > Other than that, I think this could work.
> >
> > -- Steve
> >
> >
> > >  };
> > >
> > >  struct tracepoint {
> > > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> > > index 83dc24ee8b13..93e70bc64533 100644
> > > --- a/include/linux/tracepoint.h
> > > +++ b/include/linux/tracepoint.h
> > > @@ -104,6 +104,12 @@ void for_each_tracepoint_in_module(struct module *mod,
> > >   * tracepoint_synchronize_unregister must be called between the last tracepoint
> > >   * probe unregistration and the end of module exit to make sure there is no
> > >   * caller executing a probe when it is freed.
> > > + *
> > > + * An alternative is to use the following for batch reclaim associated
> > > + * with a given tracepoint:
> > > + *
> > > + * - tracepoint_is_syscall() == false: call_rcu()
> > > + * - tracepoint_is_syscall() == true:  call_rcu_tasks_trace()
> > >   */
> > >  #ifdef CONFIG_TRACEPOINTS
> > >  static inline void tracepoint_synchronize_unregister(void)
> > > @@ -111,9 +117,17 @@ static inline void tracepoint_synchronize_unregister(void)
> > >     synchronize_rcu_tasks_trace();
> > >     synchronize_rcu();
> > >  }
> > > +static inline bool tracepoint_is_syscall(struct tracepoint *tp)
> > > +{
> > > +   return tp->ext && tp->ext->syscall;
> > > +}
> > >  #else
> > >  static inline void tracepoint_synchronize_unregister(void)
> > >  { }
> > > +static inline bool tracepoint_is_syscall(struct tracepoint *tp)
> > > +{
> > > +   return false;
> > > +}
> > >  #endif
> > >
> > >  #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
> > > @@ -345,6 +359,15 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> > >     struct tracepoint_ext __tracepoint_ext_##_name = {              \
> > >             .regfunc = _reg,                                        \
> > >             .unregfunc = _unreg,                                    \
> > > +           .syscall = false,                                       \
> > > +   };                                                              \
> > > +   __DEFINE_TRACE_EXT(_name, &__tracepoint_ext_##_name, PARAMS(_proto), PARAMS(_args));
> > > +
> > > +#define DEFINE_TRACE_SYSCALL(_name, _reg, _unreg, _proto, _args)   \
> > > +   struct tracepoint_ext __tracepoint_ext_##_name = {              \
> > > +           .regfunc = _reg,                                        \
> > > +           .unregfunc = _unreg,                                    \
> > > +           .syscall = true,                                        \
> > >     };                                                              \
> > >     __DEFINE_TRACE_EXT(_name, &__tracepoint_ext_##_name, PARAMS(_proto), PARAMS(_args));
> > >
> > > @@ -389,6 +412,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> > >  #define __DECLARE_TRACE_SYSCALL    __DECLARE_TRACE
> > >
> > >  #define DEFINE_TRACE_FN(name, reg, unreg, proto, args)
> > > +#define DEFINE_TRACE_SYSCALL(name, reg, unreg, proto, args)
> > >  #define DEFINE_TRACE(name, proto, args)
> > >  #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
> > >  #define EXPORT_TRACEPOINT_SYMBOL(name)
> > > diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
> > > index ff5fa17a6259..63fea2218afa 100644
> > > --- a/include/trace/define_trace.h
> > > +++ b/include/trace/define_trace.h
> > > @@ -48,7 +48,7 @@
> > >
> > >  #undef TRACE_EVENT_SYSCALL
> > >  #define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign, print, reg, unreg) \
> > > -   DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
> > > +   DEFINE_TRACE_SYSCALL(name, reg, unreg, PARAMS(proto), PARAMS(args))
> > >
> > >  #undef TRACE_EVENT_NOP
> > >  #define TRACE_EVENT_NOP(name, proto, args, struct, assign, print)
> >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@...nel.org>
