[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZsMwyO1Tv6BsOyc-@krava>
Date: Mon, 19 Aug 2024 13:47:20 +0200
From: Jiri Olsa <olsajiri@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Jiri Olsa <olsajiri@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Juri Lelli <juri.lelli@...hat.com>, bpf <bpf@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Artem Savkov <asavkov@...hat.com>,
"Jose E. Marchesi" <jose.marchesi@...cle.com>
Subject: Re: NULL pointer deref when running BPF monitor program (6.11.0-rc1)
On Fri, Aug 16, 2024 at 03:30:40PM -0400, Steven Rostedt wrote:
> On Fri, 16 Aug 2024 20:59:47 +0200
> Jiri Olsa <olsajiri@...il.com> wrote:
>
> > so far the only working solution I have is adding '__nullable' suffix
> > to argument name:
> >
> > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > index 9ea4c404bd4e..fc46f0b42741 100644
> > --- a/include/trace/events/sched.h
> > +++ b/include/trace/events/sched.h
> > @@ -559,9 +559,9 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
> > */
> > TRACE_EVENT(sched_pi_setprio,
> >
> > - TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task),
> > + TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task__nullable),
> >
> > - TP_ARGS(tsk, pi_task),
> > + TP_ARGS(tsk, pi_task__nullable),
> >
> > TP_STRUCT__entry(
> > __array( char, comm, TASK_COMM_LEN )
> > @@ -574,8 +574,8 @@ TRACE_EVENT(sched_pi_setprio,
> > memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
> > __entry->pid = tsk->pid;
> > __entry->oldprio = tsk->prio;
> > - __entry->newprio = pi_task ?
> > - min(tsk->normal_prio, pi_task->prio) :
> > + __entry->newprio = pi_task__nullable ?
> > + min(tsk->normal_prio, pi_task__nullable->prio) :
> > tsk->normal_prio;
> > /* XXX SCHED_DEADLINE bits missing */
> > ),
> >
> >
> > now I'm trying to make work something like:
> >
> > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > index 9ea4c404bd4e..4e4aae2d5700 100644
> > --- a/include/trace/events/sched.h
> > +++ b/include/trace/events/sched.h
> > @@ -559,9 +559,9 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
> > */
> > TRACE_EVENT(sched_pi_setprio,
> >
> > - TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task),
> > + TP_PROTO(struct task_struct *tsk, struct task_struct *__nullable(pi_task)),
> >
> > - TP_ARGS(tsk, pi_task),
> > + TP_ARGS(tsk, __nullable(pi_task)),
> >
> > TP_STRUCT__entry(
> > __array( char, comm, TASK_COMM_LEN )
>
> Hmm, that's really ugly though. Both versions.
>
> Now when Alexei said:
>
> > > > > > We cannot make all tracepoint pointers to be PTR_TRUSTED | PTR_MAYBE_NULL
> > > > > > by default, since it will break a bunch of progs.
> > > > > > Instead we can annotate this tracepoint arg as __nullable and
> > > > > > teach the verifier to recognize such special arguments of tracepoints.
>
> I'm not familiar with the verifier, so I don't know how the above is
> implemented, and why it would break a bunch of progs.
verifier assumes that programs attached to the tracepoint can access
pointer arguments without checking them for null and some of those
programs most likely access such arguments directly
changing that globally and require bpf program to do null check for all
pointer arguments will make verifier fail to load existing programs
>
> If you had a macro around the parameter:
>
> TP_PROTO(struct task_struct *tsk, struct task_struct *__nullable(pi_task)),
>
> Could having that go through another macro pass in trace_events.h work?
> That is, could we associate the trace event with "nullable" parameters
> that could be stored someplace else for you?
IIUC you mean to store extra data for each tracepoint that would
annotate the argument? as Alexei pointed out earlier it might be
too much, because we'd be fine with just adding suffix to annotated
arguments in __bpf_trace_##call:
__bpf_trace_##call(void *__data, proto) \
{ \
CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args)); \
}
with that verifier could easily get suffix information from BTF and
once gcc implements btf_type_tag we can easily switch to that
jirka
Powered by blists - more mailing lists