Message-ID: <CAADnVQ+FANha0fO_BF+iHJ4iZSCPtDfoUkzR8mMFwOakw8+eCg@mail.gmail.com>
Date: Wed, 30 Apr 2025 09:53:15 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Leon Hwang <leon.hwang@...ux.dev>
Cc: Kafai Wan <mannkafai@...il.com>, Song Liu <song@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>,
Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
Matt Bobrowski <mattbobrowski@...gle.com>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Mykola Lysenko <mykolal@...com>, Shuah Khan <shuah@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>,
linux-trace-kernel <linux-trace-kernel@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK" <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] bpf: Allow get_func_[arg|arg_cnt] helpers in
 raw tracepoint programs

On Wed, Apr 30, 2025 at 8:55 AM Leon Hwang <leon.hwang@...ux.dev> wrote:
>
>
>
> On 2025/4/30 20:43, Kafai Wan wrote:
> > On Wed, Apr 30, 2025 at 10:46 AM Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> >>
> >> On Sat, Apr 26, 2025 at 9:00 AM KaFai Wan <mannkafai@...il.com> wrote:
> >>>
>
> [...]
>
> >>> @@ -2312,7 +2322,7 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> >>> #define REPEAT(X, FN, DL, ...) REPEAT_##X(FN, DL, __VA_ARGS__)
> >>>
> >>> #define SARG(X) u64 arg##X
> >>> -#define COPY(X) args[X] = arg##X
> >>> +#define COPY(X) args[X + 1] = arg##X
> >>>
> >>> #define __DL_COM (,)
> >>> #define __DL_SEM (;)
> >>> @@ -2323,9 +2333,10 @@ void __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
> >>> void bpf_trace_run##x(struct bpf_raw_tp_link *link, \
> >>> REPEAT(x, SARG, __DL_COM, __SEQ_0_11)) \
> >>> { \
> >>> - u64 args[x]; \
> >>> + u64 args[x + 1]; \
> >>> + args[0] = x; \
> >>> REPEAT(x, COPY, __DL_SEM, __SEQ_0_11); \
> >>> - __bpf_trace_run(link, args); \
> >>> + __bpf_trace_run(link, args + 1); \
> >>
> >> This is neat, but what is this for?
> >> The program that attaches to a particular raw_tp knows what it is
> >> attaching to and how many arguments are there,
> >> so bpf_get_func_arg_cnt() is a 5th wheel.
> >>
> >> If the reason is "for completeness" then it's not a good reason
> >> to penalize performance. Though it's just an extra 8 byte of stack
> >> and a single store of a constant.
> >>
> > If we try to capture all arguments of a specific raw_tp in tracing programs,
> > We first obtain the arguments count from the format file in debugfs or BTF
> > and pass this count to the BPF program via .bss section or cookie (if
> > available).
> >
> > If we store the count in ctx and get it via get_func_arg_cnt helper in
> > the BPF program,
> > a) It's easier and more efficient to get the arguments count in the BPF program.
> > b) It could use a single BPF program to capture arguments for multiple raw_tps,
> > reduce the number of BPF programs when massive tracing.
> >
>
>
> bpf_get_func_arg() will be very helpful for bpfsnoop[1] when tracing tp_btf.
>
> In bpfsnoop, it can generate a small snippet of bpf instructions to use
> bpf_get_func_arg() for retrieving and filtering arguments. For example,
> with the netif_receive_skb tracepoint, bpfsnoop can use
> bpf_get_func_arg() to filter the skb argument using pcap-filter(7)[2] or
> a custom attribute-based filter. This will allow bpfsnoop to trace
> multiple tracepoints using a single bpf program code.

I doubt you thought it through end to end.
When tracepoint prog attaches we have this check:

        /*
         * check that program doesn't access arguments beyond what's
         * available in this tracepoint
         */
        if (prog->aux->max_ctx_offset > btp->num_args * sizeof(u64))
                return -EINVAL;

So you cannot have a single bpf prog attached to many tracepoints
to read many arguments as-is.

You can hack around that limit with probe_read,
but the values won't be trusted and you won't be able to pass
such untrusted pointers into skb and other helpers/kfuncs.
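
To make the attach-time limit concrete, here is a minimal userspace sketch of the check quoted above. It is not kernel code: struct model_prog, struct model_raw_tp, model_ctx_end_offset() and model_attach_check() are hypothetical stand-ins that carry only the two fields the check reads. It also assumes that max_ctx_offset records the end (offset plus size) of the highest ctx access the program makes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* The quoted bpf_trace_run macros expand over __SEQ_0_11, i.e. at most 12 args. */
#define MODEL_MAX_TP_ARGS 12

/* Hypothetical stand-ins: only the fields used by the attach check. */
struct model_prog {
	uint32_t max_ctx_offset;	/* assumed: end offset of the highest ctx access */
};

struct model_raw_tp {
	uint32_t num_args;		/* number of u64 arguments the tracepoint passes */
};

/* End offset of a u64 read of ctx[arg_index], under the assumption above. */
static uint32_t model_ctx_end_offset(uint32_t arg_index)
{
	assert(arg_index < MODEL_MAX_TP_ARGS);
	return (arg_index + 1) * (uint32_t)sizeof(uint64_t);
}

/* Mirrors the quoted condition: reject programs that read past the last arg. */
static int model_attach_check(const struct model_prog *prog,
			      const struct model_raw_tp *btp)
{
	if (prog->max_ctx_offset > btp->num_args * sizeof(uint64_t))
		return -EINVAL;
	return 0;
}
```

Under these assumptions, a program that reads ctx[2] has max_ctx_offset set to model_ctx_end_offset(2), so model_attach_check() returns -EINVAL for a tracepoint with num_args = 2 and 0 for one with num_args = 3. This is why one program that reads the maximum number of arguments cannot attach to tracepoints with fewer arguments.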