Message-ID: <CAADnVQ+ZuQS_RSFL8ThrDkZwSygX2Rx49LBAcMpiv3y4nnYunQ@mail.gmail.com>
Date: Wed, 5 Nov 2025 14:00:22 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Menglong Dong <menglong.dong@...ux.dev>, Menglong Dong <menglong8.dong@...il.com>,
Alexei Starovoitov <ast@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>,
Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
Matt Bobrowski <mattbobrowski@...gle.com>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Leon Hwang <leon.hwang@...ux.dev>, jiang.biao@...ux.dev,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-trace-kernel <linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 4/7] bpf,x86: add tracing session supporting
for x86_64
On Wed, Nov 5, 2025 at 9:30 AM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> On Tue, Nov 4, 2025 at 6:43 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Tue, Nov 4, 2025 at 4:40 PM Andrii Nakryiko
> > <andrii.nakryiko@...il.com> wrote:
> > >
> > > On Mon, Nov 3, 2025 at 3:29 AM Menglong Dong <menglong.dong@...ux.dev> wrote:
> > > >
> > > > On 2025/11/1 01:57, Alexei Starovoitov wrote:
> > > > > On Thu, Oct 30, 2025 at 8:36 PM Menglong Dong <menglong8.dong@...il.com> wrote:
> > > > > >
> > > > > > On Fri, Oct 31, 2025 at 9:42 AM Alexei Starovoitov
> > > > > > <alexei.starovoitov@...il.com> wrote:
> > > > > > >
> > > > > > > On Sat, Oct 25, 2025 at 8:02 PM Menglong Dong <menglong8.dong@...il.com> wrote:
> > > > > > > >
> > > > > > > > Add BPF_TRACE_SESSION supporting to x86_64. invoke_bpf_session_entry and
> > > > > > > > invoke_bpf_session_exit is introduced for this purpose.
> > > > > > > >
> > > > > > > > In invoke_bpf_session_entry(), we will check if the return value of the
> > > > > > > > fentry is 0, and set the corresponding session flag if not. And in
> > > > > > > > invoke_bpf_session_exit(), we will check if the corresponding flag is
> > > > > > > > set. If set, the fexit will be skipped.
> > > > > > > >
> > > > > > > > As designed, the session flags and session cookie address is stored after
> > > > > > > > the return value, and the stack look like this:
> > > > > > > >
> > > > > > > > cookie ptr -> 8 bytes
> > > > > > > > session flags -> 8 bytes
> > > > > > > > return value -> 8 bytes
> > > > > > > > argN -> 8 bytes
> > > > > > > > ...
> > > > > > > > arg1 -> 8 bytes
> > > > > > > > nr_args -> 8 bytes
> > >
> > > Let's look at "cookie ptr", "session flags", and "nr_args". We can
> > > combine all of them into a single 8 byte slot: assign each session
> > > program index 0, 1, ..., Nsession. 1 bit for entry/exit flag, few bits
> > > for session prog index, and few more bits for nr_args, and we still
> > > will have tons of space for some other additions in the future. From
> > > that session program index you can calculate cookieN address to return
> > > to user.
> > >
> > > And we should look whether moving nr_args into bpf_run_ctx would
> > > actually minimize amount of trampoline assembly code, as we can
> > > implement a bunch of stuff in pure C. (well, BPF verifier inlining is
> > > a separate thing, but it can be mostly arch-independent, right?)
> >
> > Instead of all that I have a different suggestion...
> >
> > how about we introduce this "session" attach type,
> > but won't mess with trampoline and whole new session->nr_links.
> > Instead the same prog can be added to 'fentry' list
> > and 'fexit' list.
> > We lose the ability to skip fexit, but I'm still not convinced
> > it's necessary.
> > The biggest benefit is that it will work for existing JITs and trampolines.
> > No new complex asm will be necessary.
> > As far as writable session_cookie ...
> > let's add another 8 byte space to bpf_tramp_run_ctx
> > and only allow single 'fsession' prog for a given kernel function.
> > Again to avoid changing all trampolines.
> > This way the feature can be implemented purely in C and no arch
> > specific changes.
> > It's more limited, but doesn't sound that the use case for multiple
> > fsession-s exist. All this is on/off tracing. Not something
> > that will be attached 24/7.
>
> I'd rather not have a feature at all, than have a feature that might
> or might not work depending on circumstances I don't control. If
> someone happens to be using fsession program on the same kernel
> function I happen to be tracing (e.g., with retsnoop), random failure
> to attach would be maddening to debug.

fentry won't conflict with fsession. I'm proposing to limit
fsession-s to 1 per kernel function. Due to stack usage there has
to be a limit anyway. I'd say 32 is really the max, which is 256 bytes
for cookies plus all the stack usage for args, nr_args, run_ctx, etc.
A total of under 512 bytes is ok.
So tooling would have to deal with the limit regardless.
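
To make the arithmetic above concrete, here is a rough sketch of the
stack budget. It is not code from the patch: the macro and function names
are hypothetical, the slot accounting only follows the layout quoted
earlier in the thread (cookies, session flags, return value, args,
nr_args), and the run_ctx size is passed in by the caller because it is
not stated here.

```c
#include <stdbool.h>

/* All names below are hypothetical, chosen for illustration only. */
#define SLOT_SIZE	8	/* every slot in the quoted layout is 8 bytes */
#define MAX_FSESSION	32	/* suggested upper bound on fsession progs */
#define STACK_BUDGET	512	/* "under 512 is ok" */

/* One 8-byte cookie per fsession program. */
static unsigned int cookie_bytes(unsigned int nr_sessions)
{
	return nr_sessions * SLOT_SIZE;
}

/*
 * Rough total: cookies, arg1..argN, nr_args, return value,
 * session flags, plus a caller-supplied run_ctx size (assumed).
 */
static unsigned int tramp_stack_bytes(unsigned int nr_sessions,
				      unsigned int nr_args,
				      unsigned int run_ctx_size)
{
	return cookie_bytes(nr_sessions)
	     + nr_args * SLOT_SIZE	/* arg1..argN */
	     + SLOT_SIZE		/* nr_args */
	     + SLOT_SIZE		/* return value */
	     + SLOT_SIZE		/* session flags */
	     + run_ctx_size;
}

/* True when the estimate stays strictly under the budget. */
static bool fits_stack_budget(unsigned int total)
{
	return total < STACK_BUDGET;
}
```

With 32 sessions, cookie_bytes() gives the 256 bytes mentioned above.
Adding 6 args and an assumed 64-byte run_ctx, tramp_stack_bytes() comes
to 392 bytes, which still fits under 512.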