linux-kernel - Re: [PATCH bpf-next v3 4/7] bpf,x86: add tracing session supporting for x86

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAEf4BzZoyrSdV6SbcA7in=51==en2aoEcmvS6=vz-fixLS9M_A@mail.gmail.com>
Date: Thu, 6 Nov 2025 09:03:48 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Menglong Dong <menglong.dong@...ux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, Menglong Dong <menglong8.dong@...il.com>, 
	Alexei Starovoitov <ast@...nel.org>, Jiri Olsa <jolsa@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>, 
	Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>, 
	Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>, 
	Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, 
	Matt Bobrowski <mattbobrowski@...gle.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Leon Hwang <leon.hwang@...ux.dev>, jiang.biao@...ux.dev, 
	bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-trace-kernel <linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 4/7] bpf,x86: add tracing session supporting
 for x86_64

On Thu, Nov 6, 2025 at 4:15 AM Menglong Dong <menglong.dong@...ux.dev> wrote:
>
> On 2025/11/6 06:00, Alexei Starovoitov wrote:
> > On Wed, Nov 5, 2025 at 9:30 AM Andrii Nakryiko
> > <andrii.nakryiko@...il.com> wrote:
> > >
> > > On Tue, Nov 4, 2025 at 6:43 PM Alexei Starovoitov
> > > <alexei.starovoitov@...il.com> wrote:
> > > >
> > > > On Tue, Nov 4, 2025 at 4:40 PM Andrii Nakryiko
> > > > <andrii.nakryiko@...il.com> wrote:
> > > > >
> [......]
> > > >
> > > > Instead of all that I have a different suggestion...
> > > >
> > > > how about we introduce this "session" attach type,
> > > > but won't mess with trampoline and whole new session->nr_links.
> > > > Instead the same prog can be added to 'fentry' list
> > > > and 'fexit' list.
> > > > We lose the ability to skip fexit, but I'm still not convinced
> > > > it's necessary.
> > > > The biggest benefit is that it will work for existing JITs and trampolines.
> > > > No new complex asm will be necessary.
> > > > As far as writable session_cookie ...
> > > > let's add another 8 byte space to bpf_tramp_run_ctx
> > > > and only allow single 'fsession' prog for a given kernel function.
> > > > Again to avoid changing all trampolines.
> > > > This way the feature can be implemented purely in C and no arch
> > > > specific changes.
> > > > It's more limited, but doesn't sound that the use case for multiple
> > > > fsession-s exist. All this is on/off tracing. Not something
> > > > that will be attached 24/7.
> > >
> > > I'd rather not have a feature at all, than have a feature that might
> > > or might not work depending on circumstances I don't control. If
> > > someone happens to be using fsession program on the same kernel
> > > function I happen to be tracing (e.g., with retsnoop), random failure
> > > to attach would be maddening to debug.
> >
> > fentry won't conflict with fsession. I'm proposing
> > the limit of fsession-s to 1. Due to stack usage there gotta be
>
> I think Andrii means that the problem is the limiting the fsession to
> 1, which can make we attach fail if someone else has already attach
> it.
>

Yeah, I wasn't worried about fsession vs fentry interactions. I was
(still am) worried that whenever I'll be deciding to use fsession I'd
have to think about (and implement) fallback plan in case fsession
attachment fails because (presumably) someone else already used
fsession on the function. Just too much hassle to bother with fsession
at that point.

> If we want to limit the stack usage, I think what we should limit is
> the count of the fsession progs that use session cookie, rather the
> count of the fsessions.
>
> I understand your idea that add the prog to both fentry and fexit list
> instead of introducing a BPF_TRAMP_SESSION in the progs_hlist.
> However, we still have to modify the arch stuff, as we need to store the
> "bpf_fsession_return". What's more, it's a little complex to add a prog
> to both fentry and fexit list, as bpf_tramp_link doesn't support such
> operation.
>
> So I think it's more clear to keep current logic. As Andrii suggested,
> we can reuse the nr_args, and no more room will be used on the
> stack, except the session cookies.
>
> As for the session cookies, how about we limit its number? For example,
> only 4 session cookies are allowed to be used on a target, which
> I think should be enough.

I think Alexei's concern is not so much stack space usage
(realistically you will almost never have more that 1-2-3
fsessions+fentry+fexit programs per trampoline), his concern is code
complexity in arch-specific trampoline generation code. And I share
the concern. But if we do support 4 session cookies, we might as well
just not put any artificial limits because code complexity-wise this
won't change anything.

So I think we should decide whether we add fsession with session
cookie as a concept, and if yes, then not add unnecessary
restrictions, as at that point the code complexity won't really change
much.

>
> I can remove the "fexit skipping" part to make the trampoline simpler
> if you prefer, which I'm OK, considering the optimization I'm making
> to the origin call in x86_64.
>
> Therefore, the logic won't be complex, just reserve the room for the
> session cookies and the call to invoke_bpf().
>
> Thanks!
> Menglong Dong
>
> > a limit anyway. I say, 32 is really the max. which is 256 bytes
> > for cookies plus all the stack usage for args, nr_args, run_ctx, etc.
> > Total of under 512 is ok.
> > So tooling would have to deal with the limit regardless.
> >
>
>
>
>