lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4Bzas7Or4yPzqdHqEcgVpTDx2j26dR5oRnSg7bepr-uDqHw@mail.gmail.com>
Date: Tue, 4 Nov 2025 16:40:43 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Menglong Dong <menglong.dong@...ux.dev>
Cc: Menglong Dong <menglong8.dong@...il.com>, 
	Alexei Starovoitov <alexei.starovoitov@...il.com>, Alexei Starovoitov <ast@...nel.org>, 
	Jiri Olsa <jolsa@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
	Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>, 
	Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, 
	Matt Bobrowski <mattbobrowski@...gle.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Masami Hiramatsu <mhiramat@...nel.org>, Leon Hwang <leon.hwang@...ux.dev>, jiang.biao@...ux.dev, 
	bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-trace-kernel <linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 4/7] bpf,x86: add tracing session supporting
 for x86_64

On Mon, Nov 3, 2025 at 3:29 AM Menglong Dong <menglong.dong@...ux.dev> wrote:
>
> On 2025/11/1 01:57, Alexei Starovoitov wrote:
> > On Thu, Oct 30, 2025 at 8:36 PM Menglong Dong <menglong8.dong@...il.com> wrote:
> > >
> > > On Fri, Oct 31, 2025 at 9:42 AM Alexei Starovoitov
> > > <alexei.starovoitov@...il.com> wrote:
> > > >
> > > > On Sat, Oct 25, 2025 at 8:02 PM Menglong Dong <menglong8.dong@...il.com> wrote:
> > > > >
> > > > > Add BPF_TRACE_SESSION supporting to x86_64. invoke_bpf_session_entry and
> > > > > invoke_bpf_session_exit is introduced for this purpose.
> > > > >
> > > > > In invoke_bpf_session_entry(), we will check if the return value of the
> > > > > fentry is 0, and set the corresponding session flag if not. And in
> > > > > invoke_bpf_session_exit(), we will check if the corresponding flag is
> > > > > set. If set, the fexit will be skipped.
> > > > >
> > > > > As designed, the session flags and session cookie address is stored after
> > > > > the return value, and the stack look like this:
> > > > >
> > > > >   cookie ptr    -> 8 bytes
> > > > >   session flags -> 8 bytes
> > > > >   return value  -> 8 bytes
> > > > >   argN          -> 8 bytes
> > > > >   ...
> > > > >   arg1          -> 8 bytes
> > > > >   nr_args       -> 8 bytes

Let's look at "cookie ptr", "session flags", and "nr_args". We can
combine all of them into a single 8 byte slot: assign each session
program index 0, 1, ..., Nsession. 1 bit for entry/exit flag, few bits
for session prog index, and few more bits for nr_args, and we still
will have tons of space for some other additions in the future. From
that session program index you can calculate cookieN address to return
to user.

And we should look whether moving nr_args into bpf_run_ctx would
actually minimize amount of trampoline assembly code, as we can
implement a bunch of stuff in pure C. (well, BPF verifier inlining is
a separate thing, but it can be mostly arch-independent, right?)

> > > > >   ...
> > > > >   cookieN       -> 8 bytes
> > > > >   cookie1       -> 8 bytes
> > > > >
> > > > > In the entry of the session, we will clear the return value, so the fentry
> > > > > will always get 0 with ctx[nr_args] or bpf_get_func_ret().
> > > > >
> > > > > Before the execution of the BPF prog, the "cookie ptr" will be filled with
> > > > > the corresponding cookie address, which is done in
> > > > > invoke_bpf_session_entry() and invoke_bpf_session_exit().
> > > >

[...]

> I have to maintain a hash table "m_context" to check if
> the exit of skb_clone() is what I want. It works, but it has
> extra overhead in the hash table lookup and deleting. What's
> more, it's not stable on some complex case.
>
> The problem is that we don't has a way to pair the
> fentry/fexit on the stack(do we?).
>
> >
> > > AFAIT, the mast usage of session cookie in kprobe is passing the
> > > function arguments to the exit. For tracing, we can get the args
> > > in the fexit. So the session cookie in tracing is not as important as
> > > in kprobe.
> >
> > Since kprobe_multi was introduced, retsnoop and tetragon adopted
> > it to do mass attach, and both use bpf_get_attach_cookie().
> > While both don't use bpf_session_cookie().
> > Searching things around I also didn't find a single real user
> > of bpf_session_cookie() other than selftests/bpf and Jiri's slides :)
> >
> > So, doing all this work in trampoline for bpf_session_cookie()
> > doesn't seem warranted, but with that doing session in trampoline
> > also doesn't look useful, since the only benefit vs a pair
> > of fentry/fexit is skip of fexit, which can be done already.
> > Plus complexity in all JITs... so, I say, let's shelve the whole thing for now.
>
> Yeah, the session cookie is not useful in my case too, I'm OK to
> skip it.

hmm, in my view session cookie is one of the main reasons why
"session" concept is useful :) that's certainly the case for
multi-kprobe session, IMO (and yes, I haven't yet used it in retsnoop,
but it's just because I'm lazy, not because it's not useful... )

>
> The pair of fentry/fexit have some use cases(my nettrace and
> Leon's bpfsnoop, at least). Not sure if the reason above is sufficient,
> please ignore the message if it is not :)
>
> The way we implement the trampoline makes it arch related and
> complex. I tried to rewrite it with C code, but failed. It's too difficult
> and I suspect it's impossible :/
>
> Thanks!
> Menglong Dong
>
> >
> >
>
>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ