netdev - Re: [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting for x86_64

Message-ID: <CAEf4BzbivvVtDWywzAQY8txk6tTOw__NEzrMU-wH52oYMBQPaw@mail.gmail.com>
Date: Fri, 19 Dec 2025 08:56:18 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Menglong Dong <menglong.dong@...ux.dev>
Cc: Menglong Dong <menglong8.dong@...il.com>, ast@...nel.org, andrii@...nel.org, 
	davem@...emloft.net, dsahern@...nel.org, daniel@...earbox.net, 
	martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org, 
	yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org, 
	sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, tglx@...utronix.de, 
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org, 
	hpa@...or.com, netdev@...r.kernel.org, bpf@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v4 6/9] bpf,x86: add tracing session supporting
 for x86_64

On Thu, Dec 18, 2025 at 5:42 PM Menglong Dong <menglong.dong@...ux.dev> wrote:
>
> On 2025/12/19 08:55 Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> > On Wed, Dec 17, 2025 at 1:55 AM Menglong Dong <menglong8.dong@...il.com> wrote:
> > >
> > > Add BPF_TRACE_SESSION support to x86_64, including:
> > >
> > > 1. clear the return value slot on the stack before fentry, so that the
> > >    fentry part of an fsession can only get 0 from bpf_get_func_ret(). If
> > >    we can restrict bpf_get_func_ret() to the
> > >    "bpf_fsession_is_return() == true" code path, we don't need to do
> > >    this anymore.
> >
> > What does bpf_get_func_ret() return today for fentry? zero or just
> > random garbage? If the latter, we can keep the same semantics for
> > fsession on entry. Ultimately, result of bpf_get_func_ret() is
> > meaningless outside of fexit/session-exit
>
> For fentry, bpf_get_func_ret() is not allowed to be called. For fsession,
> I think the best way is to allow calling bpf_get_func_ret() in the
> "bpf_fsession_is_return() == true" branch and prohibit it in the
> "bpf_fsession_is_return() == false" branch. However, we would need to
> track that condition in the verifier, which would complicate things. So
> I think we can allow bpf_get_func_ret() in fsession and make sure it
> always returns zero in the fsession-fentry for now.

yeah, that's fine. And the assembly complication is not that big, it's
just zeroing out a slot on the stack, right?

>
> Thanks!
> Menglong Dong
>
> >
> > >
> > > 2. clear all the session cookies' values on the stack. If we can make
> > >    sure in the verifier that a session cookie is only read after it has
> > >    been initialized, we don't need this anymore.
> > >
> > > 3. store the index of the cookie to ctx[-1] before calling the fsession
> > >
> > > 4. store the "is_return" flag to ctx[-1] before calling the fexit of
> > >    the fsession.
> > >
> > > Signed-off-by: Menglong Dong <dongml2@...natelecom.cn>
> > > Co-developed-by: Leon Hwang <leon.hwang@...ux.dev>
> > > Signed-off-by: Leon Hwang <leon.hwang@...ux.dev>
> > > ---
> > > v4:
> > > - some adjustments following the 1st patch, such as getting the fsession
> > >   prog from the fentry and fexit hlist
> > > - remove support for skipping fexit when fentry returns non-zero
> > >
> > > v2:
> > > - add session cookie support
> > > - add the session stuff after return value, instead of before nr_args
> > > ---
> > >  arch/x86/net/bpf_jit_comp.c | 36 +++++++++++++++++++++++++++++++-----
> > >  1 file changed, 31 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index 8cbeefb26192..99b0223374bd 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -3086,12 +3086,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > >  static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > >                       struct bpf_tramp_links *tl, int stack_size,
> > >                       int run_ctx_off, bool save_ret,
> > > -                     void *image, void *rw_image)
> > > +                     void *image, void *rw_image, u64 nr_regs)
> > >  {
> > >         int i;
> > >         u8 *prog = *pprog;
> > >
> > >         for (i = 0; i < tl->nr_links; i++) {
> > > +               if (tl->links[i]->link.prog->call_session_cookie) {
> > > +                       /* 'stack_size + 8' is the offset of nr_regs in stack */
> > > +                       emit_st_r0_imm64(&prog, nr_regs, stack_size + 8);
> > > +                       nr_regs -= (1 << BPF_TRAMP_M_COOKIE);
> >
> > you have to rename nr_regs to something more meaningful because it's
> > so weird to see some bit manipulations with *number of arguments*
> >
> > > +               }
> > >                 if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> > >                                     run_ctx_off, save_ret, image, rw_image))
> > >                         return -EINVAL;
> > > @@ -3208,8 +3213,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > >                                          struct bpf_tramp_links *tlinks,
> > >                                          void *func_addr)
> > >  {
> > > -       int i, ret, nr_regs = m->nr_args, stack_size = 0;
> > > -       int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off;
> > > +       int i, ret, nr_regs = m->nr_args, cookie_cnt, stack_size = 0;
> > > +       int regs_off, nregs_off, ip_off, run_ctx_off, arg_stack_off, rbx_off,
> > > +           cookie_off;
> >
> > if it doesn't fit on a single line, just `int cookie_off;` on a
> > separate line, why wrap the line?
> >
> > >         struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > >         struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > >         struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> >
> > [...]
> >
>
>
>
>