Message-ID: <2187165.bB369e8A3T@7940hx>
Date: Wed, 14 Jan 2026 11:27:20 +0800
From: Menglong Dong <menglong.dong@...ux.dev>
To: Menglong Dong <menglong8.dong@...il.com>,
Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: ast@...nel.org, andrii@...nel.org, daniel@...earbox.net,
martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org,
yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, davem@...emloft.net,
dsahern@...nel.org, tglx@...utronix.de, mingo@...hat.com,
jiang.biao@...ux.dev, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, bpf@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v9 07/11] bpf,x86: add fsession support for x86_64
On 2026/1/14 09:25 Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> On Sat, Jan 10, 2026 at 6:12 AM Menglong Dong <menglong8.dong@...il.com> wrote:
> >
> > Add BPF_TRACE_FSESSION supporting to x86_64, including:
[...]
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index d94f7038c441..0671a434c00d 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -3094,12 +3094,17 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > struct bpf_tramp_links *tl, int stack_size,
> > int run_ctx_off, bool save_ret,
> > - void *image, void *rw_image)
> > + void *image, void *rw_image, u64 func_meta)
> > {
> > int i;
> > u8 *prog = *pprog;
> >
> > for (i = 0; i < tl->nr_links; i++) {
> > + if (tl->links[i]->link.prog->call_session_cookie) {
> > + /* 'stack_size + 8' is the offset of func_md in stack */
>
> not func_md, don't invent new names, "func_meta" (but it's also so
Ah, it should be func_meta here; that's a typo.
> backwards that you have stack offsets as positive... and it's not even
> in verifier's stack slots, just bytes... very confusing to me)
Do you mean the offset passed to emit_store_stack_imm64()? I'll convert it
to a negative offset after modifying emit_store_stack_imm64() as you suggested.
>
> > + emit_store_stack_imm64(&prog, stack_size + 8, func_meta);
> > + func_meta -= (1 << BPF_TRAMP_M_COOKIE);
>
> was this supposed to be BPF_TRAMP_M_IS_RETURN?... and why didn't AI catch this?
It should be BPF_TRAMP_M_COOKIE here. The subtraction decrements the
cookie offset stored in func_meta, which gives the offset of the session
cookie for the next bpf program.
This part corresponds to the 5th patch, and it is clearer when read
together with that patch. It does seem a little confusing
here :/
Maybe a comment is needed here.
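To make the arithmetic concrete, here is a minimal sketch in plain C of how
the cookie offset can be stepped from one program to the next. The shift
values and the helper names (pack_func_meta, cookie_slot, next_prog_meta) are
hypothetical placeholders for illustration, not the definitions from the
series; the only thing taken from this thread is that nr_args sits in the
least significant byte.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this illustration. */
#define M_IS_RETURN 8
#define M_COOKIE    9

/* Pack the argument count (low byte) and the cookie slot index. */
static uint64_t pack_func_meta(uint64_t nr_args, uint64_t slot)
{
	return nr_args + (slot << M_COOKIE);
}

/* Extract the cookie slot index stored above the M_COOKIE shift. */
static uint64_t cookie_slot(uint64_t func_meta)
{
	return func_meta >> M_COOKIE;
}

/*
 * Subtracting one unit at the M_COOKIE shift moves the slot index
 * down by one and leaves the lower bits (nr_args, is_return) alone,
 * as long as the slot index is still greater than zero.
 */
static uint64_t next_prog_meta(uint64_t func_meta)
{
	return func_meta - (1ULL << M_COOKIE);
}
```

With these assumed shifts, each program that uses a session cookie sees its
own slot index, and the following program sees that index minus one.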
>
> > + }
> > if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> > run_ctx_off, save_ret, image, rw_image))
> > return -EINVAL;
> > @@ -3222,7 +3227,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > void *orig_call = func_addr;
> > + int cookie_off, cookie_cnt;
> > u8 **branches = NULL;
> > + u64 func_meta;
> > u8 *prog;
> > bool save_ret;
> >
> > @@ -3290,6 +3297,11 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> >
> > ip_off = stack_size;
> >
> > + cookie_cnt = bpf_fsession_cookie_cnt(tlinks);
> > + /* room for session cookies */
> > + stack_size += cookie_cnt * 8;
> > + cookie_off = stack_size;
> > +
> > stack_size += 8;
> > rbx_off = stack_size;
> >
> > @@ -3383,9 +3395,19 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > }
> > }
> >
> > + if (bpf_fsession_cnt(tlinks)) {
> > + /* clear all the session cookies' value */
> > + for (int i = 0; i < cookie_cnt; i++)
> > + emit_store_stack_imm64(&prog, cookie_off - 8 * i, 0);
> > + /* clear the return value to make sure fentry always get 0 */
> > + emit_store_stack_imm64(&prog, 8, 0);
> > + }
> > + func_meta = nr_regs + (((cookie_off - regs_off) / 8) << BPF_TRAMP_M_COOKIE);
>
> func_meta conceptually is a collection of bit fields, so using +/-
> feels weird, use | and &, more in line with working with bits?
It's not only bit fields. nr_args and the cookie offset are byte
fields, and the cookie offset in particular is also updated with arithmetic
operations. So I think using +/- makes sense here, right?
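As a rough comparison of the two styles, the sketch below sets a flag and
steps a multi-bit field both ways. The shift values and function names are
hypothetical, not taken from the series. The two forms agree while the flag
bit is still clear and the field is non-zero; they differ if the flag is set
twice, because the addition carries into the neighbouring field.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define F_IS_RETURN 8
#define F_COOKIE    9

/* Arithmetic form: correct only while the flag bit is still clear. */
static uint64_t set_is_return_add(uint64_t meta)
{
	return meta + (1ULL << F_IS_RETURN);
}

/* Bitwise form: idempotent, never touches other fields. */
static uint64_t set_is_return_or(uint64_t meta)
{
	return meta | (1ULL << F_IS_RETURN);
}

/* Arithmetic decrement of the field stored above F_COOKIE. */
static uint64_t dec_cookie_sub(uint64_t meta)
{
	return meta - (1ULL << F_COOKIE);
}

/* The same decrement written with masks and shifts. */
static uint64_t dec_cookie_bitwise(uint64_t meta)
{
	uint64_t low = meta & ((1ULL << F_COOKIE) - 1);
	uint64_t cookie = meta >> F_COOKIE;

	return low | ((cookie - 1) << F_COOKIE);
}
```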
>
> (also you defined that BPF_TRAMP_M_NR_ARGS but you are not using it
> consistently...)
I'm not sure whether we should define it. Since we use the least significant
byte for nr_args, its shift is always 0. If we used the macro in the inline, an
unnecessary bit-shift instruction would be generated.
I defined it here only for readability. Maybe we can add a comment
to the inline of bpf_get_func_arg() instead of defining such an unused
macro?
Thanks!
Menglong Dong
>
>
>
>
> > +
> > if (fentry->nr_links) {
> > if (invoke_bpf(m, &prog, fentry, regs_off, run_ctx_off,
> > - flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image))
> > + flags & BPF_TRAMP_F_RET_FENTRY_RET, image, rw_image,
> > + func_meta))
> > return -EINVAL;
> > }
> >
> > @@ -3445,9 +3467,14 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
> > }
> > }
> >
> > + /* set the "is_return" flag for fsession */
> > + func_meta += (1 << BPF_TRAMP_M_IS_RETURN);
> > + if (bpf_fsession_cnt(tlinks))
> > + emit_store_stack_imm64(&prog, nregs_off, func_meta);
> > +
> > if (fexit->nr_links) {
> > if (invoke_bpf(m, &prog, fexit, regs_off, run_ctx_off,
> > - false, image, rw_image)) {
> > + false, image, rw_image, func_meta)) {
> > ret = -EINVAL;
> > goto cleanup;
> > }
> > --
> > 2.52.0
> >
>