Message-ID: <CAEf4BzYR87Tpp4=UXBe=_A50kfZsAh1b8P__wmNU_4tKo5LqHA@mail.gmail.com>
Date: Thu, 22 Jan 2026 08:57:57 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Menglong Dong <menglong.dong@...ux.dev>
Cc: Menglong Dong <menglong8.dong@...il.com>, ast@...nel.org, andrii@...nel.org,
daniel@...earbox.net, martin.lau@...ux.dev, eddyz87@...il.com,
song@...nel.org, yonghong.song@...ux.dev, john.fastabend@...il.com,
kpsingh@...nel.org, sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org,
davem@...emloft.net, dsahern@...nel.org, tglx@...utronix.de, mingo@...hat.com,
jiang.biao@...ux.dev, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, bpf@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v10 07/12] bpf,x86: add fsession support for x86_64
On Wed, Jan 21, 2026 at 6:41 PM Menglong Dong <menglong.dong@...ux.dev> wrote:
>
> On 2026/1/22 08:22 Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> > On Wed, Jan 21, 2026 at 4:06 PM Andrii Nakryiko
> > <andrii.nakryiko@...il.com> wrote:
> > >
> > > On Thu, Jan 15, 2026 at 3:24 AM Menglong Dong <menglong8.dong@...il.com> wrote:
> > > >
> > > > Add BPF_TRACE_FSESSION support to x86_64, including:
> > > >
> > > > 1. clear the return value in the stack before fentry so that the fentry
> > > > of the fsession can only get 0 with bpf_get_func_ret().
> > > >
> > > > 2. clear all the session cookies' values in the stack.
> > > >
> > > > 3. store the index of the cookie to ctx[-1] before calling the fsession
> > > >
> > > > 4. store the "is_return" flag to ctx[-1] before calling the fexit of
> > > > the fsession.
> > > >
> > > > Signed-off-by: Menglong Dong <dongml2@...natelecom.cn>
> > > > Co-developed-by: Leon Hwang <leon.hwang@...ux.dev>
> > > > Signed-off-by: Leon Hwang <leon.hwang@...ux.dev>
> > > > ---
> > > > v10:
> > > > - use "|" for func_meta instead of "+"
> > > > - pass the "func_meta_off" to invoke_bpf() explicitly, instead of
> > > > computing it with "stack_size + 8"
> > > > - pass the "cookie_off" to invoke_bpf() instead of computing the current
> > > > cookie index with "func_meta"
> > > >
> > > > v5:
> > > > - add the variable "func_meta"
> > > > - define cookie_off in a new line
> > > >
> > > > v4:
> > > > - some adjustment to the 1st patch, such as we get the fsession prog from
> > > > fentry and fexit hlist
> > > > - remove the supporting of skipping fexit with fentry return non-zero
> > > >
> > > > v2:
> > > > - add session cookie support
> > > > - add the session stuff after return value, instead of before nr_args
> > > > ---
> > > > arch/x86/net/bpf_jit_comp.c | 52 ++++++++++++++++++++++++++++---------
> > > > 1 file changed, 40 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index 2f31331955b5..16720f2be16c 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -3094,13 +3094,19 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > > >
> > > > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > > struct bpf_tramp_links *tl, int stack_size,
> > > > - int run_ctx_off, bool save_ret,
> > > > - void *image, void *rw_image)
> > > > + int run_ctx_off, int func_meta_off, bool save_ret,
> > > > + void *image, void *rw_image, u64 func_meta,
> > > > + int cookie_off)
> > > > {
> > > > - int i;
> > > > + int i, cur_cookie = (cookie_off - stack_size) / 8;
> > >
> > > not sure why you went with passing cookie_off and then calculating,
> > > effectively, cookie count out of that?... why not pass cookie count
> > > directly then? it's minor, but just seems like a weird choice here,
> > > tbh
>
> Hi, Andrii. I think you misunderstood this. cur_cookie is not the
> same as the cookie count. The layout of the stack looks like this:
>
> return value -> 8 bytes
> argN -> 8 bytes
> ...
> arg1 -> 8 bytes
> nr_args -> 8 bytes
> ip (optional) -> 8 bytes
> cookie2 -> 8 bytes
> cookie1 -> 8 bytes
>
> So if bpf_get_func_ip() is not used, cur_cookie is exactly the same
> as the cookie count. But if it is used, they are not the same.
>
> The location of the cookies is independent of the context, and
> cur_cookie, which is the index of the current cookie, doesn't depend on
> the cookie count either and can be bigger than the cookie count.
>
> PS: the "ip" slot should always be laid out before nr_args, as we get
> it with ctx[-2]. Maybe we can optimize it later. We store the index of
> the ip in func_meta too, therefore it is independent of the ctx too.
> Ah, it doesn't seem to make much sense ;)
no, it makes sense, I missed that we have this optional ip stored,
which changes cookies offset. So let's leave everything as is.
A thing to consider for the future would be whether it would make
sense to have dedicated ip slot on the stack regardless of whether we
use bpf_get_func_ip() or not, just not fill it out if not necessary.
That would fix offsets and make things a bit simpler. But as I said,
just something to think about, no need to change the logic right now,
I think.
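
To make the layout point concrete, here is a rough userspace sketch of
the stack layout quoted above. It is a toy model, not the trampoline
code: the toy_* names are made up for illustration, and measuring the
cookie index from the nr_args slot is an assumption chosen to match the
description in this thread, which may differ from what invoke_bpf()
actually receives as stack_size.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: offsets are bytes below the frame base, 8 bytes per slot. */
struct toy_layout {
	int meta_off;   /* slot holding nr_args (assumed base for the index) */
	int ip_off;     /* ip slot; equals meta_off when no slot is reserved */
	int cookie_off; /* deepest cookie slot */
};

/* Hypothetical helper: walks the slots in the order listed in the thread. */
static struct toy_layout toy_layout_build(int nr_args, int nr_cookies,
					  bool reserve_ip)
{
	struct toy_layout l;
	int off = 0;

	assert(nr_args >= 0 && nr_cookies >= 0);

	off += 8;              /* return value */
	off += nr_args * 8;    /* argN ... arg1 */
	off += 8;              /* nr_args */
	l.meta_off = off;
	if (reserve_ip)
		off += 8;      /* optional ip slot */
	l.ip_off = off;
	off += nr_cookies * 8; /* cookie slots */
	l.cookie_off = off;
	return l;
}

/* Index of the deepest cookie, counted in 8-byte slots from nr_args. */
static int toy_first_cookie_index(const struct toy_layout *l)
{
	return (l->cookie_off - l->meta_off) / 8;
}
```

In this model, two cookies give an index of 2 without the ip slot and 3
with it. If the ip slot were always reserved, the index would be
nr_cookies + 1 in both cases, which is the "fixed offsets" idea.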
>
> > >
> >
> > consider also just calculating cookie count out from bpf_tramp_links?
> > would that work? Then "func_meta" would really be just nr_args (and
> > I'd call it that) and bool for whether this is entry or exit
> > invocation (for IS_RETURN bit, and maybe we'll need this distinction
> > somewhere else in the future), and then invoke_bpf() will construct
> > func_meta from scratch.
> >
> > It's relatively minor thing, but as I mentioned before, it's this
> > hybrid approach of partially opaque (from invoke_bpf's POV) func_meta
> > which we also adjust or fill out (for cookie index) is a bit of a sign
> > that this is not a proper interface.
>
> Yeah, the current approach is indeed not perfect. But I think it's
> a bit inflexible if we construct the whole func_meta in invoke_bpf().
> For now, we need to pass nr_args, is_return, cookie_off to it. And
> we would need to add more function arguments to invoke_bpf() if
> new flags appear in the future, which is not convenient, right?
>
I think at some point we should just collect all those different
arguments into a small on-the-stack struct and pass it as an
"invocation parameters" argument. That way we'll have properly named
arguments (struct fields) and we can easily have some defaults
skipped, if necessary. But again, just something for the future to
ponder. Just resubmit your patches as they are right now.
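
As a rough illustration of the "invocation parameters" idea, something
along these lines could work. This is only a userspace sketch: the
struct, its field names, and the TOY_SHIFT_* bit positions are
hypothetical and are not the kernel's actual definitions. It only
models how func_meta could be packed from named fields; the real
invoke_bpf() emits JIT code and updates the cookie index per link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, NOT the kernel's actual values. */
#define TOY_SHIFT_IS_RETURN 8
#define TOY_SHIFT_COOKIE    9

/* Hypothetical parameter bundle; unset fields default to zero/false. */
struct toy_invoke_params {
	int nr_args;     /* number of arguments, kept in the low bits */
	bool is_return;  /* true for the exit invocation */
	int cur_cookie;  /* index of the current session cookie */
	bool save_ret;   /* carried along, not part of func_meta */
};

/* Pack the named fields into one 64-bit func_meta value using "|". */
static uint64_t toy_build_func_meta(const struct toy_invoke_params *p)
{
	uint64_t meta;

	assert(p->nr_args >= 0 && p->nr_args < (1 << TOY_SHIFT_IS_RETURN));
	assert(p->cur_cookie >= 0);

	meta = (uint64_t)p->nr_args;
	meta |= (uint64_t)p->is_return << TOY_SHIFT_IS_RETURN;
	meta |= (uint64_t)p->cur_cookie << TOY_SHIFT_COOKIE;
	return meta;
}
```

A caller could then write
"struct toy_invoke_params p = { .nr_args = 3, .cur_cookie = 2 };" and
leave is_return and save_ret at their defaults, so a new flag becomes a
new field rather than another function argument.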
> So what do you think?
>
> Thanks!
> Menglong Dong
>
> >
> > >
> > >
> > > > u8 *prog = *pprog;
> > > >
> > > > for (i = 0; i < tl->nr_links; i++) {
> > > > + if (tl->links[i]->link.prog->call_session_cookie) {
> > > > + emit_store_stack_imm64(&prog, BPF_REG_0, -func_meta_off,
> > > > + func_meta | (cur_cookie << BPF_TRAMP_SHIFT_COOKIE));
> > > > + cur_cookie--;
> > > > + }
> > > > if (invoke_bpf_prog(m, &prog, tl->links[i], stack_size,
> > > > run_ctx_off, save_ret, image, rw_image))
> > > > return -EINVAL;
> > >
> > > [...]
> >
>
>
>
>