Message-ID: <CADxym3ZGco3_V7w8+ZrJwnPd6nx=YKwYASWcUFOFyLe7L5oa_w@mail.gmail.com>
Date: Tue, 15 Jul 2025 10:37:52 +0800
From: Menglong Dong <menglong8.dong@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Steven Rostedt <rostedt@...dmis.org>, Jiri Olsa <jolsa@...nel.org>, bpf <bpf@...r.kernel.org>,
Menglong Dong <dongml2@...natelecom.cn>, Martin KaFai Lau <martin.lau@...ux.dev>,
Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 01/18] bpf: add function hash table for tracing-multi
On Tue, Jul 15, 2025 at 9:55 AM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Thu, Jul 3, 2025 at 5:17 AM Menglong Dong <menglong8.dong@...il.com> wrote:
> >
> > We don't use rhashtable here, as the compiler is not clever enough and it
> > refused to inline the hash lookup for me, which brings in additional
> > overhead in the following BPF global trampoline.
>
> That's not good enough justification.
> rhashtable is used in many performance critical components.
> You need to figure out what was causing compiler not to inline lookup
> in your case.
> Did you make sure that params are constant as I suggested earlier?
> If 'static inline' wasn't enough, have you tried always_inline ?

Yeah, I'm sure all the params are constant. always_inline works, but I have
to replace "inline" with "__always_inline" for rhashtable_lookup_fast,
rhashtable_lookup, __rhashtable_lookup, rht_key_get_hash, etc. After that,
everything is inlined.
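
Roughly the kind of change needed, as a sketch against
include/linux/rhashtable.h (exact signatures may differ between kernel
versions):

-static inline void *rhashtable_lookup_fast(
+static __always_inline void *rhashtable_lookup_fast(
 	struct rhashtable *ht, const void *key,
 	const struct rhashtable_params params)

and the same annotation on rhashtable_lookup, __rhashtable_lookup and
rht_key_get_hash, so that the whole lookup path collapses into the caller.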

In fact, I don't think rhashtable is good enough for our case, which has
high performance requirements. With rhashtable, the hash lookup takes 35
insns; with the hash table introduced here, it takes only 17 insns, which
means rhashtable adds ~5% overhead. BTW, the function padding based
metadata needs only 5 insns, which shaves off another ~5% of overhead.
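
For reference, the dedicated lookup is essentially the following. This is
a simplified sketch; the struct layout and field names are illustrative,
not necessarily what the patch uses:

	struct kfunc_md {
		struct hlist_node hash_node;
		unsigned long func;	/* traced function address */
		/* ... bpf progs, flags, percpu_ref, etc. ... */
	};

	struct kfunc_md_array {
		u32 hash_bits;
		struct hlist_head heads[];
	};

	/* kfunc_mds is the RCU-protected global pointer to the current array */
	static __always_inline struct kfunc_md *kfunc_md_find(unsigned long ip)
	{
		struct kfunc_md_array *mds = rcu_dereference(kfunc_mds);
		struct kfunc_md *md;
		u32 hash;

		hash = hash_64(ip, mds->hash_bits);
		hlist_for_each_entry_rcu(md, &mds->heads[hash], hash_node)
			if (md->func == ip)
				return md;

		return NULL;
	}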
>
> > The release of the metadata is controlled by the percpu ref and RCU
> > together, and have similar logic to the release of bpf trampoline image in
> > bpf_tramp_image_put().
>
> tbh the locking complexity in this patch is through the roof.
> rcu, rcu_tasks, rcu_task_trace, percpu_ref, ...
> all that look questionable.
> kfunc_mds looks to be rcu protected, but md-s are percpu_ref.
> Why? There were choices made that I don't understand the reasons for.
> I don't think we should start in depth review of rhashtable-wanne-be
> when rhashtable should just work.

In fact, all this locking is not for the mds, but for the md. The mds is
protected by RCU only; what is complex is the md. So the logic for the
hash table that we introduced is quite simple: we allocate a new struct
kfunc_md_array, copy the old one into it, assign it to kfunc_mds with
rcu_assign_pointer(), and free the old one. That's all.
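
In pseudo-C, the replace path is roughly the following. This is only a
sketch: kfunc_md_lock, new_array_size() and copy_mds() are placeholders
for whatever the patch actually uses, and the real code may free the old
array via call_rcu() instead of synchronize_rcu():

	static int kfunc_md_array_replace(void)
	{
		struct kfunc_md_array *new_mds, *old_mds;

		/* writers are serialized, so plain dereference under the lock */
		old_mds = rcu_dereference_protected(kfunc_mds,
						    lockdep_is_held(&kfunc_md_lock));
		new_mds = kzalloc(new_array_size(old_mds), GFP_KERNEL);
		if (!new_mds)
			return -ENOMEM;

		/* copy the old entries, then publish the new array */
		copy_mds(new_mds, old_mds);
		rcu_assign_pointer(kfunc_mds, new_mds);

		/* wait for readers that may still see the old array, then free it */
		synchronize_rcu();
		kfree(old_mds);
		return 0;
	}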

We need to protect the md the same way we protect the trampoline image,
as it is used in the global trampoline from beginning to end. The
rcu_tasks, rcu_task_trace and percpu_ref are used for that purpose. It's
complex, but it is really the same as what we do in bpf_tramp_image_put().
You wrote that part, so I think you know what I mean :/

For the fexit/modify_return case, percpu_ref will be used in the global
trampoline to protect the md, just like what we do with __bpf_tramp_enter().
When releasing, we kill the percpu_ref first, and then we use
rcu_task_trace to make sure the rest of the bpf global trampoline has
finished.
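
So the release path looks roughly like this, mirroring
bpf_tramp_image_put(); the field and callback names below are
placeholders, not the ones in the patch:

	/* final free, after all users are known to be gone */
	static void kfunc_md_free_rcu(struct rcu_head *rcu)
	{
		kfree(container_of(rcu, struct kfunc_md, rcu));
	}

	/* called once the percpu_ref drops to zero */
	static void kfunc_md_pcref_release(struct percpu_ref *pcref)
	{
		struct kfunc_md *md = container_of(pcref, struct kfunc_md, pcref);

		/* wait for sleepable users still inside the global trampoline */
		call_rcu_tasks_trace(&md->rcu, kfunc_md_free_rcu);
	}

	static void kfunc_md_put(struct kfunc_md *md)
	{
		/* no new trampoline entries can take a reference after this */
		percpu_ref_kill(&md->pcref);
	}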

Maybe I should move this part (the release of the md) into the next patch
(the bpf global trampoline) to make it easier to understand?
Thanks!
Menglong Dong