Message-ID: <20240328111330.194dcbe5@gandalf.local.home>
Date: Thu, 28 Mar 2024 11:13:30 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: 梦龙董 <dongmenglong.8@...edance.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, Jiri Olsa
<jolsa@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, Alexei Starovoitov
<ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, Martin KaFai Lau
<martin.lau@...ux.dev>, Eddy Z <eddyz87@...il.com>, Song Liu
<song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>, John Fastabend
<john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>, Stanislav
Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>, Alexander Gordeev
<agordeev@...ux.ibm.com>, Christian Borntraeger
<borntraeger@...ux.ibm.com>, Sven Schnelle <svens@...ux.ibm.com>, "David S.
Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>, Dave
Hansen <dave.hansen@...ux.intel.com>, X86 ML <x86@...nel.org>, Mathieu
Desnoyers <mathieu.desnoyers@...icios.com>, Quentin Monnet
<quentin@...valent.com>, bpf <bpf@...r.kernel.org>, linux-arm-kernel
<linux-arm-kernel@...ts.infradead.org>, LKML
<linux-kernel@...r.kernel.org>, linux-riscv
<linux-riscv@...ts.infradead.org>, linux-s390 <linux-s390@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
linux-trace-kernel@...r.kernel.org, "open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>, linux-stm32@...md-mailman.stormreply.com
Subject: Re: [External] Re: [PATCH bpf-next v2 1/9] bpf: tracing: add
support to record and check the accessed args
On Thu, 28 Mar 2024 22:43:46 +0800
梦龙董 <dongmenglong.8@...edance.com> wrote:
> I ran a simple benchmark that creates 1000 trampolines.
> It is slow, quite slow: it takes up to 60 seconds.
> We can't do it this way.
>
> Now, I have a bad idea. How about we introduce
> a "dynamic trampoline"? The basic logic of it can be:
>
> """
> save regs
> bpfs = trampoline_lookup_ip(ip)
> fentry = bpfs->fentries
> while fentry:
>     fentry(ctx)
>     fentry = fentry->next
>
> call origin
> save return value
>
> fexit = bpfs->fexits
> while fexit:
>     fexit(ctx)
>     fexit = fexit->next
>
> xxxxxx
> """
>
> We look up "bpfs" by the function ip in a hash map
> inside trampoline_lookup_ip. The type of "bpfs" is:
>
> struct bpf_array {
>         struct bpf_prog *fentries;
>         struct bpf_prog *fexits;
>         struct bpf_prog *modify_returns;
> };
>
> When we need to attach the bpf progA to functions A/B/C,
> we only need to create bpf_arrayA, bpf_arrayB and bpf_arrayC,
> add progA to each of them, insert them into the hash map
> "direct_call_bpfs", and attach the "dynamic trampoline" to
> A/B/C. If bpf_arrayA already exists, we just add progA to the
> tail of bpf_arrayA->fentries. When we need to attach progB to
> B/C, we just add progB to bpf_arrayB->fentries and
> bpf_arrayC->fentries.
>
> Compared to the trampoline, extra overhead is introduced
> by the hash lookup.
>
> I have not begun to code yet, and I am not sure the overhead is
> acceptable. Considering that we also need to do hash lookup
> by the function in kprobe_multi, maybe the overhead is
> acceptable?
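The "dynamic trampoline" proposed in the quoted message can be sketched in
userspace C. This is only an illustration under stated assumptions, not kernel
code and not the author's implementation: `struct prog`, `struct prog_lists`,
the bucket count, the hash function, and the `demo` helpers are all
hypothetical names invented here. Only `trampoline_lookup_ip` and the
fentries/fexits lists come from the proposal itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a BPF program: a callback in a linked list. */
struct prog {
    void (*run)(void *ctx);
    struct prog *next;
};

/* Per-function program lists, loosely mirroring the proposed struct. */
struct prog_lists {
    uintptr_t ip;              /* traced function address, used as hash key */
    struct prog *fentries;
    struct prog *fexits;
    struct prog_lists *chain;  /* next entry in the same hash bucket */
};

#define NBUCKETS 64            /* assumption: arbitrary small table size */
static struct prog_lists *buckets[NBUCKETS];

static size_t hash_ip(uintptr_t ip)
{
    return (size_t)(ip >> 4) % NBUCKETS;
}

static void lists_insert(struct prog_lists *pl)
{
    size_t b = hash_ip(pl->ip);

    pl->chain = buckets[b];
    buckets[b] = pl;
}

/* The extra cost relative to a per-function trampoline: one hash lookup. */
static struct prog_lists *trampoline_lookup_ip(uintptr_t ip)
{
    for (struct prog_lists *pl = buckets[hash_ip(ip)]; pl; pl = pl->chain)
        if (pl->ip == ip)
            return pl;
    return NULL;
}

/* Append at the tail so programs run in attach order. */
static void prog_append(struct prog **head, struct prog *p)
{
    p->next = NULL;
    while (*head)
        head = &(*head)->next;
    *head = p;
}

/* One shared trampoline: look up the lists, run fentries, origin, fexits. */
static long dynamic_trampoline(uintptr_t ip, long (*origin)(void *), void *ctx)
{
    struct prog_lists *pl = trampoline_lookup_ip(ip);
    long ret;

    for (struct prog *p = pl ? pl->fentries : NULL; p; p = p->next)
        p->run(ctx);
    ret = origin(ctx);
    for (struct prog *p = pl ? pl->fexits : NULL; p; p = p->next)
        p->run(ctx);
    return ret;
}

/* Demo helpers (hypothetical): a counting program and a traced function. */
static int hits;
static void count_hit(void *ctx) { (void)ctx; hits++; }
static long demo_origin(void *ctx) { return *(long *)ctx * 2; }

/* Attach one fentry and one fexit program, then call via the trampoline. */
long demo(void)
{
    struct prog entry = { count_hit, NULL }, exit_prog = { count_hit, NULL };
    struct prog_lists pl = { .ip = (uintptr_t)0x1000 };  /* made-up address */
    long arg = 21, ret;

    memset(buckets, 0, sizeof(buckets));
    hits = 0;
    prog_append(&pl.fentries, &entry);
    prog_append(&pl.fexits, &exit_prog);
    lists_insert(&pl);
    ret = dynamic_trampoline(pl.ip, demo_origin, &arg);
    memset(buckets, 0, sizeof(buckets));  /* drop pointers to locals */
    return ret;
}
```

In this model every call pays for `trampoline_lookup_ip` before any program
runs, which is the overhead the quoted message asks about. The sketch ignores
locking, register saving, and modify_return programs.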
It sounds like you are just recreating the function management that ftrace
already has. ftrace can add thousands of trampolines very quickly, because
it does so in batches. Attaching to fentry takes special synchronization
steps. ftrace (and, I believe, multi-kprobes) updates all the attachments
for each step, so the synchronization needed is only done once.

If you really want to attach to thousands of functions, why not just
register with ftrace itself? It will give you the arguments via the
ftrace_regs structure. Can't you just register a program as the callback?

That will probably make your accounting much easier, and it lets ftrace
handle the fentry logic. That's what it was made to do.
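The batching argument in the reply can be illustrated with a toy model in
userspace C. This is a sketch under assumptions, not ftrace's code:
`expensive_sync`, `struct site`, and the step count `STEPS` are hypothetical
names and values chosen only to show how the number of synchronizations scales
when each patching step is applied to every call site before synchronizing.

```c
#include <stddef.h>

/* Counts simulated system-wide synchronizations (the expensive part). */
static unsigned long sync_count;

static void expensive_sync(void)
{
    sync_count++;
}

#define STEPS 3  /* assumption: number of patching steps per update */

/* A patchable call site; state records how many steps have been applied. */
struct site {
    int state;
};

/* One site at a time: every site pays for every synchronization. */
void attach_each(struct site *sites, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (int step = 0; step < STEPS; step++) {
            sites[i].state = step + 1;
            expensive_sync();
        }
    }
}

/* Batched: apply a step to all sites, then synchronize once for that step. */
void attach_batched(struct site *sites, size_t n)
{
    for (int step = 0; step < STEPS; step++) {
        for (size_t i = 0; i < n; i++)
            sites[i].state = step + 1;
        expensive_sync();
    }
}
```

With 1000 sites, `attach_each` performs 1000 * STEPS synchronizations in this
model, while `attach_batched` performs STEPS regardless of how many sites are
attached.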
-- Steve