[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250730095641.660800b1@gandalf.local.home>
Date: Wed, 30 Jul 2025 09:56:41 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Mark Rutland <mark.rutland@....com>, Steven Rostedt
<rostedt@...nel.org>, Florent Revest <revest@...gle.com>,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
<daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, Menglong Dong
<menglong8.dong@...il.com>, Naveen N Rao <naveen@...nel.org>, Michael
Ellerman <mpe@...erman.id.au>, Björn Töpel
<bjorn@...osinc.com>, Andy Chiu <andybnac@...il.com>, Alexandre Ghiti
<alexghiti@...osinc.com>, Palmer Dabbelt <palmer@...belt.com>
Subject: Re: [RFC 00/10] ftrace,bpf: Use single direct ops for bpf
trampolines
On Wed, 30 Jul 2025 13:19:51 +0200
Jiri Olsa <olsajiri@...il.com> wrote:
> so it's all work on PoC stage, the idea is to be able to attach many
> (like 20,30,40k) functions to their trampolines quickly, which at the
> moment is slow because all the involved interfaces work with just single
> function/tracempoline relation
Sounds like you are reinventing the ftrace mechanism itself. Which I warned
against when I first introduced direct trampolines, which were purposely
designed to do a few functions, not thousands. But, oh well.
> Steven, please correct me if/when I'm wrong ;-)
>
> IIUC in x86_64, IF there's just single ftrace_ops defined for the function,
> it will bypass ftrace trampoline and call directly the direct trampoline
> for the function, like:
>
> <foo>:
> call direct_trampoline
> ...
Yes.
And it will also do the same for normal ftrace functions. If you have:
struct ftrace_ops {
.func = myfunc;
};
It will create a trampoline that has:
<tramp>
...
call myfunc
...
ret
On x86, I believe the ftrace_ops for myfunc is added to the trampoline,
where as in arm, it's part of the function header. To modify it, it
requires converting to the list operation (which ignores the ops
parameter), then the ops at the function gets changed before it goes to the
new function.
And if it is the only ops attached to a function foo, the function foo
would have:
<foo>
call tramp
...
But what's nice about this is that if you have 12 different ftrace_ops that
each attach to a 1000 different functions, but no two ftrace_ops attach to
the same function, they all do the above. No hash needed!
>
> IF there are other ftrace_ops 'users' on the same function, we execute
> each of them like:
>
> <foo>:
> call ftrace_trampoline
> call ftrace_ops_1->func
> call ftrace_ops_2->func
> ...
>
> with our direct ftrace_ops->func currently using ftrace_ops->direct_call
> to return direct trampoline for the function:
>
> -static void call_direct_funcs(unsigned long ip, unsigned long pip,
> - struct ftrace_ops *ops, struct ftrace_regs *fregs)
> -{
> - unsigned long addr = READ_ONCE(ops->direct_call);
> -
> - if (!addr)
> - return;
> -
> - arch_ftrace_set_direct_caller(fregs, addr);
> -}
>
> in the new changes it will do hash lookup (based on ip) for the direct
> trampoline we want to execute:
>
> +static void call_direct_funcs_hash(unsigned long ip, unsigned long pip,
> + struct ftrace_ops *ops, struct ftrace_regs *fregs)
> +{
> + unsigned long addr;
> +
> + addr = ftrace_find_rec_direct(ip);
> + if (!addr)
> + return;
> +
> + arch_ftrace_set_direct_caller(fregs, addr);
> +}
I think the above will work.
>
> still this is the slow path for the case where multiple ftrace_ops objects use
> same function.. for the fast path we have the direct attachment as described above
>
> sorry I probably forgot/missed discussion on this, but doing the fast path like in
> x86_64 is not an option in arm, right?
That's a question for Mark, right?
-- Steve
Powered by blists - more mailing lists