lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250730095641.660800b1@gandalf.local.home>
Date: Wed, 30 Jul 2025 09:56:41 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Mark Rutland <mark.rutland@....com>, Steven Rostedt
 <rostedt@...nel.org>, Florent Revest <revest@...gle.com>,
 bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann
 <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, Menglong Dong
 <menglong8.dong@...il.com>, Naveen N Rao <naveen@...nel.org>, Michael
 Ellerman <mpe@...erman.id.au>, Björn Töpel
 <bjorn@...osinc.com>, Andy Chiu <andybnac@...il.com>, Alexandre Ghiti
 <alexghiti@...osinc.com>, Palmer Dabbelt <palmer@...belt.com>
Subject: Re: [RFC 00/10] ftrace,bpf: Use single direct ops for bpf
 trampolines

On Wed, 30 Jul 2025 13:19:51 +0200
Jiri Olsa <olsajiri@...il.com> wrote:

> so it's all work on PoC stage, the idea is to be able to attach many
> (like 20,30,40k) functions to their trampolines quickly, which at the
> moment is slow because all the involved interfaces work with just single
> function/tracempoline relation

Sounds like you are reinventing the ftrace mechanism itself. Which I warned
against when I first introduced direct trampolines, which were purposely
designed to do a few functions, not thousands. But, oh well.


> Steven, please correct me if/when I'm wrong ;-)
> 
> IIUC in x86_64, IF there's just single ftrace_ops defined for the function,
> it will bypass ftrace trampoline and call directly the direct trampoline
> for the function, like:
> 
>    <foo>:
>      call direct_trampoline
>      ...

Yes.

And it will also do the same for normal ftrace functions. If you have:

struct ftrace_ops {
	.func = myfunc;
};

It will create a trampoline that has:

      <tramp>
	...
	call myfunc
	...
	ret

On x86, I believe the ftrace_ops for myfunc is added to the trampoline,
where as in arm, it's part of the function header. To modify it, it
requires converting to the list operation (which ignores the ops
parameter), then the ops at the function gets changed before it goes to the
new function.

And if it is the only ops attached to a function foo, the function foo
would have:

      <foo>
	call tramp
	...

But what's nice about this is that if you have 12 different ftrace_ops that
each attach to a 1000 different functions, but no two ftrace_ops attach to
the same function, they all do the above. No hash needed!

> 
> IF there are other ftrace_ops 'users' on the same function, we execute
> each of them like:
> 
>   <foo>:
>     call ftrace_trampoline
>       call ftrace_ops_1->func
>       call ftrace_ops_2->func
>       ...
> 
> with our direct ftrace_ops->func currently using ftrace_ops->direct_call
> to return direct trampoline for the function:
> 
> 	-static void call_direct_funcs(unsigned long ip, unsigned long pip,
> 	-                             struct ftrace_ops *ops, struct ftrace_regs *fregs)
> 	-{
> 	-       unsigned long addr = READ_ONCE(ops->direct_call);
> 	-
> 	-       if (!addr)
> 	-               return;
> 	-
> 	-       arch_ftrace_set_direct_caller(fregs, addr);
> 	-}
> 
> in the new changes it will do hash lookup (based on ip) for the direct
> trampoline we want to execute:
> 
> 	+static void call_direct_funcs_hash(unsigned long ip, unsigned long pip,
> 	+                                  struct ftrace_ops *ops, struct ftrace_regs *fregs)
> 	+{
> 	+       unsigned long addr;
> 	+
> 	+       addr = ftrace_find_rec_direct(ip);
> 	+       if (!addr)
> 	+               return;
> 	+
> 	+       arch_ftrace_set_direct_caller(fregs, addr);
> 	+}

I think the above will work.

> 
> still this is the slow path for the case where multiple ftrace_ops objects use
> same function.. for the fast path we have the direct attachment as described above
> 
> sorry I probably forgot/missed discussion on this, but doing the fast path like in
> x86_64 is not an option in arm, right?

That's a question for Mark, right?

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ