linux-kernel - Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically allocated trampolines

Message-ID: <YmF0xYpTMoWOIl00@lakrids>
Date:   Thu, 21 Apr 2022 16:14:13 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Wang ShaoBo <bobo.shaobowang@...wei.com>, cj.chengjian@...wei.com,
        huawei.libin@...wei.com, xiexiuqi@...wei.com, liwei391@...wei.com,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        catalin.marinas@....com, will@...nel.org, zengshun.wu@...look.com
Subject: Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically
 allocated trampolines

On Thu, Apr 21, 2022 at 10:06:39AM -0400, Steven Rostedt wrote:
> On Thu, 21 Apr 2022 14:10:04 +0100
> Mark Rutland <mark.rutland@....com> wrote:
> 
> > On Wed, Mar 16, 2022 at 06:01:31PM +0800, Wang ShaoBo wrote:
> > > From: Cheng Jian <cj.chengjian@...wei.com>
> > > 
> > > When tracing multiple functions customly, a list function is called
> > > in ftrace_(regs)_caller, which makes all the other traced functions
> > > recheck the hash of the ftrace_ops when tracing happened, apparently
> > > it is inefficient.  
> > 
> > ... and when does that actually matter? Who does this and why?
> 
> I don't think it was explained properly. What dynamically allocated
> trampolines give you is this.

Thanks for the explanation, btw!

> Let's say you have 10 ftrace_ops registered (with bpf and kprobes this can
> be quite common). But each of these ftrace_ops traces a function (or
> functions) that are not being traced by the other ftrace_ops. That is, each
> ftrace_ops has its own unique function(s) that they are tracing. One could
> be tracing schedule, the other could be tracing ksoftirqd_should_run
> (whatever).

Ok, so that's when messing around with bpf or kprobes, and not generally
when using plain old ftrace functionality under /sys/kernel/tracing/
(unless that's concurrent with one of the former, as per your other
reply)?

> Without this change, because the arch does not support dynamically
> allocated trampolines, it means that all these ftrace_ops will be
> registered to the same trampoline. That means, for every function that is
> traced, it will loop through all 10 of these ftrace_ops and check their
> hashes to see if their callback should be called or not.

Sure; I can see how that can be quite expensive.
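
To make that cost concrete, here is a rough userspace model of the
shared-trampoline case. It is only a sketch, not kernel code: struct
toy_ops, toy_ops_test() and toy_list_func() are hypothetical names, and
a linear scan stands in for the real per-ops hash lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops; not the kernel definition. */
struct toy_ops {
	const unsigned long *filter;	/* function addresses this ops traces */
	size_t nr_filter;
	unsigned long hits;		/* how often the callback would have run */
};

/* Linear scan standing in for the kernel's per-ops hash lookup. */
static bool toy_ops_test(const struct toy_ops *ops, unsigned long ip)
{
	for (size_t i = 0; i < ops->nr_filter; i++)
		if (ops->filter[i] == ip)
			return true;
	return false;
}

/*
 * Model of the shared "list" function: every traced call walks all
 * registered ops, whether or not they care about this ip.
 * Returns the number of filter checks performed.
 */
static size_t toy_list_func(struct toy_ops *ops, size_t nr_ops,
			    unsigned long ip)
{
	size_t checks = 0;

	assert(ops != NULL || nr_ops == 0);
	for (size_t i = 0; i < nr_ops; i++) {
		checks++;
		if (toy_ops_test(&ops[i], ip))
			ops[i].hits++;	/* stands in for calling ops->func */
	}
	return checks;
}
```

With 10 ops that each trace one distinct function, every traced call
performs 10 filter checks in this model, although only one callback
ever runs.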

What I'm trying to figure out is who this matters to and when, since the
implementation is going to come with a bunch of subtle/fractal
complexities, and likely a substantial overhead too when enabling or
disabling tracing of a patch-site. I'd like to understand the trade-offs
better.

> With dynamically allocated trampolines, each ftrace_ops will have their own
> trampoline, and that trampoline will be called directly if the function
> is only being traced by the one ftrace_ops. This is much more efficient.
> 
> If a function is traced by more than one ftrace_ops, then it falls back to
> the loop.
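
The per-site choice described above can be modelled roughly as follows.
This is a userspace sketch under simplifying assumptions (each toy ops
traces exactly one address); toy_target and toy_pick_target() are
hypothetical names, not anything in the kernel or in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops; not the kernel definition. */
struct toy_ops {
	unsigned long traced_ip;	/* the single function this ops traces */
};

/* Where a patched call site branches to in this model. */
enum toy_target_kind { TOY_NOP, TOY_DIRECT, TOY_LIST };

struct toy_target {
	enum toy_target_kind kind;
	const struct toy_ops *ops;	/* only meaningful for TOY_DIRECT */
};

/*
 * Pick the branch target for the call site at ip:
 *   no ops traces it      -> leave it as a NOP
 *   exactly one ops       -> that ops' own trampoline (direct call)
 *   more than one ops     -> fall back to the shared list loop
 */
static struct toy_target toy_pick_target(const struct toy_ops *ops,
					 size_t nr_ops, unsigned long ip)
{
	struct toy_target t = { TOY_NOP, NULL };

	assert(ops != NULL || nr_ops == 0);
	for (size_t i = 0; i < nr_ops; i++) {
		if (ops[i].traced_ip != ip)
			continue;
		if (t.kind == TOY_NOP) {
			t.kind = TOY_DIRECT;
			t.ops = &ops[i];
		} else {
			t.kind = TOY_LIST;
			t.ops = NULL;
			break;
		}
	}
	return t;
}
```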

I see -- so the dynamic trampoline is just to get the ops? Or is that
doing additional things?

There might be a middle-ground here where we patch the ftrace_ops
pointer into a literal pool at the patch-site, which would allow us to
handle this atomically, and would avoid the issues with out-of-range
trampolines.
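
As a very rough illustration of that idea (a userspace model using C11
atomics, not arm64 code and not a worked-out design), each patch-site
carries a pointer-sized literal naming its ops, and a single common
trampoline loads it. All names here (toy_site, toy_common_trampoline(),
toy_site_set_ops()) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_ops;
typedef void (*toy_func_t)(unsigned long ip, struct toy_ops *ops);

/* Hypothetical stand-in for struct ftrace_ops; not the kernel definition. */
struct toy_ops {
	toy_func_t func;
	unsigned long hits;
};

/* Model of one patch-site: a pointer-sized literal next to the call. */
struct toy_site {
	unsigned long ip;
	_Atomic(struct toy_ops *) ops_literal;
};

/* Example callback: just count invocations. */
static void toy_count(unsigned long ip, struct toy_ops *ops)
{
	(void)ip;
	ops->hits++;
}

/*
 * One common trampoline shared by all sites: it finds the ops by
 * loading the site's literal, so no per-ops trampoline is needed.
 */
static void toy_common_trampoline(struct toy_site *site)
{
	struct toy_ops *ops = atomic_load_explicit(&site->ops_literal,
						   memory_order_acquire);

	if (ops)
		ops->func(site->ip, ops);
}

/* Retargeting a site is a single pointer-sized atomic store. */
static void toy_site_set_ops(struct toy_site *site, struct toy_ops *ops)
{
	assert(site != NULL);
	atomic_store_explicit(&site->ops_literal, ops, memory_order_release);
}
```

In this model a concurrent caller sees either the old ops or the new
one, never a mix, which is the property the literal is meant to provide.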

Thanks,
Mark.