Message-ID: <YmF0xYpTMoWOIl00@lakrids>
Date: Thu, 21 Apr 2022 16:14:13 +0100
From: Mark Rutland <mark.rutland@....com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Wang ShaoBo <bobo.shaobowang@...wei.com>, cj.chengjian@...wei.com,
huawei.libin@...wei.com, xiexiuqi@...wei.com, liwei391@...wei.com,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
catalin.marinas@....com, will@...nel.org, zengshun.wu@...look.com
Subject: Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically
allocated trampolines
On Thu, Apr 21, 2022 at 10:06:39AM -0400, Steven Rostedt wrote:
> On Thu, 21 Apr 2022 14:10:04 +0100
> Mark Rutland <mark.rutland@....com> wrote:
>
> > On Wed, Mar 16, 2022 at 06:01:31PM +0800, Wang ShaoBo wrote:
> > > From: Cheng Jian <cj.chengjian@...wei.com>
> > >
> > > When tracing multiple functions customly, a list function is called
> > > in ftrace_(regs)_caller, which makes all the other traced functions
> > > recheck the hash of the ftrace_ops when tracing happend, apparently
> > > it is inefficient.
> >
> > ... and when does that actually matter? Who does this and why?
>
> I don't think it was explained properly. What dynamically allocated
> trampolines give you is this.
Thanks for the explanation, btw!
> Let's say you have 10 ftrace_ops registered (with bpf and kprobes this can
> be quite common). But each of these ftrace_ops traces a function (or
> functions) that are not being traced by the other ftrace_ops. That is, each
> ftrace_ops has its own unique function(s) that they are tracing. One could
> be tracing schedule, the other could be tracing ksoftirqd_should_run
> (whatever).
Ok, so that's when messing around with bpf or kprobes, and not generally
when using plain old ftrace functionality under /sys/kernel/tracing/
(unless that's concurrent with one of the former, as per your other
reply)?
> Without this change, because the arch does not support dynamically
> allocated trampolines, it means that all these ftrace_ops will be
> registered to the same trampoline. That means, for every function that is
> traced, it will loop through all 10 of theses ftrace_ops and check their
> hashes to see if their callback should be called or not.
Sure; I can see how that can be quite expensive.
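To make that cost concrete, here is a minimal userspace sketch (not kernel code) of the shared-trampoline case described in the quoted text. The toy_ops type and the toy_* helpers are hypothetical stand-ins for struct ftrace_ops and its filter-hash lookup; a single address comparison takes the place of a real hash test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops: each ops traces one address. */
struct toy_ops {
	unsigned long traced_ip;	/* stands in for the ops' filter hash */
	int hits;			/* how many times this ops' callback ran */
	struct toy_ops *next;
};

/* Stand-in for the per-ops hash check. */
static bool toy_ops_test(const struct toy_ops *ops, unsigned long ip)
{
	return ops->traced_ip == ip;
}

/*
 * Shared trampoline: every traced function lands here, so every
 * registered ops is checked, even though only one of them matches.
 * Returns the number of checks performed for this one call.
 */
static int toy_list_func(struct toy_ops *head, unsigned long ip)
{
	int checks = 0;

	for (struct toy_ops *ops = head; ops; ops = ops->next) {
		checks++;
		if (toy_ops_test(ops, ip))
			ops->hits++;
	}
	return checks;
}

/* Link n ops from a caller-provided array; ops[i] traces address i + 1. */
static struct toy_ops *toy_build_list(struct toy_ops *ops, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		ops[i].traced_ip = i + 1;
		ops[i].hits = 0;
		ops[i].next = (i + 1 < n) ? &ops[i + 1] : NULL;
	}
	return &ops[0];
}
```

With ten ops built by toy_build_list(), a call to toy_list_func(head, 7) performs ten checks to run a single callback, which is the per-call overhead in question.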
What I'm trying to figure out is who this matters to and when, since the
implementation is going to come with a bunch of subtle/fractal
complexities, and likely a substantial overhead too when enabling or
disabling tracing of a patch-site. I'd like to understand the trade-offs
better.
> With dynamically allocated trampolines, each ftrace_ops will have their own
> trampoline, and that trampoline will be called directly if the function
> is only being traced by the one ftrace_ops. This is much more efficient.
>
> If a function is traced by more than one ftrace_ops, then it falls back to
> the loop.
I see -- so the dynamic trampoline is just to get the ops? Or is that
doing additional things?
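As a rough illustration of the direct-versus-fallback behaviour in the quoted text, the sketch below decides which path a call site would take based on how many ops trace it. This is a userspace model under assumptions: toy_ops, toy_path and toy_pick_path() are hypothetical names, and the decision is modelled as happening when tracing is enabled or disabled, not on every traced call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops: each ops traces one address. */
struct toy_ops {
	unsigned long traced_ip;
	int hits;
};

enum toy_path {
	TOY_PATH_NONE,		/* nobody traces this address */
	TOY_PATH_DIRECT,	/* exactly one ops: use its own trampoline */
	TOY_PATH_LIST,		/* several ops: fall back to the list loop */
};

/*
 * Pick the path for one call site. On TOY_PATH_DIRECT, *only points at
 * the single ops whose trampoline the site would branch to; otherwise
 * *only is NULL.
 */
static enum toy_path toy_pick_path(const struct toy_ops *ops, size_t n,
				   unsigned long ip,
				   const struct toy_ops **only)
{
	size_t users = 0;

	assert(ops != NULL && only != NULL);
	*only = NULL;

	for (size_t i = 0; i < n; i++) {
		if (ops[i].traced_ip == ip) {
			users++;
			*only = &ops[i];
		}
	}

	if (users == 0)
		return TOY_PATH_NONE;
	if (users == 1)
		return TOY_PATH_DIRECT;

	*only = NULL;
	return TOY_PATH_LIST;
}
```

In this model the expensive scan runs only when the set of ops changes, so a site with a single user pays nothing extra per call.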
There might be a middle-ground here where we patch the ftrace_ops
pointer into a literal pool at the patch-site, which would allow us to
handle this atomically, and would avoid the issues with out-of-range
trampolines.
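One possible reading of that middle-ground, sketched in userspace C11 and not as arm64 code: each patch site carries a pointer-sized literal holding its ftrace_ops, and one common trampoline loads that literal to find the ops to call. Everything named toy_* here is hypothetical; the atomic pointer only models a literal that can be updated with a single store, and nothing below reflects how the kernel would actually lay out or patch the site.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ftrace_ops. */
struct toy_ops {
	void (*func)(struct toy_ops *ops, unsigned long ip);
	int hits;
};

/* Hypothetical model of a patch site with an adjacent literal. */
struct toy_patch_site {
	unsigned long ip;
	_Atomic(struct toy_ops *) literal;	/* ops pointer for this site */
};

/* Example callback: count invocations. */
static void toy_count(struct toy_ops *ops, unsigned long ip)
{
	(void)ip;
	ops->hits++;
}

static void toy_site_init(struct toy_patch_site *site, unsigned long ip)
{
	assert(site != NULL);
	site->ip = ip;
	atomic_init(&site->literal, NULL);
}

/* Retarget the site with one pointer-sized store; no code is rewritten. */
static void toy_site_set_ops(struct toy_patch_site *site, struct toy_ops *ops)
{
	atomic_store_explicit(&site->literal, ops, memory_order_release);
}

/*
 * Common trampoline shared by all sites: load the site's literal and
 * call through it. Returns 1 if a callback ran, 0 if the site is idle.
 */
static int toy_common_trampoline(struct toy_patch_site *site)
{
	struct toy_ops *ops =
		atomic_load_explicit(&site->literal, memory_order_acquire);

	if (!ops)
		return 0;
	ops->func(ops, site->ip);
	return 1;
}
```

Because every site branches to the same trampoline in this model, no per-ops trampoline has to be allocated within branch range of the site, and switching ops is a single store.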
Thanks,
Mark.