linux-kernel - Re: [PATCH v9 00/36] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

Message-ID: <CAEf4BzazmnBOr+sq7w_KeUQpP7v9o+k418tuzCMEbXXbUeg7bQ@mail.gmail.com>
Date: Mon, 29 Apr 2024 13:28:44 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>, Alexei Starovoitov <alexei.starovoitov@...il.com>, 
	Florent Revest <revest@...omium.org>, linux-trace-kernel@...r.kernel.org, 
	LKML <linux-kernel@...r.kernel.org>, Martin KaFai Lau <martin.lau@...ux.dev>, 
	bpf <bpf@...r.kernel.org>, Sven Schnelle <svens@...ux.ibm.com>, 
	Alexei Starovoitov <ast@...nel.org>, Jiri Olsa <jolsa@...nel.org>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Alan Maguire <alan.maguire@...cle.com>, Mark Rutland <mark.rutland@....com>, 
	Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, Guo Ren <guoren@...nel.org>
Subject: Re: [PATCH v9 00/36] tracing: fprobe: function_graph: Multi-function
 graph and fprobe on fgraph

On Sun, Apr 28, 2024 at 4:25 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Thu, 25 Apr 2024 13:31:53 -0700
> Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
>
> I'm just coming back from Japan (work and then a vacation), and
> catching up on my email during the 6 hour layover in Detroit.
>
> > Hey Masami,
> >
> > I can't really review most of that code as I'm completely unfamiliar
> > with all those inner workings of fprobe/ftrace/function_graph. I left
> > a few comments where there were somewhat more obvious BPF-related
> > pieces.
> >
> > But I also did run our BPF benchmarks on probes/for-next as a baseline
> > and then with your series applied on top. Just to see if there are any
> > regressions. I think it will be a useful data point for you.
> >
> > You should be already familiar with the bench tool we have in BPF
> > selftests (I used it on some other patches for your tree).
>
> I should get familiar with your tools too.
>

It's a nifty, self-contained tool for micro-benchmarking. I replied
to Masami with a few details on how to build and use it.

> >
> > BASELINE
> > ========
> > kprobe         :   24.634 ± 0.205M/s
> > kprobe-multi   :   28.898 ± 0.531M/s
> > kretprobe      :   10.478 ± 0.015M/s
> > kretprobe-multi:   11.012 ± 0.063M/s
> >
> > THIS PATCH SET ON TOP
> > =====================
> > kprobe         :   25.144 ± 0.027M/s (+2%)
> > kprobe-multi   :   28.909 ± 0.074M/s
> > kretprobe      :    9.482 ± 0.008M/s (-9.5%)
> > kretprobe-multi:   13.688 ± 0.027M/s (+24%)
> >
> > These numbers are pretty stable and look to be more or less representative.
>
> Thanks for running this.
>
> >
> > As you can see, kprobes got a bit faster, kprobe-multi seems to be
> > about the same, though.
> >
> > Then (I suppose they are "legacy") kretprobes got quite noticeably
> > slower, almost by 10%. Not sure why, but looks real after re-running
> > benchmarks a bunch of times and getting stable results.
> >
> > On the other hand, multi-kretprobes got significantly faster (+24%!).
> > Again, I don't know if it is expected or not, but it's a nice
> > improvement.
> >
> > If you have any idea why kretprobes would get so much slower, it would
> > be nice to look into that and see if you can mitigate the regression
> > somehow. Thanks!
>
> My guess is that this patch set helps generic use cases for tracing the
> return of functions, but will likely add more overhead for single use
> cases. That is, kretprobe is made to be specific for a single function,
> but kretprobe-multi is more generic. Hence the generic version will
> improve at the sacrifice of the specific function. I did expect as much.
>
> That said, I think there's probably a lot of low hanging fruit that can
> be done to this series to help improve the kretprobe performance. I'm
> not sure we can get back to the baseline, but I'm hoping we can at
> least make it much better than that 10% slowdown.

That would certainly be appreciated, thanks!

But I'm also considering switching to multi-kprobe/kretprobe
automatically on the libbpf side, whenever possible, so that users get
the best performance. There might still be situations where this can't
be done, so singular kprobe/kretprobe can't be completely deprecated,
but the multi variants seem to be universally faster, so I'm going to
make them the default (I need to handle some backwards compat aspects,
but that's libbpf-specific stuff you shouldn't be concerned with).
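
As a side note, for anyone who wants to double-check the percentages
in the benchmark tables quoted earlier, below is a minimal sketch in
Python. The throughput values are copied from those tables; the
dictionary and function names are just illustrative and are not part
of the bench tool.

```python
# Throughput in M/s, copied from the BASELINE and
# "THIS PATCH SET ON TOP" tables quoted earlier in this message.
baseline = {
    "kprobe": 24.634,
    "kprobe-multi": 28.898,
    "kretprobe": 10.478,
    "kretprobe-multi": 11.012,
}
patched = {
    "kprobe": 25.144,
    "kprobe-multi": 28.909,
    "kretprobe": 9.482,
    "kretprobe-multi": 13.688,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Illustrative helper structure: per-benchmark relative change.
changes = {name: percent_change(baseline[name], patched[name])
           for name in baseline}

for name, delta in changes.items():
    print(f"{name:<16}: {delta:+.1f}%")
```

This only compares the reported means; it ignores the ± spread shown
next to each number.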

>
> I'll be reviewing this patch set this week as I recover from jetlag.
>
> -- Steve