[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM9d7ci7-oMJXg6QRb_9YGg49EAWNPS8CkFLF4Wt8Gazq1v90w@mail.gmail.com>
Date: Sun, 12 Dec 2021 23:23:07 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Song Liu <songliubraving@...com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Andi Kleen <ak@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
Stephane Eranian <eranian@...gle.com>,
Changbin Du <changbin.du@...il.com>
Subject: Re: [PATCH 4/5] perf ftrace: Add -b/--use-bpf option for latency subcommand
On Tue, Dec 7, 2021 at 10:00 AM Namhyung Kim <namhyung@...nel.org> wrote:
>
> Hi Song,
>
> On Mon, Dec 6, 2021 at 5:06 PM Song Liu <songliubraving@...com> wrote:
> >
> >
> >
> > > On Nov 29, 2021, at 3:18 PM, Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > The -b/--use-bpf option is to use BPF to get latency info of kernel
> > > functions. It'd have better performance impact and I observed that
> > > latency of same function is smaller than before when using BPF.
> > >
> > > Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> > > ---
> >
> > We can actually get something similar with a bpftrace one-liner, like:
> >
> > bpftrace -e 'kprobe:mutex_lock { @start[tid] = nsecs; } kretprobe:mutex_lock /@...rt[tid] != 0/ { @delay = hist(nsecs - @start[tid]); delete(@start[tid]); } END {clear(@start); }'
> > Attaching 3 probes...
> > ^C
> >
> > @delay:
> > [256, 512) 1553006 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> > [512, 1K) 89171 |@@ |
> > [1K, 2K) 37522 |@ |
> > [2K, 4K) 3308 | |
> > [4K, 8K) 415 | |
> > [8K, 16K) 38 | |
> > [16K, 32K) 47 | |
> > [32K, 64K) 2 | |
> > [64K, 128K) 0 | |
> > [128K, 256K) 0 | |
> > [256K, 512K) 0 | |
> > [512K, 1M) 0 | |
> > [1M, 2M) 0 | |
> > [2M, 4M) 0 | |
> > [4M, 8M) 1 | |
> >
> >
> > So I am not quite sure whether we need this for systems with BPF features.
>
> Yeah, bpftrace can do this too but there are situations one cannot
> use the tool for some reason. On the other hand, we have been
> using perf tools widely to collect performance profiles on the fleet.
>
> So it'd be really nice if we can use it as a vehicle to carry various
> innovative features using BPF. I plan to add more functionalities
> in BCC/bpftrace to the perf tools in this regard.
>
>
> >
> > Other than this, a few comments and nitpicks below.
> >
> > > diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> > > index 2e5bfbb69960..294b12430d73 100644
> > > --- a/tools/perf/util/Build
> > > +++ b/tools/perf/util/Build
> > > @@ -144,6 +144,7 @@ perf-$(CONFIG_LIBBPF) += bpf-loader.o
> > > perf-$(CONFIG_LIBBPF) += bpf_map.o
> > > perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
> > > perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter_cgroup.o
> > > +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_ftrace.o
> > > perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
> > > perf-$(CONFIG_LIBELF) += symbol-elf.o
> > > perf-$(CONFIG_LIBELF) += probe-file.o
> > > diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
> > > new file mode 100644
> > > index 000000000000..1975a6fe73c9
> > > --- /dev/null
> > > +++ b/tools/perf/util/bpf_ftrace.c
> > > @@ -0,0 +1,113 @@
> > > +#include <stdio.h>
> > > +#include <fcntl.h>
> > > +#include <stdint.h>
> > > +#include <stdlib.h>
> > > +
> > > +#include <linux/err.h>
> > > +
> > > +#include "util/ftrace.h"
> > > +#include "util/debug.h"
> > > +#include "util/bpf_counter.h"
> > > +
> > > +#include "util/bpf_skel/func_latency.skel.h"
> > > +
> > > +static struct func_latency_bpf *skel;
> > > +
> > > +int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
> > > +{
> > > + int fd, err;
> > > + struct filter_entry *func;
> > > + struct bpf_link *begin_link, *end_link;
> > > +
> > > + if (!list_is_singular(&ftrace->filters)) {
> > > + pr_err("ERROR: %s target function(s).\n",
> > > + list_empty(&ftrace->filters) ? "No" : "Too many");
> > > + return -1;
> > > + }
> > > +
> > > + func = list_first_entry(&ftrace->filters, struct filter_entry, list);
> > > +
> > > + skel = func_latency_bpf__open();
> > > + if (!skel) {
> > > + pr_err("Failed to open func latency skeleton\n");
> > > + return -1;
> > > + }
> > > +
> > > + set_max_rlimit();
> > > +
> > > + err = func_latency_bpf__load(skel);
> >
> > We can do func_latency_bpf__open_and_load() to save a few lines.
>
> Sure, but I was thinking to add some modifications between
> them like in patch 5/5.
>
> >
> > > + if (err) {
> > > + pr_err("Failed to load func latency skeleton\n");
> > > + goto out;
> > > + }
> > > +
> > > + begin_link = bpf_program__attach_kprobe(skel->progs.func_begin,
> > > + false, func->name);
> > > + if (IS_ERR(begin_link)) {
> > > + pr_err("Failed to attach fentry program\n");
> > > + err = PTR_ERR(begin_link);
> > > + goto out;
> > > + }
> > > +
> > > + end_link = bpf_program__attach_kprobe(skel->progs.func_end,
> > > + true, func->name);
> > > + if (IS_ERR(end_link)) {
> > > + pr_err("Failed to attach fexit program\n");
> > > + err = PTR_ERR(end_link);
> > > + bpf_link__destroy(begin_link);
> > > + goto out;
> > > + }
> >
> > I think we are leaking begin_link and end_link here? (They will be released
> > on perf termination, but we are not freeing them in the code).
>
> Right, I'll keep them and destroy at the end.
Oh, it seems I can just use skel->links.func_begin (and _end).
>
> >
> > [...]
> >
> > > diff --git a/tools/perf/util/bpf_skel/func_latency.bpf.c b/tools/perf/util/bpf_skel/func_latency.bpf.c
> > > new file mode 100644
> > > index 000000000000..d7d31cfeabf8
> > > --- /dev/null
> > > +++ b/tools/perf/util/bpf_skel/func_latency.bpf.c
> > > @@ -0,0 +1,92 @@
> > > +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > +// Copyright (c) 2021 Google
> > > +#include "vmlinux.h"
> > > +#include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > > +
> > > +#define NUM_BUCKET 22
> >
> > We define NUM_BUCKET twice, which might cause issue when we change it.
> > Maybe just use bpf_map__set_max_entries() in user space?
It's also used for the loop count so I need to keep it..
Maybe I can add a comment to inform that it should be in sync
with the userspace.
Thanks,
Namhyung
Powered by blists - more mailing lists