linux-kernel - Re: [PATCH 4/5] perf ftrace: Add -b/--use-bpf option for latency subcommand

Message-ID: <738F7D58-264D-48A2-9A83-E7D126A50471@fb.com>
Date:   Tue, 7 Dec 2021 01:05:43 +0000
From:   Song Liu <songliubraving@...com>
To:     Namhyung Kim <namhyung@...nel.org>
CC:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Ian Rogers <irogers@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        Changbin Du <changbin.du@...il.com>
Subject: Re: [PATCH 4/5] perf ftrace: Add -b/--use-bpf option for latency
 subcommand



> On Nov 29, 2021, at 3:18 PM, Namhyung Kim <namhyung@...nel.org> wrote:
> 
> The -b/--use-bpf option is to use BPF to get latency info of kernel
> functions.  It'd have better performance impact and I observed that
> latency of same function is smaller than before when using BPF.
> 
> Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> ---

We can actually get something similar with a bpftrace one-liner, like:

bpftrace -e 'kprobe:mutex_lock { @start[tid] = nsecs; } kretprobe:mutex_lock /@start[tid] != 0/ { @delay = hist(nsecs - @start[tid]); delete(@start[tid]); } END {clear(@start); }'
Attaching 3 probes...
^C

@delay:
[256, 512)       1553006 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[512, 1K)          89171 |@@                                                  |
[1K, 2K)           37522 |@                                                   |
[2K, 4K)            3308 |                                                    |
[4K, 8K)             415 |                                                    |
[8K, 16K)             38 |                                                    |
[16K, 32K)            47 |                                                    |
[32K, 64K)             2 |                                                    |
[64K, 128K)            0 |                                                    |
[128K, 256K)           0 |                                                    |
[256K, 512K)           0 |                                                    |
[512K, 1M)             0 |                                                    |
[1M, 2M)               0 |                                                    |
[2M, 4M)               0 |                                                    |
[4M, 8M)               1 |                                                    |
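
For readers not used to this output: each row is a power-of-two range of nanoseconds. A rough sketch of that kind of bucketing is below. The helper names log2_bucket() and bucket_lower_bound() are made up for illustration, and this is a simplification, not bpftrace's actual implementation (which handles 0 and negative values separately).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from bpftrace or from the patch:
 * return k such that delta_ns falls in [2^k, 2^(k+1)).
 * Values 0 and 1 both land in bucket 0 in this simplified version.
 */
static unsigned int log2_bucket(uint64_t delta_ns)
{
	unsigned int k = 0;

	while (delta_ns >>= 1)
		k++;
	return k;
}

/* Lower bound (in ns) of bucket k, i.e. the left edge printed per row. */
static uint64_t bucket_lower_bound(unsigned int k)
{
	assert(k < 64);
	return (uint64_t)1 << k;
}
```

With this, a 300 ns sample lands in the bucket whose lower bound is 256, which corresponds to the [256, 512) row above.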


So I am not quite sure we need this option on systems that already have BPF features available.

Other than this, a few comments and nitpicks below. 

> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index 2e5bfbb69960..294b12430d73 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -144,6 +144,7 @@ perf-$(CONFIG_LIBBPF) += bpf-loader.o
> perf-$(CONFIG_LIBBPF) += bpf_map.o
> perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
> perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter_cgroup.o
> +perf-$(CONFIG_PERF_BPF_SKEL) += bpf_ftrace.o
> perf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
> perf-$(CONFIG_LIBELF) += symbol-elf.o
> perf-$(CONFIG_LIBELF) += probe-file.o
> diff --git a/tools/perf/util/bpf_ftrace.c b/tools/perf/util/bpf_ftrace.c
> new file mode 100644
> index 000000000000..1975a6fe73c9
> --- /dev/null
> +++ b/tools/perf/util/bpf_ftrace.c
> @@ -0,0 +1,113 @@
> +#include <stdio.h>
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <stdlib.h>
> +
> +#include <linux/err.h>
> +
> +#include "util/ftrace.h"
> +#include "util/debug.h"
> +#include "util/bpf_counter.h"
> +
> +#include "util/bpf_skel/func_latency.skel.h"
> +
> +static struct func_latency_bpf *skel;
> +
> +int perf_ftrace__latency_prepare_bpf(struct perf_ftrace *ftrace)
> +{
> +	int fd, err;
> +	struct filter_entry *func;
> +	struct bpf_link *begin_link, *end_link;
> +
> +	if (!list_is_singular(&ftrace->filters)) {
> +		pr_err("ERROR: %s target function(s).\n",
> +		       list_empty(&ftrace->filters) ? "No" : "Too many");
> +		return -1;
> +	}
> +
> +	func = list_first_entry(&ftrace->filters, struct filter_entry, list);
> +
> +	skel = func_latency_bpf__open();
> +	if (!skel) {
> +		pr_err("Failed to open func latency skeleton\n");
> +		return -1;
> +	}
> +
> +	set_max_rlimit();
> +
> +	err = func_latency_bpf__load(skel);

We can use func_latency_bpf__open_and_load() here to save a few lines
(with set_max_rlimit() moved before it).

> +	if (err) {
> +		pr_err("Failed to load func latency skeleton\n");
> +		goto out;
> +	}
> +
> +	begin_link = bpf_program__attach_kprobe(skel->progs.func_begin,
> +						 false, func->name);
> +	if (IS_ERR(begin_link)) {
> +		pr_err("Failed to attach fentry program\n");
> +		err = PTR_ERR(begin_link);
> +		goto out;
> +	}
> +
> +	end_link = bpf_program__attach_kprobe(skel->progs.func_end,
> +					      true, func->name);
> +	if (IS_ERR(end_link)) {
> +		pr_err("Failed to attach fexit program\n");
> +		err = PTR_ERR(end_link);
> +		bpf_link__destroy(begin_link);
> +		goto out;
> +	}

I think we are leaking begin_link and end_link here. They are released
when perf terminates, but the code never frees them explicitly.
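
One way to avoid this is to let the skeleton own the links: a generated skeleton has a links slot per program, and func_latency_bpf__destroy() destroys whatever is stored there. The sketch below only models that ownership pattern with stand-in types so it can be compiled on its own. Everything prefixed with fake_ is hypothetical and is not libbpf API; I am also assuming the elided part of the patch calls the skeleton destroy function on cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for struct bpf_link and the generated skeleton (hypothetical). */
struct fake_link {
	int *live;	/* counter of currently attached links */
};

struct fake_skel {
	struct {
		struct fake_link *func_begin;
		struct fake_link *func_end;
	} links;
};

/* Models an attach call: allocates a link and counts it as live. */
static struct fake_link *fake_attach(int *live)
{
	struct fake_link *link;

	assert(live != NULL);
	link = malloc(sizeof(*link));
	if (!link)
		return NULL;
	link->live = live;
	(*live)++;
	return link;
}

static void fake_link_destroy(struct fake_link *link)
{
	if (!link)
		return;
	(*link->live)--;
	free(link);
}

/* Models skeleton teardown: every link slot is released in one place. */
static void fake_skel_destroy(struct fake_skel *skel)
{
	fake_link_destroy(skel->links.func_begin);
	fake_link_destroy(skel->links.func_end);
	skel->links.func_begin = NULL;
	skel->links.func_end = NULL;
}

/*
 * Store each link in the skeleton as soon as it is attached, so no local
 * pointer is lost on any path. On failure the caller destroys the skeleton,
 * which also frees a func_begin link that was already attached.
 */
static int fake_prepare(struct fake_skel *skel, int *live)
{
	skel->links.func_begin = fake_attach(live);
	if (!skel->links.func_begin)
		return -1;

	skel->links.func_end = fake_attach(live);
	if (!skel->links.func_end)
		return -1;

	return 0;
}
```

In the real code the equivalent would be assigning the attach results to skel->links.func_begin and skel->links.func_end instead of local variables, which also removes the need for the explicit bpf_link__destroy(begin_link) in the error path.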

[...]

> diff --git a/tools/perf/util/bpf_skel/func_latency.bpf.c b/tools/perf/util/bpf_skel/func_latency.bpf.c
> new file mode 100644
> index 000000000000..d7d31cfeabf8
> --- /dev/null
> +++ b/tools/perf/util/bpf_skel/func_latency.bpf.c
> @@ -0,0 +1,92 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +// Copyright (c) 2021 Google
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#define NUM_BUCKET  22

We define NUM_BUCKET twice, which may cause problems if only one of the
definitions gets updated. Maybe just use bpf_map__set_max_entries() in
user space?

[...]