[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXCt4axmdjqDZXZzjW-gZK_KZ8R--OTR-M1sZYxZ94k5A@mail.gmail.com>
Date: Thu, 15 Aug 2024 11:28:17 -0700
From: Ian Rogers <irogers@...gle.com>
To: Howard Chu <howardchu95@...il.com>
Cc: acme@...nel.org, adrian.hunter@...el.com, jolsa@...nel.org,
kan.liang@...ux.intel.com, namhyung@...nel.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 01/10] perf trace: Fix perf trace -p <PID>
On Wed, Aug 14, 2024 at 6:36 PM Howard Chu <howardchu95@...il.com> wrote:
>
> perf trace -p <PID> work on a syscall that is unaugmented, but doesn't
> work on a syscall that's augmented (when it calls perf_event_output() in
> BPF).
>
> Let's take open() as an example. open() is augmented in perf trace.
>
> Before:
> ```
> perf $ perf trace -e open -p 3792392
> ? ( ): ... [continued]: open()) = -1 ENOENT (No such file or directory)
> ? ( ): ... [continued]: open()) = -1 ENOENT (No such file or directory)
> ```
>
> We can see there's no output.
>
> After:
> ```
> perf $ perf trace -e open -p 3792392
> 0.000 ( 0.123 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)
> 1000.398 ( 0.116 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY) = -1 ENOENT (No such file or directory)
> ```
>
> Reason:
>
> bpf_perf_event_output() will fail when you specify a pid in perf trace (EOPNOTSUPP).
>
> When using perf trace -p 114, before perf_event_open(), we'll have PID
> = 114, and CPU = -1.
>
> This is bad for bpf-output event, because the ring buffer won't accept
> output from BPF's perf_event_output(), making it fail. I'm still trying
> to find out why.
>
> If we open bpf-output for every cpu, instead of setting it to -1, like
> this:
> PID = <PID>, CPU = 0
> PID = <PID>, CPU = 1
> PID = <PID>, CPU = 2
> PID = <PID>, CPU = 3
> ...
>
> Everything works.
>
> You can test it with this script:
> ```
> #include <unistd.h>
> #include <sys/syscall.h>
>
> int main()
> {
> int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
> char s1[] = "DINGZHEN", s2[] = "XUEBAO";
>
> while (1) {
> syscall(SYS_open, s1, i1, i2);
> sleep(1);
> }
>
> return 0;
> }
> ```
>
> save, compile
> ```
> gcc open.c
> ```
>
> perf trace
> ```
> perf trace -e open <path-to-the-executable>
> ```
>
> Signed-off-by: Howard Chu <howardchu95@...il.com>
> ---
> tools/perf/util/evlist.c | 14 +++++++++++++-
> tools/perf/util/evlist.h | 1 +
> 2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index c0379fa1ccfe..f14b7e6ff1dc 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1067,7 +1067,7 @@ int evlist__create_maps(struct evlist *evlist, struct target *target)
> if (!threads)
> return -1;
>
> - if (target__uses_dummy_map(target))
> + if (target__uses_dummy_map(target) && !evlist__has_bpf_output(evlist))
I'm wondering if there should be a comment above this, something like:
Do we need the "any"/-1 CPU value or do we need to open an event on
all target CPUs (the default NULL implies all online CPUs). The dummy
map indicates that we need sideband perf record events in order to
symbolize samples. The mmap-ed ring buffers need CPUs. Similarly BPF
output events need ring buffers.
I'm not 100% on the ring buffer perf_event_open restrictions, and the
man page doesn't cover it, my knowledge is implied from failures and
seeing what the tool does.
Thanks,
Ian
> cpus = perf_cpu_map__new_any_cpu();
> else
> cpus = perf_cpu_map__new(target->cpu_list);
> @@ -2627,3 +2627,15 @@ void evlist__uniquify_name(struct evlist *evlist)
> }
> }
> }
> +
> +bool evlist__has_bpf_output(struct evlist *evlist)
> +{
> + struct evsel *evsel;
> +
> + evlist__for_each_entry(evlist, evsel) {
> + if (evsel__is_bpf_output(evsel))
> + return true;
> + }
> +
> + return false;
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index b46f1a320783..bcc1c6984bb5 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -447,5 +447,6 @@ int evlist__scnprintf_evsels(struct evlist *evlist, size_t size, char *bf);
> void evlist__check_mem_load_aux(struct evlist *evlist);
> void evlist__warn_user_requested_cpus(struct evlist *evlist, const char *cpu_list);
> void evlist__uniquify_name(struct evlist *evlist);
> +bool evlist__has_bpf_output(struct evlist *evlist);
>
> #endif /* __PERF_EVLIST_H */
> --
> 2.45.2
>
Powered by blists - more mailing lists