lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXCt4axmdjqDZXZzjW-gZK_KZ8R--OTR-M1sZYxZ94k5A@mail.gmail.com>
Date: Thu, 15 Aug 2024 11:28:17 -0700
From: Ian Rogers <irogers@...gle.com>
To: Howard Chu <howardchu95@...il.com>
Cc: acme@...nel.org, adrian.hunter@...el.com, jolsa@...nel.org, 
	kan.liang@...ux.intel.com, namhyung@...nel.org, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 01/10] perf trace: Fix perf trace -p <PID>

On Wed, Aug 14, 2024 at 6:36 PM Howard Chu <howardchu95@...il.com> wrote:
>
> perf trace -p <PID> work on a syscall that is unaugmented, but doesn't
> work on a syscall that's augmented (when it calls perf_event_output() in
> BPF).
>
> Let's take open() as an example. open() is augmented in perf trace.
>
> Before:
> ```
> perf $ perf trace -e open -p 3792392
>          ? (         ):  ... [continued]: open())                                             = -1 ENOENT (No such file or directory)
>          ? (         ):  ... [continued]: open())                                             = -1 ENOENT (No such file or directory)
> ```
>
> We can see there's no output.
>
> After:
> ```
> perf $ perf trace -e open -p 3792392
>      0.000 ( 0.123 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY)                             = -1 ENOENT (No such file or directory)
>   1000.398 ( 0.116 ms): a.out/3792392 open(filename: "DINGZHEN", flags: WRONLY)                             = -1 ENOENT (No such file or directory)
> ```
>
> Reason:
>
> bpf_perf_event_output() will fail when you specify a pid in perf trace (EOPNOTSUPP).
>
> When using perf trace -p 114, before perf_event_open(), we'll have PID
> = 114, and CPU = -1.
>
> This is bad for bpf-output event, because the ring buffer won't accept
> output from BPF's perf_event_output(), making it fail. I'm still trying
> to find out why.
>
> If we open bpf-output for every cpu, instead of setting it to -1, like
> this:
> PID = <PID>, CPU = 0
> PID = <PID>, CPU = 1
> PID = <PID>, CPU = 2
> PID = <PID>, CPU = 3
> ...
>
> Everything works.
>
> You can test it with this script:
> ```
>  #include <unistd.h>
>  #include <sys/syscall.h>
>
> int main()
> {
>         int i1 = 1, i2 = 2, i3 = 3, i4 = 4;
>         char s1[] = "DINGZHEN", s2[] = "XUEBAO";
>
>         while (1) {
>                 syscall(SYS_open, s1, i1, i2);
>                 sleep(1);
>         }
>
>         return 0;
> }
> ```
>
> save, compile
> ```
> gcc open.c
> ```
>
> perf trace
> ```
> perf trace -e open <path-to-the-executable>
> ```
>
> Signed-off-by: Howard Chu <howardchu95@...il.com>
> ---
>  tools/perf/util/evlist.c | 14 +++++++++++++-
>  tools/perf/util/evlist.h |  1 +
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index c0379fa1ccfe..f14b7e6ff1dc 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1067,7 +1067,7 @@ int evlist__create_maps(struct evlist *evlist, struct target *target)
>         if (!threads)
>                 return -1;
>
> -       if (target__uses_dummy_map(target))
> +       if (target__uses_dummy_map(target) && !evlist__has_bpf_output(evlist))

I'm wondering if there should be a comment above this, something like:

Do we need the "any"/-1 CPU value or do we need to open an event on
all target CPUs (the default NULL implies all online CPUs). The dummy
map indicates that we need sideband perf record events in order to
symbolize samples. The mmap-ed ring buffers need CPUs. Similarly BPF
output events need ring buffers.

I'm not 100% on the ring buffer perf_event_open restrictions, and the
man page doesn't cover it, my knowledge is implied from failures and
seeing what the tool does.

Thanks,
Ian


>                 cpus = perf_cpu_map__new_any_cpu();
>         else
>                 cpus = perf_cpu_map__new(target->cpu_list);
> @@ -2627,3 +2627,15 @@ void evlist__uniquify_name(struct evlist *evlist)
>                 }
>         }
>  }
> +
> +bool evlist__has_bpf_output(struct evlist *evlist)
> +{
> +       struct evsel *evsel;
> +
> +       evlist__for_each_entry(evlist, evsel) {
> +               if (evsel__is_bpf_output(evsel))
> +                       return true;
> +       }
> +
> +       return false;
> +}
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index b46f1a320783..bcc1c6984bb5 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -447,5 +447,6 @@ int evlist__scnprintf_evsels(struct evlist *evlist, size_t size, char *bf);
>  void evlist__check_mem_load_aux(struct evlist *evlist);
>  void evlist__warn_user_requested_cpus(struct evlist *evlist, const char *cpu_list);
>  void evlist__uniquify_name(struct evlist *evlist);
> +bool evlist__has_bpf_output(struct evlist *evlist);
>
>  #endif /* __PERF_EVLIST_H */
> --
> 2.45.2
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ