linux-kernel - Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <A6A0ED56-9CC0-4361-BE7F-CDA7A40A9822@fb.com>
Date:   Tue, 29 Dec 2020 05:53:47 +0000
From:   Song Liu <songliubraving@...com>
To:     Arnaldo Carvalho de Melo <acme@...nel.org>
CC:     lkml <linux-kernel@...r.kernel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "alexander.shishkin@...ux.intel.com" 
        <alexander.shishkin@...ux.intel.com>,
        "namhyung@...nel.org" <namhyung@...nel.org>,
        "mark.rutland@....com" <mark.rutland@....com>,
        "jolsa@...hat.com" <jolsa@...hat.com>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v6 3/4] perf-stat: enable counting events for BPF programs



> On Dec 28, 2020, at 3:43 PM, Song Liu <songliubraving@...com> wrote:
> 
> 
> 
>> On Dec 28, 2020, at 12:11 PM, Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
>> 
>> Em Mon, Dec 28, 2020 at 09:40:53AM -0800, Song Liu escreveu:
>>> Introduce perf-stat -b option, which counts events for BPF programs, like:
>>> 
>>> [root@...alhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
>>>    1.487903822            115,200      ref-cycles
>>>    1.487903822             86,012      cycles
>>>    2.489147029             80,560      ref-cycles
>>>    2.489147029             73,784      cycles
>>>    3.490341825             60,720      ref-cycles
>>>    3.490341825             37,797      cycles
>>>    4.491540887             37,120      ref-cycles
>>>    4.491540887             31,963      cycles
>>> 
>>> The example above counts cycles and ref-cycles of BPF program of id 254.
>>> This is similar to bpftool-prog-profile command, but more flexible.
>>> 
>>> perf-stat -b creates per-cpu perf_event and loads fentry/fexit BPF
>>> programs (monitor-progs) to the target BPF program (target-prog). The
>>> monitor-progs read perf_event before and after the target-prog, and
>>> aggregate the difference in a BPF map. Then the user space reads data
>>> from these maps.
>>> 
>>> A new struct bpf_counter is introduced to provide common interface that
>>> uses BPF programs/maps to count perf events.
>> 
>> Segfaulting here:
>> 
>> [root@...e ~]# bpftool prog  | grep tracepoint
>> 110: tracepoint  name syscall_unaugme  tag 57cd311f2e27366b  gpl
>> 111: tracepoint  name sys_enter_conne  tag 3555418ac9476139  gpl
>> 112: tracepoint  name sys_enter_sendt  tag bc7fcadbaf7b8145  gpl
>> 113: tracepoint  name sys_enter_open  tag 0e59c3ac2bea5280  gpl
>> 114: tracepoint  name sys_enter_opena  tag 0baf443610f59837  gpl
>> 115: tracepoint  name sys_enter_renam  tag 24664e4aca62d7fa  gpl
>> 116: tracepoint  name sys_enter_renam  tag 20093e51a8634ebb  gpl
>> 117: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
>> 118: tracepoint  name sys_exit  tag 29c7ae234d79bd5c  gpl
>> [root@...e ~]#
>> [root@...e ~]# gdb perf
>> GNU gdb (GDB) Fedora 10.1-2.fc33
>> Reading symbols from perf...
>> (gdb) run stat -e instructions,cycles -b 113 -I 1000
>> Starting program: /root/bin/perf stat -e instructions,cycles -b 113 -I 1000
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>> libbpf: elf: skipping unrecognized data section(9) .eh_frame
>> libbpf: elf: skipping relo section(15) .rel.eh_frame for section(9) .eh_frame
>> 
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>> 217			reading_map_fd = bpf_map__fd(skel->maps.accum_readings);
>> (gdb) bt
>> #0  0x000000000058d55b in bpf_program_profiler__read (evsel=0xc612c0) at util/bpf_counter.c:217
>> #1  0x0000000000000000 in ?? ()
>> (gdb)
>> 
>> [acme@...e perf]$ clang -v |& head -2
>> clang version 11.0.0 (Fedora 11.0.0-2.fc33)
>> Target: x86_64-unknown-linux-gnu
>> [acme@...e perf]$
>> 
>> Do you need any extra info?
> 
> Hmm... I am not able to reproduce this. I am trying to setup an environment similar
> to fc33 (clang 11, etc.). Does this segfault every time, and on all programs? 
> 

I tried it on CentOS Stream release 8, with 

  gcc version 8.4.1 20200928 (Red Hat 8.4.1-1) (GCC)
  clang version 11.0.0 (Red Hat 11.0.0-0.2.rc2.module_el8.4.0+533+50191577)

Unfortunately, I still cannot repro it. 

I didn't find the issue while looking through the code. AFAICS, the code
fail over when the skeleton is not ready, so bpf_program_profiler__read()
should find a valid skeleton. 

Could you please help run the test with the following patch? The patch is 
also available at 

    https://git.kernel.org/pub/scm/linux/kernel/git/song/linux.git perf-dash-b

Thanks,
Song


diff --git i/tools/perf/util/bpf_counter.c w/tools/perf/util/bpf_counter.c
index f2cb86a40c882..e09c571365b56 100644
--- i/tools/perf/util/bpf_counter.c
+++ w/tools/perf/util/bpf_counter.c
@@ -46,8 +46,10 @@ static int bpf_program_profiler__destroy(struct evsel *evsel)
 {
        struct bpf_counter *counter;

-       list_for_each_entry(counter, &evsel->bpf_counter_list, list)
+       list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
                bpf_prog_profiler_bpf__destroy(counter->skel);
+       }
        INIT_LIST_HEAD(&evsel->bpf_counter_list);
        return 0;
 }
@@ -141,8 +143,14 @@ static int bpf_program_profiler_load_one(struct evsel *evsel, u32 prog_id)
        counter->skel = skel;
        list_add(&counter->list, &evsel->bpf_counter_list);
        close(prog_fd);
+       pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+       pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+       pr_debug("%s return 0\n", __func__);
        return 0;
 err_out:
+       pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+       pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+       pr_debug("%s return -1\n", __func__);
        free(counter);
        close(prog_fd);
        return -1;
@@ -214,11 +222,22 @@ static int bpf_program_profiler__read(struct evsel *evsel)
        list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
                struct bpf_prog_profiler_bpf *skel = counter->skel;

+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+               pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
+               if (!skel) {
+                       pr_err("%s !skel\n", __func__);
+                       continue;
+               }
+               if (!skel->maps.accum_readings) {
+                       pr_err("%s !skel->maps.accum_readings", __func__);
+                       continue;
+               }
+
                reading_map_fd = bpf_map__fd(skel->maps.accum_readings);

                err = bpf_map_lookup_elem(reading_map_fd, &key, values);
                if (err) {
-                       fprintf(stderr, "failed to read value\n");
+                       pr_err("failed to read value\n");
                        return err;
                }

@@ -240,6 +259,8 @@ static int bpf_program_profiler__install_pe(struct evsel *evsel, int cpu,

        list_for_each_entry(counter, &evsel->bpf_counter_list, list) {
                skel = counter->skel;
+               pr_debug("%s counter = %lx\n", __func__, (unsigned long)counter);
+               pr_debug("%s skel = %lx\n", __func__, (unsigned long)skel);
                ret = bpf_map_update_elem(bpf_map__fd(skel->maps.events),
                                          &cpu, &fd, BPF_ANY);
                if (ret)