linux-kernel - Re: [PATCH] perf-stat: introduce bperf, share hardware PMCs with BPF


Message-ID: <B7934C3F-4414-45AA-BE39-8FE3C64B7E7D@fb.com>
Date:   Fri, 12 Mar 2021 16:09:53 +0000
From:   Song Liu <songliubraving@...com>
To:     Jiri Olsa <jolsa@...hat.com>
CC:     linux-kernel <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "acme@...nel.org" <acme@...nel.org>,
        "acme@...hat.com" <acme@...hat.com>,
        "namhyung@...nel.org" <namhyung@...nel.org>,
        "jolsa@...nel.org" <jolsa@...nel.org>
Subject: Re: [PATCH] perf-stat: introduce bperf, share hardware PMCs with BPF



> On Mar 12, 2021, at 7:45 AM, Song Liu <songliubraving@...com> wrote:
> 
> 
> 
>> On Mar 12, 2021, at 4:12 AM, Jiri Olsa <jolsa@...hat.com> wrote:
>> 
>> On Thu, Mar 11, 2021 at 06:02:57PM -0800, Song Liu wrote:
>>> perf uses performance monitoring counters (PMCs) to monitor system
>>> performance. The PMCs are limited hardware resources. For example,
>>> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per CPU.
>>> 
>>> Modern data center systems use these PMCs in many different ways:
>>> system level monitoring, (maybe nested) container level monitoring, per
>>> process monitoring, profiling (in sample mode), etc. In some cases,
>>> there are more active perf_events than available hardware PMCs. To allow
>>> all perf_events to have a chance to run, it is necessary to do expensive
>>> time multiplexing of events.
>>> 
>>> On the other hand, many monitoring tools count the common metrics (cycles,
>>> instructions). It is a waste to have multiple tools create multiple
>>> perf_events of "cycles" and occupy multiple PMCs.
>>> 
>>> bperf tries to reduce such waste by allowing multiple perf_events of
>>> "cycles" or "instructions" (at different scopes) to share PMCs. Instead
>>> of having each perf-stat session read its own perf_events, bperf uses
>>> BPF programs to read the perf_events and aggregate readings to BPF maps.
>>> Then, the perf-stat session(s) read the values from these BPF maps.
>>> 
>>> Please refer to the comment before the definition of bperf_ops for the
>>> description of bperf architecture.
>>> 
>>> bperf is off by default. To enable it, pass the --use-bpf option to
>>> perf-stat. bperf uses a BPF hashmap to share information about BPF
>>> programs and maps used by bperf. This map is pinned to bpffs. The default
>>> path is /sys/fs/bpf/bperf_attr_map. The user can change the path with the
>>> --attr-map option.
>> 
>> nice, I recall the presentation about that and was wondering
>> when this will come up ;-)
> 
> Progress has been slower than I expected, but I finished some dependencies
> of this in the last year:
> 
>  1. BPF_PROG_TEST_RUN for raw_tp event;
>  2. perf-stat -b, which introduced skeleton and bpf_counter;
>  3. BPF task local storage. I didn't use it in this version, but it could
>     help optimize bperf in the future.
> 
>> 
>>> 
>>> ---
>>> Known limitations:
>>> 1. Does not support per-cgroup events;
>>> 2. Does not support monitoring of BPF programs (perf-stat -b);
>>> 3. Does not support event groups.
>>> 
>>> The following commands have been tested:
>>> 
>>>  perf stat --use-bpf -e cycles -a
>>>  perf stat --use-bpf -e cycles -C 1,3,4
>>>  perf stat --use-bpf -e cycles -p 123
>>>  perf stat --use-bpf -e cycles -t 100,101
>> 
>> I assume the output is the same as standard perf?

Btw, please give it a try. :) 

It worked pretty well in my tests. If it doesn't work for some combination 
of options, please let me know. 

Thanks,
Song

> 
> Yes, the output is identical to the output without the --use-bpf option.
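
The sharing idea in the quoted patch description (read one counter, aggregate the readings into maps, let each perf-stat session read only the map entries for its scope) can be illustrated with a small toy model. This is only a sketch and not the bperf implementation: the class and method names are hypothetical, plain Python containers stand in for BPF maps, and attributing the delta at each context switch is an assumption made for illustration, since the message does not say what triggers the BPF programs.

```python
from collections import defaultdict


class SharedCounterModel:
    """Toy model (hypothetical): one counter per CPU, shared by all readers."""

    def __init__(self, num_cpus):
        self.last = [0] * num_cpus        # last raw counter reading per CPU
        self.per_cpu = [0] * num_cpus     # stand-in for a per-CPU BPF map
        self.per_task = defaultdict(int)  # stand-in for a per-task BPF map

    def on_switch(self, cpu, prev_tid, raw):
        """Attribute the counter delta since the last reading on this CPU
        to the CPU itself and to the task that just ran (assumed trigger)."""
        delta = raw - self.last[cpu]
        self.last[cpu] = raw
        self.per_cpu[cpu] += delta
        self.per_task[prev_tid] += delta

    def read_system(self):
        """Scope of a system-wide session (like -a)."""
        return sum(self.per_cpu)

    def read_cpus(self, cpus):
        """Scope of a per-CPU session (like -C)."""
        return sum(self.per_cpu[c] for c in cpus)

    def read_tasks(self, tids):
        """Scope of a per-task session (like -p or -t)."""
        return sum(self.per_task.get(t, 0) for t in tids)


# Made-up readings: (cpu, task that just ran, raw counter value on that CPU).
model = SharedCounterModel(num_cpus=2)
for cpu, tid, raw in [(0, 100, 1000), (1, 101, 400), (0, 123, 1500), (1, 100, 900)]:
    model.on_switch(cpu, tid, raw)

print(model.read_system(), model.read_cpus([1]), model.read_tasks([100, 101]))
```

In this model, every session at every scope is served from the same per-CPU readings, so adding another session costs a map lookup rather than another hardware counter.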