[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200215004500.gs3ylstfo3aksfbp@ast-mbp>
Date: Fri, 14 Feb 2020 16:45:02 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Song Liu <songliubraving@...com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, kernel-team@...com,
ast@...nel.org, daniel@...earbox.net
Subject: Re: [RFC bpf-next 3/4] bpftool: introduce "prog profile" command
On Thu, Feb 13, 2020 at 01:01:14PM -0800, Song Liu wrote:
> With fentry/fexit programs, it is possible to profile BPF program with
> hardware counters. Introduce bpftool "prog profile", which measures key
> metrics of a BPF program.
>
> bpftool prog profile command creates per-cpu perf events. Then it attaches
> fentry/fexit programs to the target BPF program. The fentry program saves
> perf event value to a map. The fexit program reads the perf event again,
> and calculates the difference, which is the instructions/cycles used by
> the target program.
>
> Example input and output:
>
> ./bpftool prog profile 20 id 810 cycles instructions
> cycles: duration 20 run_cnt 1368 miss_cnt 665
> counter 503377 enabled 668202 running 351857
> instructions: duration 20 run_cnt 1368 miss_cnt 707
> counter 398625 enabled 502330 running 272014
>
> This command measures cycles and instructions for BPF program with id
> 810 for 20 seconds. The program has triggered 1368 times. cycles was not
> measured in 665 out of these runs, because of perf event multiplexing
> (some perf commands are running in the background). In these runs, the BPF
> program consumed 503377 cycles. The perf_event enabled and running time
> are 668202 and 351857 respectively.
if (diff.enabled > diff.running) increment miss_cnt.
Why show this to users?
I think 'miss_cnt' the users will interpret as data is bogus,
but it only means that the counter was multiplexed.
The data is still accurate, no?
This condition will probably be hit fairly often, no?
> tools/bpf/bpftool/profiler.skel.h | 820 ++++++++++++++++++++++
I think bpftool needs to be build twice to avoid this.
Could you change the output format to be 'perf stat' like:
55,766.51 msec task-clock # 0.996 CPUs utilized
4,891 context-switches # 0.088 K/sec
31 cpu-migrations # 0.001 K/sec
1,806,065 page-faults # 0.032 M/sec
166,819,295,451 cycles # 2.991 GHz (50.12%)
251,115,795,764 instructions # 1.51 insn per cycle (50.10%)
Also printing 'duration' is unnecessary. The user specified it at the command
line and it doesn't need to be reported back to the user.
Can you also make it optional? Until users Ctrl-C's bpftool ?
So it may look like:
$ ./bpftool prog profile id 810 cycles instructions
1,368 run_cnt
503,377 cycles
398,625 instructions # 0.79 insn per cycle
Computing additional things like 'insn per cycle' do help humans to
pay attention the issue. Like <1 ipc is not great and the next step
would be to profile this program for cache misses.
Powered by blists - more mailing lists