Message-ID: <20170401224255.4f8780f1@redhat.com>
Date: Sat, 1 Apr 2017 22:42:55 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Alexei Starovoitov <ast@...com>
Cc: "David S . Miller" <davem@...emloft.net>,
Daniel Borkmann <daniel@...earbox.net>,
Wang Nan <wangnan0@...wei.com>,
Martin KaFai Lau <kafai@...com>, <netdev@...r.kernel.org>,
<kernel-team@...com>, brouer@...hat.com
Subject: Re: [PATCH v2 net-next 1/6] bpf: introduce BPF_PROG_TEST_RUN command
On Sat, 1 Apr 2017 08:45:01 -0700
Alexei Starovoitov <ast@...com> wrote:
> On 4/1/17 12:14 AM, Jesper Dangaard Brouer wrote:
> > On Thu, 30 Mar 2017 21:45:38 -0700
> > Alexei Starovoitov <ast@...com> wrote:
> >
> >> +static u32 bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *time)
> >> +{
> >> +	u64 time_start, time_spent = 0;
> >> +	u32 ret = 0, i;
> >> +
> >> +	if (!repeat)
> >> +		repeat = 1;
> >> +	time_start = ktime_get_ns();
> >
> > I've found that it is useful to record CPU cycles, as they are better
> > for comparing across CPUs. The nanosec time measurement varies too
> > much between CPUs and GHz. I do use nanosec measurements myself a
> > lot, but that is mostly because they are easier to relate to pps
> > rates. For eBPF code execution I think a cycles cost count is more
> > useful?
>
> for micro-benchmarking of an instruction or small primitives
> like spin_lock and irq_save/restore, yes. Cycles are more interesting
> to look at. Here it's the whole program which in case of networking
> likely does at least a few map lookups.
> Also, this duration field is more of a sanity test than an actual metric.
Okay, fair enough if it is only meant as a sanity metric.
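Just to spell out what I meant: a cycles-based variant of the
measurement loop could look roughly like this (untested sketch; I use
the generic BPF_PROG_RUN() dispatch directly for illustration, and
get_cycles() can return 0 on arches without support, so a fallback
would be needed):

	u64 c_start, c_spent;

	c_start = get_cycles();
	for (i = 0; i < repeat; i++)
		ret = BPF_PROG_RUN(prog, ctx);
	c_spent = get_cycles() - c_start;
	do_div(c_spent, repeat);	/* avg cycles per program run */
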
> > I've been using tsc[1] (rdtsc) to get the CPU cycles; I believe
> > get_cycles() is the more generic call, which has an arch-specific
> > implementation (but can return 0 if the arch has no support).
> >
> > The best solution would be to use the perf infrastructure and PMU
> > counters to get both PMU cycles and instructions, as that also tells
> > you about pipeline efficiency, like instructions per cycle. I only
> > got this partly working in [1][2].
>
> To use get_cycles() or perf_event_create_kernel_counter(), the current
> simple loop would have to become a kthread pinned to a cpu, and so on.
> IMO it's overkill.
> The only reason 'duration' is reported is as a sanity check against
> user space measurements.
> What this command allows us to do is:
> $ time ./my_bpf_benchmark
> The reported time should match the kernel-reported 'duration'.
> The tiny difference will come from resched. That's the sanity part.
> Now we can also do
> $ perf record ./my_bpf_benchmark
Makes perfect sense to handle it this way.
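For completeness, an invocation from user space could look roughly like
this (my sketch; the wrapper name is mine, and the attr field names
follow the UAPI this series adds):

	#include <linux/bpf.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Run prog_fd 'repeat' times on 'data' and read back the
	 * kernel-reported retval and duration (avg ns per run).
	 */
	static int prog_test_run(int prog_fd, void *data, __u32 size,
				 __u32 repeat, __u32 *retval, __u32 *duration)
	{
		union bpf_attr attr;
		int err;

		memset(&attr, 0, sizeof(attr));
		attr.test.prog_fd      = prog_fd;
		attr.test.data_in      = (unsigned long)data;
		attr.test.data_size_in = size;
		attr.test.repeat       = repeat;

		err = syscall(__NR_bpf, BPF_PROG_TEST_RUN, &attr, sizeof(attr));
		if (!err) {
			*retval   = attr.test.retval;
			*duration = attr.test.duration;
		}
		return err;
	}
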
> and get all perf goodness for free without adding any kernel code.
> I want this test_run command to stay execution-only. All PMU and
> performance metrics should stay on the perf side.
> In the case of performance optimization of bpf programs, we're trying
> to improve performance by changing the way the program is written,
> hence we need perf to point out which line of C code is costly.
> The second case is improving performance by changing the JIT, map
> implementations and so on. Here we also want the full power of the
> perf tool.
>
> Unfortunately there is an issue with perf today: as soon as
> my_bpf_benchmark exits, the bpf prog is unloaded and its ksym is gone,
> so 'perf report' cannot associate addresses back to source code.
> We discussed a solution with Arnaldo, so that's orthogonal work in
> progress, which is needed regardless of this test_run command.
Yes, that is rather unfortunate. Good to hear there is work in this area.
I've started using:
 $ sysctl net/core/bpf_jit_kallsyms=1
and adding --kallsyms=/proc/kallsyms to perf report, which is helpful.
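That is, while the prog is still loaded (given the ksym lifetime issue
you mention above), the workflow becomes something like:

 $ sysctl net/core/bpf_jit_kallsyms=1
 $ perf record ./my_bpf_benchmark
 $ perf report --kallsyms=/proc/kallsyms
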
> User space can also pin itself to a cpu, instead of asking the kernel
> to do it, and run the same program on multiple cpus in parallel,
> testing interaction between concurrent map accesses and so on.
> So by keeping the test_run command an execution-only primitive, we
> allow user space to do all the fancy tricks and measurements.
Sounds good to me! :-)
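(For reference, pinning from user space is only a few lines; a minimal
sketch using sched_setaffinity():)

	#define _GNU_SOURCE
	#include <sched.h>

	/* Pin the calling thread to 'cpu' before invoking
	 * BPF_PROG_TEST_RUN, as suggested above.
	 */
	static int pin_to_cpu(int cpu)
	{
		cpu_set_t set;

		CPU_ZERO(&set);
		CPU_SET(cpu, &set);
		return sched_setaffinity(0, sizeof(set), &set);
	}
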
Acked-by: Jesper Dangaard Brouer <brouer@...hat.com>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer