[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <33778f26-8381-8607-4e14-d59c6e5f320c@fb.com>
Date: Tue, 12 May 2020 18:10:32 -0700
From: Yonghong Song <yhs@...com>
To: Andrii Nakryiko <andriin@...com>, <bpf@...r.kernel.org>,
<netdev@...r.kernel.org>, <ast@...com>, <daniel@...earbox.net>
CC: <andrii.nakryiko@...il.com>, <kernel-team@...com>,
John Fastabend <john.fastabend@...il.com>
Subject: Re: [PATCH v3 bpf-next 3/4] selftest/bpf: fmod_ret prog and implement
test_overhead as part of bench
On 5/12/20 12:24 PM, Andrii Nakryiko wrote:
> Add fmod_ret BPF program to existing test_overhead selftest. Also re-implement
> user-space benchmarking part into benchmark runner to compare results. Results
> with ./bench are consistently somewhat lower than test_overhead's, but relative
> performance of various types of BPF programs stay consisten (e.g., kretprobe is
> noticeably slower). This slowdown seems to be coming from the fact that
> test_overhead is single-threaded, while benchmark always spins off at least
> one thread for producer. This has been confirmed by hacking multi-threaded
> test_overhead variant and also single-threaded bench variant. Resutls are
> below. run_bench_rename.sh script from benchs/ subdirectory was used to
> produce results for ./bench.
>
> Single-threaded implementations
> ===============================
>
> /* bench: single-threaded, atomics */
> base : 4.622 ± 0.049M/s
> kprobe : 3.673 ± 0.052M/s
> kretprobe : 2.625 ± 0.052M/s
> rawtp : 4.369 ± 0.089M/s
> fentry : 4.201 ± 0.558M/s
> fexit : 4.309 ± 0.148M/s
> fmodret : 4.314 ± 0.203M/s
>
> /* selftest: single-threaded, no atomics */
> task_rename base 4555K events per sec
> task_rename kprobe 3643K events per sec
> task_rename kretprobe 2506K events per sec
> task_rename raw_tp 4303K events per sec
> task_rename fentry 4307K events per sec
> task_rename fexit 4010K events per sec
> task_rename fmod_ret 3984K events per sec
>
> Multi-threaded implementations
> ==============================
>
> /* bench: multi-threaded w/ atomics */
> base : 3.910 ± 0.023M/s
> kprobe : 3.048 ± 0.037M/s
> kretprobe : 2.300 ± 0.015M/s
> rawtp : 3.687 ± 0.034M/s
> fentry : 3.740 ± 0.087M/s
> fexit : 3.510 ± 0.009M/s
> fmodret : 3.485 ± 0.050M/s
>
> /* selftest: multi-threaded w/ atomics */
> task_rename base 3872K events per sec
> task_rename kprobe 3068K events per sec
> task_rename kretprobe 2350K events per sec
> task_rename raw_tp 3731K events per sec
> task_rename fentry 3639K events per sec
> task_rename fexit 3558K events per sec
> task_rename fmod_ret 3511K events per sec
>
> /* selftest: multi-threaded, no atomics */
> task_rename base 3945K events per sec
> task_rename kprobe 3298K events per sec
> task_rename kretprobe 2451K events per sec
> task_rename raw_tp 3718K events per sec
> task_rename fentry 3782K events per sec
> task_rename fexit 3543K events per sec
> task_rename fmod_ret 3526K events per sec
>
> Note that the fact that ./bench benchmark always uses atomic increments for
> counting, while test_overhead doesn't, doesn't influence test results all that
> much.
>
> Acked-by: John Fastabend <john.fastabend@...il.com>
> Signed-off-by: Andrii Nakryiko <andriin@...com>
Acked-by: Yonghong Song <yhs@...com>
Powered by blists - more mailing lists