Message-ID: <69dffc45-a6e1-d422-0a78-9553ed87ab15@fb.com>
Date: Sat, 9 May 2020 10:43:43 -0700
From: Yonghong Song <yhs@...com>
To: Andrii Nakryiko <andriin@...com>, <bpf@...r.kernel.org>,
<netdev@...r.kernel.org>, <ast@...com>, <daniel@...earbox.net>
CC: <andrii.nakryiko@...il.com>, <kernel-team@...com>,
John Fastabend <john.fastabend@...il.com>
Subject: Re: [PATCH v2 bpf-next 3/3] selftest/bpf: add BPF triggering benchmark
On 5/8/20 4:20 PM, Andrii Nakryiko wrote:
> It is sometimes desirable to be able to trigger a BPF program from user-space
> with minimal overhead. sys_enter would seem to be a good candidate, yet in
Maybe "with minimal external noise" instead? Typically, "overhead" refers to
the overhead of the test infrastructure itself.
> a lot of cases there will be a lot of noise from syscalls triggered by other
> processes on the system. So while searching for a low-overhead alternative,
> I've stumbled upon the getpgid() syscall, which seems to be specific enough
> to not suffer from accidental syscalls by other apps.
>
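For readers following along, the user-space side of such a trigger can be sketched roughly as below. This is only an illustration under my own assumptions, not the code from the patch: the helper name getpgid_rate_mps() and the timing via clock_gettime() are made up for this example, and the real benchmark lives in benchs/bench_trigger.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): call getpgid(0) `iters` times
 * and return the observed rate in millions of calls per second.
 */
static double getpgid_rate_mps(long iters)
{
	struct timespec start, end;
	double secs;
	long i;

	assert(iters > 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iters; i++)
		(void)getpgid(0); /* the syscall a BPF program would be attached to */
	clock_gettime(CLOCK_MONOTONIC, &end);

	secs = (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;

	/* Guard against a zero interval on a very coarse clock. */
	return secs > 0 ? (double)iters / secs / 1e6 : 0.0;
}
```

Running this once with no BPF program attached and once with one attached to the syscall would give a rough baseline vs. attached comparison, though a single-threaded loop like this will not necessarily reproduce the numbers quoted in this thread.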
> This set of benchmarks compares tp, raw_tp w/ filtering by syscall ID, kprobe,
> fentry and fmod_ret returning an error (so that the syscall is not
> executed), to determine the lowest-overhead way. Here are the results on my
> machine (using the benchs/run_bench_trigger.sh script):
>
> base    : 9.200 ± 0.319M/s
> tp      : 6.690 ± 0.125M/s
> rawtp   : 8.571 ± 0.214M/s
> kprobe  : 6.431 ± 0.048M/s
> fentry  : 8.955 ± 0.241M/s
> fmodret : 8.903 ± 0.135M/s
The relative ranking of the different approaches is still similar to patch
#2. But this patch reinforces that a benchmark really needs to reduce
noise to reach the highest numbers.
>
> So it seems like fmodret doesn't give much benefit for such a lightweight
> syscall. Raw tracepoint is pretty decent despite additional filtering logic,
> but it will be called for any other syscall in the system, which rules it out.
> Fentry, though, seems to be adding the least amount of overhead and achieves
> 97.3% of the performance of the baseline no-BPF-attached syscall.
>
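The 97.3% figure follows directly from the table (8.955 / 9.200). As a small sketch of that arithmetic for all attach types, assuming only the mean throughputs quoted above (the struct and function names are invented for this illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Mean throughputs in M/s, copied from the results quoted in this thread. */
struct bench_result {
	const char *name;
	double rate;
};

static const struct bench_result results[] = {
	{ "base",    9.200 },
	{ "tp",      6.690 },
	{ "rawtp",   8.571 },
	{ "kprobe",  6.431 },
	{ "fentry",  8.955 },
	{ "fmodret", 8.903 },
};

/* Hypothetical helper: throughput as a percentage of the baseline rate. */
static double percent_of_baseline(double rate, double baseline)
{
	assert(baseline > 0.0);
	return 100.0 * rate / baseline;
}

/* Print each attach type relative to results[0], the no-BPF baseline. */
static void print_relative(void)
{
	size_t i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		printf("%-8s: %5.1f%%\n", results[i].name,
		       percent_of_baseline(results[i].rate, results[0].rate));
}
```

Calling print_relative() from a main() lists fentry at 97.3% of baseline, with fmodret close behind and kprobe the furthest from baseline. This uses only the means, so the ± ranges in the table are ignored.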
> Using getpgid() seems to be preferable to the set_task_comm() approach from
> test_overhead, as it's about 2.35x faster in baseline performance.
>
> Acked-by: John Fastabend <john.fastabend@...il.com>
> Signed-off-by: Andrii Nakryiko <andriin@...com>
Acked-by: Yonghong Song <yhs@...com>
> ---
> tools/testing/selftests/bpf/Makefile | 4 +-
> tools/testing/selftests/bpf/bench.c | 12 ++
> .../selftests/bpf/benchs/bench_trigger.c | 167 ++++++++++++++++++
> .../selftests/bpf/benchs/run_bench_trigger.sh | 9 +
> .../selftests/bpf/progs/trigger_bench.c | 47 +++++
> 5 files changed, 238 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/bpf/benchs/bench_trigger.c
> create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_trigger.sh
> create mode 100644 tools/testing/selftests/bpf/progs/trigger_bench.c
>
[...]