Message-ID: <cbbd5f73-f20f-42fc-8b21-8d6f97d52cf9@iogearbox.net>
Date: Sat, 23 Nov 2019 00:47:25 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Alexei Starovoitov <ast@...nel.org>, davem@...emloft.net
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH bpf-next] selftests/bpf: Add BPF trampoline performance test
On 11/22/19 2:15 AM, Alexei Starovoitov wrote:
> Add a test that benchmarks different ways of attaching a BPF program to a kernel function.
> Here are the results for a 2.4GHz x86 CPU on a kernel without mitigations:
> $ ./test_progs -n 49 -v|grep events
> task_rename base 2743K events per sec
> task_rename kprobe 2419K events per sec
> task_rename kretprobe 1876K events per sec
> task_rename raw_tp 2578K events per sec
> task_rename fentry 2710K events per sec
> task_rename fexit 2685K events per sec
>
> On a kernel with retpoline:
> $ ./test_progs -n 49 -v|grep events
> task_rename base 2401K events per sec
> task_rename kprobe 1930K events per sec
> task_rename kretprobe 1485K events per sec
> task_rename raw_tp 2053K events per sec
> task_rename fentry 2351K events per sec
> task_rename fexit 2185K events per sec
>
> All 5 approaches:
> - kprobe/kretprobe in __set_task_comm()
> - raw tracepoint in trace_task_rename()
> - fentry/fexit in __set_task_comm()
> are roughly equivalent.
>
> __set_task_comm() by itself is quite fast, so any extra instructions add up.
> Until BPF trampoline was introduced, the fastest mechanism was raw tracepoint.
> kprobe via ftrace was second best. kretprobe is slow due to the trap. New
> fentry/fexit methods via BPF trampoline are clearly the fastest and the
> difference is more pronounced with retpoline on, since BPF trampoline doesn't
> use indirect jumps.
>
> Signed-off-by: Alexei Starovoitov <ast@...nel.org>
Applied, thanks!
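
For anyone comparing the numbers in the quoted commit message, a rough way to read them is to convert events per second into nanoseconds per event and subtract the base case. The sketch below is not part of the patch. It only copies the figures from the quoted output, the names RESULTS, ns_per_event and overhead_ns are made up for illustration, and it assumes the "K" suffix means thousands of events.

```python
# Rough per-event overhead derived from the quoted test_progs output.
# Assumption: "2743K events per sec" means 2,743,000 events per second.
NS_PER_SEC = 1_000_000_000

# Figures copied from the quoted commit message (thousands of events/sec).
RESULTS = {
    "no mitigations": {
        "base": 2743, "kprobe": 2419, "kretprobe": 1876,
        "raw_tp": 2578, "fentry": 2710, "fexit": 2685,
    },
    "retpoline": {
        "base": 2401, "kprobe": 1930, "kretprobe": 1485,
        "raw_tp": 2053, "fentry": 2351, "fexit": 2185,
    },
}


def ns_per_event(kevents_per_sec):
    """Average time per event in nanoseconds for a rate given in K events/sec."""
    return NS_PER_SEC / (kevents_per_sec * 1000)


def overhead_ns(results):
    """Extra nanoseconds per event for each attach type, relative to 'base'."""
    base = ns_per_event(results["base"])
    return {
        name: ns_per_event(rate) - base
        for name, rate in results.items()
        if name != "base"
    }


for config, results in RESULTS.items():
    print(config)
    per_type = overhead_ns(results)
    # Cheapest attach type first.
    for name in sorted(per_type, key=per_type.get):
        print(f"  {name:10s} +{per_type[name]:6.1f} ns/event")
```

Read this way, the ordering is the same in both configurations: fentry, fexit, raw_tp, kprobe, kretprobe, from cheapest to most expensive. The absolute values depend on the machine and are only as precise as the rounded rates in the output.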