Message-ID: <202006011106.8766849C2@keescook>
Date: Mon, 1 Jun 2020 11:16:14 -0700
From: Kees Cook <keescook@...omium.org>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: "zhujianwei (C)" <zhujianwei7@...wei.com>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"linux-security-module@...r.kernel.org"
<linux-security-module@...r.kernel.org>,
Hehuazhen <hehuazhen@...wei.com>,
Lennart Poettering <lennart@...ttering.net>,
Christian Ehrhardt <christian.ehrhardt@...onical.com>,
Zbigniew Jędrzejewski-Szmek <zbyszek@...waw.pl>,
daniel@...earbox.net, netdev@...r.kernel.org
Subject: Re: new seccomp mode aims to improve performance

On Sun, May 31, 2020 at 10:19:15AM -0700, Alexei Starovoitov wrote:
> Thank you for crafting a benchmark.
> The only thing is that it's not doing a fair comparison.
> The problem with that patch [1] is that it is using:
>
> static noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
> const struct seccomp_data *sd)
> {
> return SECCOMP_RET_ALLOW;
> }
>
> as a benchmarking function.
> The 'noinline' keyword tells the compiler to keep the body of the function, but
> the compiler is still doing full control and data flow analysis through this
> function and it is smart enough to optimize its usage in seccomp_run_filters()
> and in __seccomp_filter() because all functions are in a single .c file.
> Lots of code gets optimized away when 'f->benchmark' is on.
>
> To make it into fair comparison I've added the following patch
> on top of your [1].
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 2fdbf5ad8372..86204422e096 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -244,7 +244,7 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
> return 0;
> }
>
> -static noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
> +__weak noinline u32 __seccomp_benchmark(struct bpf_prog *prog,
> const struct seccomp_data *sd)
>
> Please take a look at 'make kernel/seccomp.s' before and after to see the difference
> __weak keyword makes.
Ah yeah, thanks. That does bring it up to the same overhead. Nice!
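
For anyone who wants to see the effect outside of a kernel build, here's a
minimal userspace sketch (the file and symbol names are made up for
illustration; this is not the patched kernel code): with plain 'noinline'
GCC still propagates the constant return value into the caller at -O2,
while adding the weak attribute forces the call and the comparison to stay.

/* weak_demo.c -- illustrative only, not kernel code.
 * Build with:  gcc -O2 -S weak_demo.c
 * then compare the generated assembly with and without the 'weak'
 * attribute on fake_filter().
 */
#include <stdio.h>

#define RET_ALLOW 0x7fff0000U	/* same value as SECCOMP_RET_ALLOW */

/* With plain noinline, GCC keeps a body for the function but can still
 * see that every call returns RET_ALLOW and fold the caller's check.
 * Marking it weak tells the compiler the definition may be replaced at
 * link time, so the call and the comparison have to stay live.
 */
__attribute__((weak, noinline))
unsigned int fake_filter(const void *sd)
{
	(void)sd;
	return RET_ALLOW;
}

int main(void)
{
	unsigned int ret = fake_filter(NULL);

	/* Without the weak attribute this branch is typically folded
	 * away at -O2; with it, the compare survives in the assembly.
	 */
	if (ret != RET_ALLOW)
		printf("denied\n");
	return 0;
}
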
> And here is what seccomp_benchmark now reports:
>
> Benchmarking 33554432 samples...
> 22.618269641 - 15.030812794 = 7587456847
> getpid native: 226 ns
> 30.792042986 - 22.619048831 = 8172994155
> getpid RET_ALLOW 1 filter: 243 ns
> 39.451435038 - 30.792836778 = 8658598260
> getpid RET_ALLOW 2 filters: 258 ns
> 47.616011529 - 39.452190830 = 8163820699
> getpid BPF-less allow: 243 ns
> Estimated total seccomp overhead for 1 filter: 17 ns
> Estimated total seccomp overhead for 2 filters: 32 ns
> Estimated seccomp per-filter overhead: 15 ns
> Estimated seccomp entry overhead: 2 ns
> Estimated BPF overhead per filter: 0 ns
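
(For anyone following along, these estimates look like simple differences of
the per-call times above: 243 - 226 = 17 ns of total overhead for one filter,
258 - 226 = 32 ns for two, 258 - 243 = 15 ns per additional filter,
17 - 15 = 2 ns of entry overhead, and since the BPF-less allow matches the
one-filter time, ~0 ns of BPF overhead per filter.)
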
>
> [...]
>
> > So, with the layered nature of seccomp filters there's a reasonable gain
> > to be seen for an O(1) bitmap lookup to skip running even a single filter,
> > even for the fastest BPF mode.
>
> This is not true.
> The O(1) bitmap implemented as kernel C code will have exactly the same speed
> as O(1) bitmap implemented as eBPF program.
Yes, that'd be true if it were the first (and only) filter. What I'm
trying to provide is a mechanism to speed up syscalls for all attached
filters (i.e. create a seccomp fast-path). The reality of seccomp usage
is that it's very layered: systemd sets some (or many!), then the
container runtime sets some, then the process itself might set some.
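
To make the shape of that fast-path concrete, here's a rough,
self-contained sketch (the structure and function names are made up for
illustration; this is not the actual kernel implementation): if every
attached filter is known to always allow a given syscall number, a single
bit test can skip running the whole filter stack.

/* allow_bitmap_sketch.c -- hypothetical illustration of the seccomp
 * fast-path idea, not the real kernel code.
 * Build: gcc -O2 -o allow_bitmap_sketch allow_bitmap_sketch.c
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NR_SYSCALLS		512
#define SECCOMP_RET_ALLOW	0x7fff0000U

/* Bit N set => every attached filter is known to allow syscall N. */
static uint64_t allow_bitmap[NR_SYSCALLS / 64];

static void mark_always_allowed(unsigned int nr)
{
	allow_bitmap[nr / 64] |= 1ULL << (nr % 64);
}

static bool bitmap_allows(unsigned int nr)
{
	if (nr >= NR_SYSCALLS)
		return false;
	return allow_bitmap[nr / 64] & (1ULL << (nr % 64));
}

/* Stand-in for running the whole stack of attached BPF filters. */
static unsigned int run_all_filters_slowpath(unsigned int nr)
{
	(void)nr;
	return SECCOMP_RET_ALLOW;
}

static unsigned int seccomp_run_sketch(unsigned int nr)
{
	/* O(1) fast path: one bit test instead of N filter runs. */
	if (bitmap_allows(nr))
		return SECCOMP_RET_ALLOW;

	return run_all_filters_slowpath(nr);
}

int main(void)
{
	mark_always_allowed(39);	/* e.g. getpid on x86-64 */
	printf("getpid: %#x\n", seccomp_run_sketch(39));
	printf("open:   %#x\n", seccomp_run_sketch(2));
	return 0;
}

The key property is that a bit is only set when every attached filter's
verdict for that syscall is provably SECCOMP_RET_ALLOW, so stacking more
filters doesn't add to the fast-path cost.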
--
Kees Cook