Message-ID: <CAEf4BzZLsV_MoUz4VwspzVUbJaXVn0YVsKvf=bL-WPspbw6WGA@mail.gmail.com>
Date:   Mon, 6 Dec 2021 19:01:28 -0800
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Hou Tao <houtao1@...wei.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next 4/5] selftests/bpf: add benchmark for
 bpf_strncmp() helper

On Tue, Nov 30, 2021 at 6:07 AM Hou Tao <houtao1@...wei.com> wrote:
>
> Add a benchmark that compares a home-made strncmp() in a BPF program
> with the bpf_strncmp() helper. In summary, on x86-64 the helper is more
> than 18% faster once the compared string length reaches 64, and 179%
> faster at length 4095. On arm64 the win is even bigger: at least 33%
> at length 64 and above, and more than 600% at length 4095.
>
> The details are as follows:
>
> no-helper-X: use home-made strncmp() to compare X-sized string
> helper-Y: use bpf_strncmp() to compare Y-sized string
>
> Under x86-64:
>
> no-helper-1          3.504 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-1             3.347 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          3.357 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-8             3.307 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         3.064 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-32            3.253 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.563 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-64            3.040 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.975 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.641 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.759 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           1.574 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.329 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          0.602 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.117 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.327 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Under arm64:
>
> no-helper-1          2.806 ± 0.004M/s (drops 0.000 ± 0.000M/s)
> helper-1             2.819 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-8          2.797 ± 0.109M/s (drops 0.000 ± 0.000M/s)
> helper-8             2.786 ± 0.025M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-32         2.399 ± 0.011M/s (drops 0.000 ± 0.000M/s)
> helper-32            2.703 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-64         2.020 ± 0.015M/s (drops 0.000 ± 0.000M/s)
> helper-64            2.702 ± 0.073M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-128        1.604 ± 0.001M/s (drops 0.000 ± 0.000M/s)
> helper-128           2.516 ± 0.002M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-512        0.699 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-512           2.106 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-2048       0.215 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-2048          1.223 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>
> no-helper-4095       0.112 ± 0.000M/s (drops 0.000 ± 0.000M/s)
> helper-4095          0.796 ± 0.000M/s (drops 0.000 ± 0.000M/s)
>
> Signed-off-by: Hou Tao <houtao1@...wei.com>
> ---
>  tools/testing/selftests/bpf/Makefile          |   4 +-
>  tools/testing/selftests/bpf/bench.c           |   6 +
>  .../selftests/bpf/benchs/bench_strncmp.c      | 150 ++++++++++++++++++
>  .../selftests/bpf/benchs/run_bench_strncmp.sh |  12 ++
>  .../selftests/bpf/progs/strncmp_bench.c       |  50 ++++++
>  5 files changed, 221 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/benchs/bench_strncmp.c
>  create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_strncmp.sh
>  create mode 100644 tools/testing/selftests/bpf/progs/strncmp_bench.c
>
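
The percentages quoted in the commit message can be re-derived from the
tables above. A minimal sketch, assuming "performance win" means the
relative throughput gain of the helper over the home-made loop;
gain_pct() is a hypothetical name used only for this illustration and
is not part of the patch:

```c
/* Relative throughput gain of the helper over the home-made loop, in
 * percent. Both arguments are throughputs in M/s read from the tables.
 * Assumption: "performance win" is (helper - no_helper) / no_helper.
 */
static double gain_pct(double no_helper, double helper)
{
	return (helper - no_helper) / no_helper * 100.0;
}
```

For example, gain_pct(2.563, 3.040) is about 18.6 (x86-64, length 64)
and gain_pct(0.117, 0.327) is about 179.5 (x86-64, length 4095).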

[...]

> diff --git a/tools/testing/selftests/bpf/progs/strncmp_bench.c b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> new file mode 100644
> index 000000000000..18373a7df76e
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/strncmp_bench.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2021. Huawei Technologies Co., Ltd */
> +#include <linux/types.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +#define STRNCMP_STR_SZ 4096
> +
> +/* Will be updated by benchmark before program loading */
> +const volatile unsigned int cmp_str_len = 1;
> +const char target[STRNCMP_STR_SZ];
> +
> +long hits = 0;
> +char str[STRNCMP_STR_SZ];
> +
> +char _license[] SEC("license") = "GPL";
> +
> +static __always_inline int local_strncmp(const char *s1, unsigned int sz,
> +                                        const char *s2)
> +{
> +       int ret = 0;
> +       unsigned int i;
> +
> +       for (i = 0; i < sz; i++) {
> +               /* E.g. 0xff > 0x31 */
> +               ret = (unsigned char)s1[i] - (unsigned char)s2[i];

I'm actually not sure if it will perform subtraction in unsigned form
(and thus you'll never have a negative result) and then cast to int,
or not. Why not cast to int instead of unsigned char to be sure?
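
For reference, a minimal sketch of how standard C evaluates that
expression, assuming int can represent every unsigned char value (the
static_assert below checks this): both operands undergo integer
promotion to int before the subtraction, so the difference is signed
and can be negative. byte_diff() is a hypothetical helper written for
this illustration, not part of the patch:

```c
#include <assert.h>
#include <limits.h>

/* Assumption made explicit: every unsigned char value fits in int, so
 * unsigned char operands are promoted to (signed) int, not unsigned int.
 */
static_assert(UCHAR_MAX < INT_MAX, "unsigned char must promote to int");

/* Hypothetical helper mirroring the per-byte step of local_strncmp().
 * The casts make each byte compare as a value in 0..255; the
 * subtraction itself is then carried out on the promoted int operands.
 */
static int byte_diff(char a, char b)
{
	return (unsigned char)a - (unsigned char)b;
}
```

Under that assumption, byte_diff('1', (char)0xff) evaluates to
0x31 - 0xff = -206 and the reversed call to 206, so the sign of the
result still orders the bytes as unsigned values.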

> +               if (ret || !s1[i])
> +                       break;
> +       }
> +
> +       return ret;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_no_helper(void *ctx)
> +{
> +       if (local_strncmp(str, cmp_str_len + 1, target) < 0)
> +               __sync_add_and_fetch(&hits, 1);
> +       return 0;
> +}
> +
> +SEC("tp/syscalls/sys_enter_getpgid")
> +int strncmp_helper(void *ctx)
> +{
> +       if (bpf_strncmp(str, cmp_str_len + 1, target) < 0)
> +               __sync_add_and_fetch(&hits, 1);
> +       return 0;
> +}
> +
> --
> 2.29.2
>
