lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQKE6y9h2fwX6OS837v-Uf+aBXnT_JXiN_bbo2gitZQ3tA@mail.gmail.com>
Date:   Tue, 18 Aug 2020 11:19:03 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Jakub Sitnicki <jakub@...udflare.com>
Cc:     bpf <bpf@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        kernel-team <kernel-team@...udflare.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Andrii Nakryiko <andriin@...com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Marek Majkowski <marek@...udflare.com>,
        Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>
Subject: Re: BPF sk_lookup v5 - TCP SYN and UDP 0-len flood benchmarks

On Tue, Aug 18, 2020 at 8:49 AM Jakub Sitnicki <jakub@...udflare.com> wrote:
>          :                      rcu_read_lock();
>          :                      run_array = rcu_dereference(net->bpf.run_array[NETNS_BPF_SK_LOOKUP]);
>     0.01 :   ffffffff817f8624:       mov    0xd68(%r12),%rsi
>          :                      if (run_array) {
>     0.00 :   ffffffff817f862c:       test   %rsi,%rsi
>     0.00 :   ffffffff817f862f:       je     ffffffff817f87a9 <__udp4_lib_lookup+0x2c9>
>          :                      struct bpf_sk_lookup_kern ctx = {
>     1.05 :   ffffffff817f8635:       xor    %eax,%eax
>     0.00 :   ffffffff817f8637:       mov    $0x6,%ecx
>     0.01 :   ffffffff817f863c:       movl   $0x110002,0x40(%rsp)
>     0.00 :   ffffffff817f8644:       lea    0x48(%rsp),%rdi
>    18.76 :   ffffffff817f8649:       rep stos %rax,%es:(%rdi)
>     1.12 :   ffffffff817f864c:       mov    0xc(%rsp),%eax
>     0.00 :   ffffffff817f8650:       mov    %ebp,0x48(%rsp)
>     0.00 :   ffffffff817f8654:       mov    %eax,0x44(%rsp)
>     0.00 :   ffffffff817f8658:       movzwl 0x10(%rsp),%eax
>     1.21 :   ffffffff817f865d:       mov    %ax,0x60(%rsp)
>     0.00 :   ffffffff817f8662:       movzwl 0x20(%rsp),%eax
>     0.00 :   ffffffff817f8667:       mov    %ax,0x62(%rsp)
>          :                      .sport          = sport,
>          :                      .dport          = dport,
>          :                      };

Such heavy hit to zero init 56-byte structure is surprising.
There are two 4-byte holes in this struct. You can try to pack it and
make sure that 'rep stoq' is used instead of 'rep stos' (8 byte at a time vs 4).

Long term we should probably stop doing *_kern style of ctx passing
into bpf progs.
We have BTF, CO-RE and freplace now. This old style of memset *_kern and manual
ctx conversion has performance implications and annoying copy-paste of ctx
conversion routines.
For this particular case instead of introducing udp4_lookup_run_bpf()
and copying registers into stack we could have used freplace of
udp4_lib_lookup2.
More verifier work needed, of course.
My main point that existing approach "lets prep args for bpf prog to
run" that is used
pretty much in every bpf hook is no longer necessary.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ