netdev - Re: [PATCH bpf-next v5 1/3] bpf: remove extra lock_sock for TCP_ZEROCOPY

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKH8qBuGi_7eFpX0y+HdJznMvUxZsrJtdz2O5P4WK-4H_8s8Xw@mail.gmail.com>
Date:   Fri, 8 Jan 2021 11:08:46 -0800
From:   Stanislav Fomichev <sdf@...gle.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>,
        Brian Vazquez <brianvv@...gle.com>
Subject: Re: [PATCH bpf-next v5 1/3] bpf: remove extra lock_sock for TCP_ZEROCOPY_RECEIVE

On Fri, Jan 8, 2021 at 10:41 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Fri, Jan 8, 2021 at 7:26 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> >
> > On Fri, Jan 8, 2021 at 10:10 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Fri, Jan 8, 2021 at 7:03 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > >
> > > > Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE.
> > > > We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom
> > > > call in do_tcp_getsockopt using the on-stack data. This removes
> > > > 3% overhead for locking/unlocking the socket.
> > > >
> > > > Without this patch:
> > > >      3.38%     0.07%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt
> > > >             |
> > > >              --3.30%--__cgroup_bpf_run_filter_getsockopt
> > > >                        |
> > > >                         --0.81%--__kmalloc
> > > >
> > > > With the patch applied:
> > > >      0.52%     0.12%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt_kern
> > > >
> > >
> > >
> > > OK but we are adding yet another indirect call.
> > >
> > > Can you add a patch on top of it adding INDIRECT_CALL_INET() avoidance ?
> > Sure, but do you think it will bring any benefit?
>
> Sure, avoiding an indirect call might be the same gain than the
> lock_sock() avoidance :)
>
> > We don't have any indirect avoidance in __sys_getsockopt for the
> > sock->ops->getsockopt() call.
> > If we add it for this new bpf_bypass_getsockopt, we might as well add
> > it for sock->ops->getsockopt?
>
> Well, that is orthogonal to this patch.
> As you may know, Google kernels do have a mitigation there already and
> Brian may upstream it.
I guess my point here was that if I send it out only for bpf_bypass_getsockopt
it might look a bit strange because the rest of the getsockopt still
suffers the indirect costs. If Brian has plans to upstream the rest, maybe
it's better to upstream everything together with some numbers?
CC'ing him for his opinion.

I'm happy to follow up in whatever form is best. I can also resend
with INDIRECT_CALL_INET2 if there are no objections in including
this version from the start.

> > And we need some new INDIRECT_CALL_INET2 such that f2 doesn't get
> > disabled when ipv6 is disabled :-/
>
> The same handler is called for IPv4 and IPv6, so you need the variant
> with only one known handler (tcp_bpf_bypass_getsockopt)
>
> Only it needs to make sure CONFIG_INET is enabled.