[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKH8qBsuea-HbfDD4AB0sMgT2Gn-0P3xw0CdaTcoytRAGLa4zg@mail.gmail.com>
Date: Fri, 8 Jan 2021 11:26:47 -0800
From: Stanislav Fomichev <sdf@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>,
Brian Vazquez <brianvv@...gle.com>
Subject: Re: [PATCH bpf-next v5 1/3] bpf: remove extra lock_sock for TCP_ZEROCOPY_RECEIVE
On Fri, Jan 8, 2021 at 11:23 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Fri, Jan 8, 2021 at 8:08 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> >
> > On Fri, Jan 8, 2021 at 10:41 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Fri, Jan 8, 2021 at 7:26 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > >
> > > > On Fri, Jan 8, 2021 at 10:10 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > > >
> > > > > On Fri, Jan 8, 2021 at 7:03 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > > > >
> > > > > > Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE.
> > > > > > We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom
> > > > > > call in do_tcp_getsockopt using the on-stack data. This removes
> > > > > > 3% overhead for locking/unlocking the socket.
> > > > > >
> > > > > > Without this patch:
> > > > > > 3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt
> > > > > > |
> > > > > > --3.30%--__cgroup_bpf_run_filter_getsockopt
> > > > > > |
> > > > > > --0.81%--__kmalloc
> > > > > >
> > > > > > With the patch applied:
> > > > > > 0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern
> > > > > >
> > > > >
> > > > >
> > > > > OK but we are adding yet another indirect call.
> > > > >
> > > > > Can you add a patch on top of it adding INDIRECT_CALL_INET() avoidance ?
> > > > Sure, but do you think it will bring any benefit?
> > >
> > > Sure, avoiding an indirect call might be the same gain than the
> > > lock_sock() avoidance :)
> > >
> > > > We don't have any indirect avoidance in __sys_getsockopt for the
> > > > sock->ops->getsockopt() call.
> > > > If we add it for this new bpf_bypass_getsockopt, we might as well add
> > > > it for sock->ops->getsockopt?
> > >
> > > Well, that is orthogonal to this patch.
> > > As you may know, Google kernels do have a mitigation there already and
> > > Brian may upstream it.
> > I guess my point here was that if I send it out only for bpf_bypass_getsockopt
> > it might look a bit strange because the rest of the getsockopt still
> > suffers the indirect costs.
>
>
> Each new indirect call adds a cost. If you focus on optimizing
> TCP_ZEROCOPY_RECEIVE,
> it is counter intuitive adding an expensive indirect call.
Ok, then let me resend with a mitigation in place and a note
that the rest will be added later.
> If Brian has plans to upstream the rest, maybe
> > it's better to upstream everything together with some numbers?
> > CC'ing him for his opinion.
>
> I am just saying your point about the other indirect call is already taken care.
>
> >
> > I'm happy to follow up in whatever form is best. I can also resend
> > with INDIRECT_CALL_INET2 if there are no objections in including
> > this version from the start.
> >
>
> INDIRECT_CALL_INET2 seems a strange name to me.
Any suggestion for a better name? I did play with the following:
diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index cbba9c9ab073..f7342a30284c 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -371,7 +371,9 @@ int bpf_percpu_cgroup_storage_update(struct
bpf_map *map, void *key,
int __ret = retval; \
if (cgroup_bpf_enabled(BPF_CGROUP_GETSOCKOPT)) \
if (!(sock)->sk_prot->bpf_bypass_getsockopt || \
- !(sock)->sk_prot->bpf_bypass_getsockopt(level, optname)) \
+
!INDIRECT_CALL_INET1((sock)->sk_prot->bpf_bypass_getsockopt, \
+ tcp_bpf_bypass_getsockopt, \
+ level, optname)) \
__ret = __cgroup_bpf_run_filter_getsockopt( \
sock, level, optname, optval, optlen, \
max_optlen, retval); \
diff --git a/include/linux/indirect_call_wrapper.h
b/include/linux/indirect_call_wrapper.h
index 54c02c84906a..9c3252f7e9bb 100644
--- a/include/linux/indirect_call_wrapper.h
+++ b/include/linux/indirect_call_wrapper.h
@@ -54,10 +54,13 @@
#if IS_BUILTIN(CONFIG_IPV6)
#define INDIRECT_CALL_INET(f, f2, f1, ...) \
INDIRECT_CALL_2(f, f2, f1, __VA_ARGS__)
+#define INDIRECT_CALL_INET1(f, f1, ...) INDIRECT_CALL_1(f, f1, __VA_ARGS__)
#elif IS_ENABLED(CONFIG_INET)
#define INDIRECT_CALL_INET(f, f2, f1, ...) INDIRECT_CALL_1(f, f1, __VA_ARGS__)
+#define INDIRECT_CALL_INET1(f, f1, ...) INDIRECT_CALL_1(f, f1, __VA_ARGS__)
#else
#define INDIRECT_CALL_INET(f, f2, f1, ...) f(__VA_ARGS__)
+#define INDIRECT_CALL_INET1(f, f1, ...) f(__VA_ARGS__)
#endif
#endif
Powered by blists - more mailing lists