Message-ID: <b2af633d-aaae-d0c5-72f9-0688b76b4505@gmail.com>
Date: Wed, 15 Dec 2021 17:18:03 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: sdf@...gle.com
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] cgroup/bpf: fast path skb BPF filtering
On 12/15/21 16:51, sdf@...gle.com wrote:
> On 12/15, Pavel Begunkov wrote:
>> Add a per-socket fast path for when BPF skb filtering is not enabled,
>> which sheds a nice chunk of send/recv overhead for affected sockets.
>> Testing UDP with a 128-byte payload, and/or zerocopy with any payload
>> size, showed a 2-3% improvement in requests/s on the tx side using fast
>> NICs across the network, and around 4% for the dummy device. The same
>> goes for rx; it was not measured, but the numbers should be comparable.
>> In my understanding, this should affect a good share of machines; at
>> the least it includes my laptops and some servers I checked.
>
>> The core of the problem is that even though there is
>> cgroup_bpf_enabled_key guarding against __cgroup_bpf_run_filter_skb()
>> overhead, there are cases where we have several cgroups, and loading a
>> BPF program into one also makes all the others go through the slow path
>> even when they don't have any BPF attached. It's even worse because,
>> apparently, systemd or some other early init loads some BPF and so
>> triggers exactly this situation for normal networking.
>
>> Signed-off-by: Pavel Begunkov <asml.silence@...il.com>
>> ---
>
>> v2: replace bitmask approach with empty_prog_array (suggested by Martin)
>> v3: add "bpf_" prefix to empty_prog_array (Martin)
>
>> include/linux/bpf-cgroup.h | 24 +++++++++++++++++++++---
>> include/linux/bpf.h | 13 +++++++++++++
>> kernel/bpf/cgroup.c | 18 ++----------------
>> kernel/bpf/core.c | 16 ++++------------
>> 4 files changed, 40 insertions(+), 31 deletions(-)
>
>> diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
>> index 11820a430d6c..c6dacdbdf565 100644
>> --- a/include/linux/bpf-cgroup.h
>> +++ b/include/linux/bpf-cgroup.h
>> @@ -219,11 +219,28 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value);
>> int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
>> void *value, u64 flags);
>
>> +static inline bool
>> +__cgroup_bpf_prog_array_is_empty(struct cgroup_bpf *cgrp_bpf,
>> + enum cgroup_bpf_attach_type type)
>> +{
>> + struct bpf_prog_array *array = rcu_access_pointer(cgrp_bpf->effective[type]);
>> +
>> + return array == &bpf_empty_prog_array.hdr;
>> +}
>> +
>> +#define CGROUP_BPF_TYPE_ENABLED(sk, atype) \
>> +({ \
>> + struct cgroup *__cgrp = sock_cgroup_ptr(&(sk)->sk_cgrp_data); \
>> + \
>> + !__cgroup_bpf_prog_array_is_empty(&__cgrp->bpf, (atype)); \
>> +})
>> +
>> /* Wrappers for __cgroup_bpf_run_filter_skb() guarded by cgroup_bpf_enabled. */
>> #define BPF_CGROUP_RUN_PROG_INET_INGRESS(sk, skb) \
>> ({ \
>> int __ret = 0; \
>> - if (cgroup_bpf_enabled(CGROUP_INET_INGRESS)) \
>> + if (cgroup_bpf_enabled(CGROUP_INET_INGRESS) && sk && \
>> + CGROUP_BPF_TYPE_ENABLED((sk), CGROUP_INET_INGRESS)) \
>
> Why not add this __cgroup_bpf_prog_array_is_empty check to
> __cgroup_bpf_run_filter_skb itself? The result of sock_cgroup_ptr() is
> already there and you can use it. Maybe move things around if you want
> it to happen earlier.
For inlining. Just wanted to get it done right, otherwise I'll likely be
coming back to it in a few months complaining that I see measurable
overhead from the function call :)
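
[ To illustrate the inlining point with a compilable sketch: the names below
are made up, and the noinline/static inline pair only stands in for the real
out-of-line __cgroup_bpf_run_filter_skb() versus the check done in the wrapper
macro. If the emptiness check lives inside the out-of-line function, every
packet still pays for the call; doing it at the call site lets the common
"no programs attached" case return without leaving the caller. ]

#include <stdbool.h>
#include <stdio.h>

struct cg { bool has_progs; };
struct sk { struct cg *cg; };

/* Reviewer's suggestion, modelled: keep the emptiness check inside the
 * out-of-line function.  Cheap check, but the call itself is still made. */
__attribute__((noinline)) static int run_filter_outofline(struct sk *s)
{
        if (!s->cg->has_progs)
                return 0;
        /* ... run the attached programs ... */
        return 1;
}

/* What the patch does, modelled: check at the call site, so the no-programs
 * case never enters the out-of-line function at all. */
static inline int run_filter_inlined(struct sk *s)
{
        if (!s->cg->has_progs)
                return 0;
        return run_filter_outofline(s);
}

int main(void)
{
        struct cg idle = { .has_progs = false };
        struct sk s = { .cg = &idle };

        printf("out-of-line: %d, inlined: %d\n",
               run_filter_outofline(&s), run_filter_inlined(&s));
        return 0;
}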
--
Pavel Begunkov