[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181011225739.3pjhdy5jsegvr3ub@ast-mbp.dhcp.thefacebook.com>
Date: Thu, 11 Oct 2018 15:57:41 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: john.fastabend@...il.com, davejwatson@...com,
netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next 3/8] bpf, sockmap: convert to generic sk_msg
interface
On Thu, Oct 11, 2018 at 02:45:42AM +0200, Daniel Borkmann wrote:
> Add a generic sk_msg layer, and convert current sockmap and later
> kTLS over to make use of it. While sk_buff handles network packet
> representation from netdevice up to socket, sk_msg handles data
> representation from application to socket layer.
>
> This means that sk_msg framework spans across ULP users in the
> kernel, and enables features such as introspection or filtering
> of data with the help of BPF programs that operate on this data
> structure.
>
> Latter becomes in particular useful for kTLS where data encryption
> is deferred into the kernel, and as such enabling the kernel to
> perform L7 introspection and policy based on BPF for TLS connections
> where the record is being encrypted after BPF has run and came to
> a verdict. In order to get there, first step is to transform open
> coding of scatter-gather list handling into a common core framework
> that subsystems use.
>
> Joint work with John.
>
> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> Signed-off-by: John Fastabend <john.fastabend@...il.com>
> ---
> include/linux/bpf.h | 33 +-
> include/linux/bpf_types.h | 2 +-
> include/linux/filter.h | 21 -
> include/linux/skmsg.h | 371 +++++++
> include/net/tcp.h | 27 +
> kernel/bpf/Makefile | 5 -
> kernel/bpf/core.c | 2 -
> kernel/bpf/sockmap.c | 2610 ---------------------------------------------
> kernel/bpf/syscall.c | 6 +-
> net/Kconfig | 11 +
> net/core/Makefile | 2 +
> net/core/filter.c | 270 ++---
> net/core/skmsg.c | 763 +++++++++++++
> net/core/sock_map.c | 1002 +++++++++++++++++
> net/ipv4/Makefile | 1 +
> net/ipv4/tcp_bpf.c | 655 ++++++++++++
> net/strparser/Kconfig | 4 +-
> 17 files changed, 2925 insertions(+), 2860 deletions(-)
> +void sk_msg_trim(struct sock *sk, struct sk_msg *msg, int len)
> +{
> + int trim = msg->sg.size - len;
> + u32 i = msg->sg.end;
> +
> + if (trim <= 0) {
> + WARN_ON(trim < 0);
> + return;
> + }
> +
> + sk_msg_iter_var_prev(i);
> + msg->sg.size = len;
> + while (msg->sg.data[i].length &&
> + trim >= msg->sg.data[i].length) {
> + trim -= msg->sg.data[i].length;
> + sk_msg_free_elem(sk, msg, i, true);
> + sk_msg_iter_var_prev(i);
> + if (!trim)
> + goto out;
> + }
> +
> + msg->sg.data[i].length -= trim;
> + sk_mem_uncharge(sk, trim);
> +out:
> + /* If we trim data before curr pointer update copybreak and current
> + * so that any future copy operations start at new copy location.
> + * However trimed data that has not yet been used in a copy op
> + * does not require an update.
> + */
> + if (msg->sg.curr >= i) {
> + msg->sg.curr = i;//msg->sg.end;
is this a leftover of some debugging ?
I think such giant patchset is impossible to review in reasonable
amount of time. I guess you've considered splitting it, but
couldn't find a way to do so ?
May be expand the commit log a bit more to explain not only _why_
(which is typical requiremnt), but _how_ the patchset is doing it?
Overall sk_msg generalization looks great to me. Especially
considering how it's used in the next patches,
but details are scary. sockmap was plagued with bugs that
syzbot found. Are we changing it a lot again or it's mainly
copy-paste/split into different files plus rename?
I'm willing to take a leap of faith and merge it, but would
love if commit log could make it easier to see what is going on.
Powered by blists - more mailing lists