netdev - Re: [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Mon, 14 May 2018 20:16:59 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Joe Stringer <joe@...d.net.nz>
Cc:     Martin KaFai Lau <kafai@...com>, daniel@...earbox.net,
        netdev <netdev@...r.kernel.org>, ast@...nel.org,
        john fastabend <john.fastabend@...il.com>
Subject: Re: [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF

On Fri, May 11, 2018 at 05:54:33PM -0700, Joe Stringer wrote:
> On 11 May 2018 at 14:41, Martin KaFai Lau <kafai@...com> wrote:
> > On Fri, May 11, 2018 at 02:08:01PM -0700, Joe Stringer wrote:
> >> On 10 May 2018 at 22:00, Martin KaFai Lau <kafai@...com> wrote:
> >> > On Wed, May 09, 2018 at 02:07:05PM -0700, Joe Stringer wrote:
> >> >> This patch adds a new BPF helper function, sk_lookup() which allows BPF
> >> >> programs to find out if there is a socket listening on this host, and
> >> >> returns a socket pointer which the BPF program can then access to
> >> >> determine, for instance, whether to forward or drop traffic. sk_lookup()
> >> >> takes a reference on the socket, so when a BPF program makes use of this
> >> >> function, it must subsequently pass the returned pointer into the newly
> >> >> added sk_release() to return the reference.
> >> >>
> >> >> By way of example, the following pseudocode would filter inbound
> >> >> connections at XDP if there is no corresponding service listening for
> >> >> the traffic:
> >> >>
> >> >>   struct bpf_sock_tuple tuple;
> >> >>   struct bpf_sock_ops *sk;
> >> >>
> >> >>   populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
> >> >>   sk = bpf_sk_lookup(ctx, &tuple, sizeof tuple, netns, 0);
> >> >>   if (!sk) {
> >> >>     // Couldn't find a socket listening for this traffic. Drop.
> >> >>     return TC_ACT_SHOT;
> >> >>   }
> >> >>   bpf_sk_release(sk, 0);
> >> >>   return TC_ACT_OK;
> >> >>
> >> >> Signed-off-by: Joe Stringer <joe@...d.net.nz>
> >> >> ---
> >>
> >> ...
> >>
> >> >> @@ -4032,6 +4036,96 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
> >> >>  };
> >> >>  #endif
> >> >>
> >> >> +struct sock *
> >> >> +sk_lookup(struct net *net, struct bpf_sock_tuple *tuple) {
> >> > Would it be possible to have another version that
> >> > returns a sk without taking its refcnt?
> >> > It may have performance benefit.
> >>
> >> Not really. The sockets are not RCU-protected, and established sockets
> >> may be torn down without notice. If we don't take a reference, there's
> >> no guarantee that the socket will continue to exist for the duration
> >> of running the BPF program.
> >>
> >> From what I follow, the comment below has a hidden implication which
> >> is that sockets without SOCK_RCU_FREE, eg established sockets, may be
> >> directly freed regardless of RCU.
> > Right, SOCK_RCU_FREE sk is the one I am concern about.
> > For example, TCP_LISTEN socket does not require taking a refcnt
> > now.  Doing a bpf_sk_lookup() may have a rather big
> > impact on handling TCP syn flood.  or the usual intention
> > is to redirect instead of passing it up to the stack?
> 
> I see, if you're only interested in listen sockets then probably this
> series could be extended with a new flag, eg something like
> BPF_F_SK_FIND_LISTENERS which restricts the set of possible sockets
> found to only listen sockets, then the implementation would call into
> __inet_lookup_listener() instead of inet_lookup(). The presence of
> that flag in the relevant register during CALL instruction would show
> that the verifier should not reference-track the result, then there'd
> need to be a check on the release to ensure that this unreferenced
> socket is never released. Just a thought, completely untested and I
> could still be missing some detail..
> 
> That said, I don't completely follow how you would expect to handle
> the traffic for sockets that are already established - the helper
> would no longer find those sockets, so you wouldn't know whether to
> pass the traffic up the stack for established traffic or not.

I think Martin has a valid concern here that if somebody starts using
this helper on the rx traffic the bpf program (via these two new
helpers) will be doing refcnt++ and refcnt-- even for listener
sockets which will cause synflood to suffer.
One can argue that this is bpf author mistake, but without fixes
(and api changes) to the helper the programmer doesn't really have a way
of avoiding this situation.
Also udp sockets don't need refcnt at all.
How about we split this single helper into three:
- bpf_sk_lookup_tcp_established() that will return refcnt-ed socket
and has to be bpf_sk_release()d by the program.
- bpf_sk_lookup_tcp_listener() that doesn't refcnt, since progs
run in rcu.
- bpf_sk_lookup_udp() that also doesn't refcnt.
The logic you want to put into this helper can be easily
replicated with these three helpers and the whole thing will
be much faster.
Thoughts?