[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180511214111.hi6ax6qoe37al37q@kafai-mbp.dhcp.thefacebook.com>
Date: Fri, 11 May 2018 14:41:13 -0700
From: Martin KaFai Lau <kafai@...com>
To: Joe Stringer <joe@...d.net.nz>
CC: <daniel@...earbox.net>, netdev <netdev@...r.kernel.org>,
<ast@...nel.org>, john fastabend <john.fastabend@...il.com>
Subject: Re: [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF
On Fri, May 11, 2018 at 02:08:01PM -0700, Joe Stringer wrote:
> On 10 May 2018 at 22:00, Martin KaFai Lau <kafai@...com> wrote:
> > On Wed, May 09, 2018 at 02:07:05PM -0700, Joe Stringer wrote:
> >> This patch adds a new BPF helper function, sk_lookup() which allows BPF
> >> programs to find out if there is a socket listening on this host, and
> >> returns a socket pointer which the BPF program can then access to
> >> determine, for instance, whether to forward or drop traffic. sk_lookup()
> >> takes a reference on the socket, so when a BPF program makes use of this
> >> function, it must subsequently pass the returned pointer into the newly
> >> added sk_release() to return the reference.
> >>
> >> By way of example, the following pseudocode would filter inbound
> >> connections at XDP if there is no corresponding service listening for
> >> the traffic:
> >>
> >> struct bpf_sock_tuple tuple;
> >> struct bpf_sock_ops *sk;
> >>
> >> populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
> >> sk = bpf_sk_lookup(ctx, &tuple, sizeof tuple, netns, 0);
> >> if (!sk) {
> >> // Couldn't find a socket listening for this traffic. Drop.
> >> return TC_ACT_SHOT;
> >> }
> >> bpf_sk_release(sk, 0);
> >> return TC_ACT_OK;
> >>
> >> Signed-off-by: Joe Stringer <joe@...d.net.nz>
> >> ---
>
> ...
>
> >> @@ -4032,6 +4036,96 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
> >> };
> >> #endif
> >>
> >> +struct sock *
> >> +sk_lookup(struct net *net, struct bpf_sock_tuple *tuple) {
> > Would it be possible to have another version that
> > returns a sk without taking its refcnt?
> > It may have performance benefit.
>
> Not really. The sockets are not RCU-protected, and established sockets
> may be torn down without notice. If we don't take a reference, there's
> no guarantee that the socket will continue to exist for the duration
> of running the BPF program.
>
> From what I follow, the comment below has a hidden implication which
> is that sockets without SOCK_RCU_FREE, eg established sockets, may be
> directly freed regardless of RCU.
Right, SOCK_RCU_FREE sk is the one I am concern about.
For example, TCP_LISTEN socket does not require taking a refcnt
now. Doing a bpf_sk_lookup() may have a rather big
impact on handling TCP syn flood. or the usual intention
is to redirect instead of passing it up to the stack?
>
> /* Sockets having SOCK_RCU_FREE will call this function after one RCU
> * grace period. This is the case for UDP sockets and TCP listeners.
> */
> static void __sk_destruct(struct rcu_head *head)
> ...
>
> Therefore without the refcount, it won't be safe.
Powered by blists - more mailing lists