Message-ID: <20190314053815.5rvdtxbrkywwpv4a@kafai-mbp.dhcp.thefacebook.com>
Date: Thu, 14 Mar 2019 05:38:23 +0000
From: Martin Lau <kafai@...com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <Kernel-team@...com>,
Lorenz Bauer <lmb@...udflare.com>
Subject: Re: [PATCH v4 bpf 0/5] Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release
On Wed, Mar 13, 2019 at 12:20:27PM -0700, Alexei Starovoitov wrote:
> On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote:
> > This set addresses an issue where the invalid ptr
> > returned from bpf_tcp_sock() and bpf_sk_fullsock()
> > could still be accessed after bpf_sk_release().
> >
> > v4:
> > - Tried the one "id" approach. It does not work well and the reason is in
> > the Patch 1 commit message.
> > - Rename refcount_id to ref_obj_id.
> > - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
> > because ref_obj_id is passed to release_reference() instead of reg->id.
> > - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
> > - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
> > - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
> > - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
> > bpf_sk_release(tp) is also allowed.
>
> Thanks for the hard work on this fix.
> I applied the set to bpf tree.
Thanks for reviewing.
>
> I have a small question regarding patch 2:
> +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk)
> +{
> + sk = sk_to_full_sk(sk);
> +
> + if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE))
> + return (unsigned long)sk;
> +
> + return (unsigned long)NULL;
> +}
>
> My understanding is that sk->sk_state == TCP_LISTEN is enough for correctness,
> and the additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent
> silent breakage in case listener socks are ever re-implemented without rcu_free?
> In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too?
> I mean, in the very unlikely scenario that the kernel has such a drastic
> implementation change, the bpf progs will not work correctly because NULL
> is returned, but it will be harder for humans to debug?
> I think it's also ok without the warn, since such a huge re-implementation will
> require all sorts of effort, so it's highly unlikely we will miss this spot.
> warn_on_once feels like too much paranoia. Maybe just a comment?
> Or am I overthinking?
No. It is a valid question. Sorry that I missed these details in the patch.
Here is what I had in mind for the patch:
1. inet_reqsk(sk)->rsk_listener is only set for TCP (and dccp).
Hence, sk_to_full_sk(sk) is only useful for TCP (and dccp).
2. The TCP_LISTEN sock has SOCK_RCU_FREE set when it is added
to the listening_hash in __inet_hash(). It is done for
TCP (and dccp).
3. However, not all TCP_LISTEN sockets have SOCK_RCU_FREE set,
e.g. af_unix.c sockets, which may be accessible through skb->sk.
If WARN_ON_ONCE() was used, it would be a false alarm for them.
For those non-TCP sk, sk_to_full_sk(sk) is a no-op, so if the
passed-in sk is not already in TCP_LISTEN,
bpf_get_listener_sock() has to return NULL anyway.
It is not ideal for those sk, though: even when the passed-in sk
is already in TCP_LISTEN, this helper still returns NULL
because of the sock_flag(sk, SOCK_RCU_FREE) check.
On the other hand, the bpf_prog can check sk->sk_state
first before calling this helper.
I have re-thought (3). How about this:
1. Limit this helper to "sk->sk_protocol == IPPROTO_TCP".
Return NULL for others because this helper cannot
do anything extra for those sk anyway.
2. Rename this helper to bpf_get_tcp_listener_sock(). It still returns
"struct bpf_sock *" because I think the send_cwnd/srtt_us/rtt_min/...
of the listener's "struct bpf_tcp_sock *" is not very useful.
If needed, bpf_tcp_sock() can be used to get "struct bpf_tcp_sock *".
3. Add WARN_ON_ONCE() on the SOCK_RCU_FREE flag.