Message-ID: <56d6f898-bde0-bb25-3427-12a330b29fb8@iogearbox.net>
Date: Thu, 9 Jun 2022 22:29:15 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Jon Maxwell <jmaxwell37@...il.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, atenart@...nel.org, cutaylor-pub@...oo.com,
alexei.starovoitov@...il.com, kafai@...com, joe@...ium.io,
i@....io, bpf@...r.kernel.org
Subject: Re: [PATCH net] net: bpf: fix request_sock leak in filter.c
On 6/9/22 3:18 AM, Jon Maxwell wrote:
> A customer reported a request_sock leak in a Calico cloud environment. We
> found that a BPF program was doing a socket lookup, which takes a refcnt on
> the socket, and that it was finding the request_sock but returning the parent
> LISTEN socket via sk_to_full_sk() without decrementing the child request_sock
> first, resulting in a request_sock slab object leak. This patch retains the
> existing behaviour of returning full socks to the caller, but also decrements
> the child request_sock, if one is present, before doing so to prevent the leak.
>
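
(Side note for folks following along: the BPF program side that ends up
exercising __bpf_sk_lookup() is roughly the below; tuple setup from the
packet headers is omitted, so treat it as an illustrative sketch rather
than a working program.)

#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int sk_lookup_example(struct __sk_buff *skb)
{
	struct bpf_sock_tuple tuple = {};	/* fill from packet headers */
	struct bpf_sock *sk;

	/* Refcounted lookup; when the tuple matches a pending connection
	 * request, the kernel internally finds the request_sock and maps
	 * it to the full/listener sock via sk_to_full_sk().
	 */
	sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
			       BPF_F_CURRENT_NETNS, 0);
	if (sk) {
		/* Required release paired with the lookup; the reference
		 * the kernel took on the request_sock internally is the
		 * one that leaked before this patch.
		 */
		bpf_sk_release(sk);
	}
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
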
> Thanks to Curtis Taylor for all the help in diagnosing and testing this. And
> thanks to Antoine Tenart for the reproducer and patch input.
>
> Fixes: f7355a6c0497 ("bpf: Check sk_fullsock() before returning from bpf_sk_lookup()")
> Fixes: edbf8c01de5a ("bpf: add skc_lookup_tcp helper")
> Tested-by: Curtis Taylor <cutaylor-pub@...oo.com>
> Co-developed-by: Antoine Tenart <atenart@...nel.org>
> Signed-off-by: Antoine Tenart <atenart@...nel.org>
> Signed-off-by: Jon Maxwell <jmaxwell37@...il.com>
> ---
> net/core/filter.c | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2e32cee2c469..e3c04ae7381f 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -6202,13 +6202,17 @@ __bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
>  {
>  	struct sock *sk = __bpf_skc_lookup(skb, tuple, len, caller_net,
>  					   ifindex, proto, netns_id, flags);
> +	struct sock *sk1 = sk;
> 
>  	if (sk) {
>  		sk = sk_to_full_sk(sk);
> -		if (!sk_fullsock(sk)) {
> -			sock_gen_put(sk);
> +		/* sk_to_full_sk() may return (sk)->rsk_listener, so make sure the original sk1
> +		 * sock refcnt is decremented to prevent a request_sock leak.
> +		 */
> +		if (!sk_fullsock(sk1))
> +			sock_gen_put(sk1);
> +		if (!sk_fullsock(sk))
>  			return NULL;

[ +Martin/Joe/Lorenz ]

I wonder, should we also add some asserts in here to ensure we don't get an
imbalance for the bpf_sk_release() case later on? Rough pseudocode could be
something like below:

static struct sock *
__bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
		struct net *caller_net, u32 ifindex, u8 proto, u64 netns_id,
		u64 flags)
{
	struct sock *sk = __bpf_skc_lookup(skb, tuple, len, caller_net,
					   ifindex, proto, netns_id, flags);

	if (sk) {
		struct sock *sk2 = sk_to_full_sk(sk);

		if (!sk_fullsock(sk2))
			sk2 = NULL;
		if (sk2 != sk) {
			/* sk is not the sock we return (request_sock, tw
			 * sock, etc.), so drop the lookup's reference on it.
			 */
			sock_gen_put(sk);
			/* A full sock returned without a reference held on
			 * it must be RCU-protected, otherwise the later
			 * bpf_sk_release() would be unbalanced.
			 */
			if (unlikely(sk2 && !sock_flag(sk2, SOCK_RCU_FREE))) {
				WARN_ONCE(1, "Found non-RCU, unreferenced socket!");
				sk2 = NULL;
			}
		}
		sk = sk2;
	}
	return sk;
}

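For reference (paraphrasing from memory, so please double-check the exact
code in net/core/filter.c), the release side boils down to something like:

BPF_CALL_1(bpf_sk_release, struct sock *, sk)
{
	/* Only socks we actually hold a reference on are put here;
	 * SOCK_RCU_FREE full socks are intentionally returned without
	 * a reference being held.
	 */
	if (sk && sk_is_refcounted(sk))
		sock_gen_put(sk);
	return 0;
}

So if __bpf_sk_lookup() ever handed back a non-SOCK_RCU_FREE full sock
without holding a reference on it, bpf_sk_release() would later drop a
reference we never took, which is what the WARN_ONCE() above is meant to
catch.
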
Thanks,
Daniel