netdev - Re: [bpf PATCH v2 1/8] bpf: sockmap/tls, during free we may call tcp_bpf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87pnfnscvl.fsf@cloudflare.com>
Date:   Mon, 13 Jan 2020 14:29:50 +0100
From:   Jakub Sitnicki <jakub@...udflare.com>
To:     John Fastabend <john.fastabend@...il.com>
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org, daniel@...earbox.net,
        ast@...nel.org, song@...nel.org, jonathan.lemon@...il.com
Subject: Re: [bpf PATCH v2 1/8] bpf: sockmap/tls, during free we may call tcp_bpf_unhash() in loop

On Sat, Jan 11, 2020 at 07:11 AM CET, John Fastabend wrote:
> When a sockmap is free'd and a socket in the map is enabled with tls
> we tear down the bpf context on the socket, the psock struct and state,
> and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform
> the tls stack it needs to update its saved sock ops so that when the tls
> socket is later destroyed it doesn't try to call the now destroyed psock
> hooks.
>
> This is about keeping stacked ULPs in good shape so they always have
> the right set of stacked ops.
>
> However, recently unhash() hook was removed from TLS side. But, the
> sockmap/bpf side is not doing any extra work to update the unhash op
> when is torn down instead expecting TLS side to manage it. So both
> TLS and sockmap believe the other side is managing the op and instead
> no one updates the hook so it continues to point at tcp_bpf_unhash().
> When unhash hook is called we call tcp_bpf_unhash() which detects the
> psock has already been destroyed and calls sk->sk_prot_unhash() which
> calls tcp_bpf_unhash() yet again and so on looping and hanging the core.
>
> To fix have sockmap tear down logic fixup the stale pointer.
>
> Fixes: 5d92e631b8be ("net/tls: partially revert fix transition through disconnect with close")
> Reported-by: syzbot+83979935eb6304f8cd46@...kaller.appspotmail.com
> Cc: stable@...r.kernel.org
> Acked-by: Song Liu <songliubraving@...com>
> Signed-off-by: John Fastabend <john.fastabend@...il.com>
> ---
>  include/linux/skmsg.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index ef7031f8a304..b6afe01f8592 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -358,6 +358,7 @@ static inline void sk_psock_update_proto(struct sock *sk,
>  static inline void sk_psock_restore_proto(struct sock *sk,
>  					  struct sk_psock *psock)
>  {
> +	sk->sk_prot->unhash = psock->saved_unhash;

We could also restore it from psock->sk_proto->unhash if we were not
NULL'ing on first call, right?

I've been wondering what is the purpose of having psock->saved_unhash
and psock->saved_close if we have the whole sk->sk_prot saved in
psock->sk_proto.

>  	sk->sk_write_space = psock->saved_write_space;
>
>  	if (psock->sk_proto) {

Reviewed-by: Jakub Sitnicki <jakub@...udflare.com>