netdev - Re: [bpf PATCH v2 1/2] bpf: sockmap, fix crash when ipv6 sock is added

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Tue, 12 Jun 2018 01:14:25 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     John Fastabend <john.fastabend@...il.com>, edumazet@...gle.com,
        weiwan@...gle.com, ast@...nel.org
Cc:     netdev@...r.kernel.org
Subject: Re: [bpf PATCH v2 1/2] bpf: sockmap, fix crash when ipv6 sock is
 added

Hi John,

On 06/08/2018 05:06 PM, John Fastabend wrote:
> This fixes a crash where we assign tcp_prot to IPv6 sockets instead
> of tcpv6_prot.
> 
> Previously we overwrote the sk->prot field with tcp_prot even in the
> AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
> are used. Further, only allow ESTABLISHED connections to join the
> map per note in TLS ULP,
> 
>    /* The TLS ulp is currently supported only for TCP sockets
>     * in ESTABLISHED state.
>     * Supporting sockets in LISTEN state will require us
>     * to modify the accept implementation to clone rather then
>     * share the ulp context.
>     */
> 
> Also tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
> 'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
> crashing case here.
> 
> Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
> Reported-by: syzbot+5c063698bdbfac19f363@...kaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@...il.com>
> Signed-off-by: Wei Wang <weiwan@...gle.com>
[...]

Still one question for some more clarification below that popped up while
review:

> @@ -162,6 +164,8 @@ static bool bpf_tcp_stream_read(const struct sock *sk)
>  }
>  
>  static struct proto tcp_bpf_proto;
> +static struct proto tcpv6_bpf_proto;

These two are global, w/o locking.

>  static int bpf_tcp_init(struct sock *sk)
>  {
>  	struct smap_psock *psock;
> @@ -181,14 +185,30 @@ static int bpf_tcp_init(struct sock *sk)
>  	psock->save_close = sk->sk_prot->close;
>  	psock->sk_proto = sk->sk_prot;
>  
> +	if (sk->sk_family == AF_INET6) {
> +		tcpv6_bpf_proto = *sk->sk_prot;
> +		tcpv6_bpf_proto.close = bpf_tcp_close;
> +	} else {
> +		tcp_bpf_proto = *sk->sk_prot;
> +		tcp_bpf_proto.close = bpf_tcp_close;
> +	}

And each time we add a BPF ULP to a v4/v6 socket, we override tcp{,v6}_bpf_proto
from scratch.

>  	if (psock->bpf_tx_msg) {
> +		tcpv6_bpf_proto.sendmsg = bpf_tcp_sendmsg;
> +		tcpv6_bpf_proto.sendpage = bpf_tcp_sendpage;
> +		tcpv6_bpf_proto.recvmsg = bpf_tcp_recvmsg;
> +		tcpv6_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>  		tcp_bpf_proto.sendmsg = bpf_tcp_sendmsg;
>  		tcp_bpf_proto.sendpage = bpf_tcp_sendpage;
>  		tcp_bpf_proto.recvmsg = bpf_tcp_recvmsg;
>  		tcp_bpf_proto.stream_memory_read = bpf_tcp_stream_read;
>  	}
>  
> -	sk->sk_prot = &tcp_bpf_proto;
> +	if (sk->sk_family == AF_INET6)
> +		sk->sk_prot = &tcpv6_bpf_proto;
> +	else
> +		sk->sk_prot = &tcp_bpf_proto;

Where every active socket would be affected from it as well. Isn't that
generally racy? E.g. existing ones where tcpv6_bpf_proto.sendmsg points
to bpf_tcp_sendmsg would get overridden with earlier assignment on the
tcpv6_bpf_proto = *sk->sk_prot during their lifetime after bpf_tcp_init().

In the kTLS case, the v4 protos are built up in module init via tls_register()
and never change from there. The v6 ones are only reloaded when their addr
changes e.g. module reload would come to mind, which should only be possible
once no active v6 socket is present. What speaks against adapting similar
scheme resp. what am I missing that the above would work? (Would be nice if
there was some discussion in commit log related to it on 'why' this approach
was done differently.)

Thanks,
Daniel

>  	rcu_read_unlock();
>  	return 0;
>  }
> @@ -1111,8 +1131,6 @@ static void bpf_tcp_msg_add(struct smap_psock *psock,
>  
>  static int bpf_tcp_ulp_register(void)
>  {
> -	tcp_bpf_proto = tcp_prot;
> -	tcp_bpf_proto.close = bpf_tcp_close;
>  	/* Once BPF TX ULP is registered it is never unregistered. It
>  	 * will be in the ULP list for the lifetime of the system. Doing
>  	 * duplicate registers is not a problem.
>