netdev - Re: [bpf PATCH] bpf, sockmap: RCU splat with TLS redirect and strparser

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <5ef4d9ca149ed_486f2aacfa4de5b44e@john-XPS-13-9370.notmuch>
Date:   Thu, 25 Jun 2020 10:07:22 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     John Fastabend <john.fastabend@...il.com>,
        Martin KaFai Lau <kafai@...com>,
        John Fastabend <john.fastabend@...il.com>
Cc:     jakub@...udflare.com, daniel@...earbox.net, ast@...nel.org,
        netdev@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [bpf PATCH] bpf, sockmap: RCU splat with TLS redirect and
 strparser

John Fastabend wrote:
> Martin KaFai Lau wrote:
> > On Wed, Jun 24, 2020 at 02:09:23PM -0700, John Fastabend wrote:
> > > Redirect on non-TLS sockmap side has RCU lock held from sockmap code
> > > path but when called from TLS this is no longer true. The RCU section
> > > is needed because we use rcu dereference to fetch the psock of the
> > > socket we are redirecting to.
> > sk_psock_verdict_apply() is also called by sk_psock_strp_read() after
> > rcu_read_unlock().  This issue should not be limited to tls?
> 
> The base case is covered because the non-TLS case is wrapped in
> rcu_read_lock/unlock here,
> 
>  static void sk_psock_strp_data_ready(struct sock *sk)
>  {
> 	struct sk_psock *psock;
> 
> 	rcu_read_lock();
> 	psock = sk_psock(sk);
> 	if (likely(psock)) {
> 		if (tls_sw_has_ctx_rx(sk)) {
> 			psock->parser.saved_data_ready(sk);
> 		} else {
> 			write_lock_bh(&sk->sk_callback_lock);
> 			strp_data_ready(&psock->parser.strp);
> 			write_unlock_bh(&sk->sk_callback_lock);
> 		}
> 	}
> 	rcu_read_unlock();
>  }
> 
> There is a case that has existed for awhile where if a skb_clone()
> fails or alloc_skb_for_msg() fails when building a merged skb. We
> could call back into sk_psock_strp_read() from a workqueue in
> strparser that would not be covered by above sk_psock_strp_data_ready().
> This would hit the sk_psock_verdict_apply() you caught above.
> 
> We don't actually see this from selftests because in selftests we
> always return skb->len indicating a msg is a single skb. In our
> use cases this is all we've ever used to date so we've not actually
> hit the case you call out. Another case that might hit this, based
> on code review, is some of the zero copy TX cases.
> 
> To fix the above case I think its best to submit another patch
> because the Fixes tag will be different. Sound OK? I could push
> them as a two patch series if that is easier to understand.

Sorry not enough coffee this morning the fix here is enough for
both cases I'll update the commit message.

> 
> Also I should have a patch shortly to allow users to skip setting
> up a parse_msg hook for the strparser. This helps because the
> vast majority of use cases I've seen just use skb->len as the
> message deliminator. It also bypasses above concern.
> 
> Thanks,
> John