netdev - Re: [bpf PATCH v4 1/4] bpf: tls, implement unhash to avoid transition out of ESTABLISHED


Message-ID: <5cdb92cc4bed5_12292afa9bb1c5b8d5@john-XPS-13-9360.notmuch>
Date:   Tue, 14 May 2019 21:17:16 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
        bpf@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [bpf PATCH v4 1/4] bpf: tls, implement unhash to avoid transition
 out of ESTABLISHED

Jakub Kicinski wrote:
> On Tue, 14 May 2019 15:34:55 -0700, John Fastabend wrote:
> > John Fastabend wrote:
> > > Jakub Kicinski wrote:  
> > > > On Thu, 09 May 2019 21:57:49 -0700, John Fastabend wrote:  
> > > > > @@ -2042,12 +2060,14 @@ void tls_sw_free_resources_tx(struct sock *sk)
> > > > >  	if (atomic_read(&ctx->encrypt_pending))
> > > > >  		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
> > > > >  
> > > > > -	release_sock(sk);
> > > > > +	if (locked)
> > > > > +		release_sock(sk);
> > > > >  	cancel_delayed_work_sync(&ctx->tx_work.work);  
> > > > 
> > > > So in the splat I got (on a slightly hacked up kernel) it seemed like
> > > > unhash may be called in atomic context:
> > > > 
> > > > [  783.232150]  tls_sk_proto_unhash+0x72/0x110 [tls]
> > > > [  783.237497]  tcp_set_state+0x484/0x640
> > > > [  783.241776]  ? __sk_mem_reduce_allocated+0x72/0x4a0
> > > > [  783.247317]  ? tcp_recv_timestamp+0x5c0/0x5c0
> > > > [  783.252265]  ? tcp_write_queue_purge+0xa6a/0x1180
> > > > [  783.257614]  tcp_done+0xac/0x260
> > > > [  783.261309]  tcp_reset+0xbe/0x350
> > > > [  783.265101]  tcp_validate_incoming+0xd9d/0x1530
> > > > 
> > > > I may have been unclear off-list, I only tested the patch no longer
> > > > crashes the offload :(
> > > >   
> > > 
> > > Yep, I misread and thought it was resolved here as well. OK I'll dig into
> > > it. I'm not seeing it from selftests but I guess that means we are missing
> > > a testcase. :( yet another version I guess.
> > >   
> > 
> > Seems we need to call release_sock in the unhash case as well. Will
> > send a new patch shortly.
> 
> My reading of the stack trace was that unhash gets called from
> tcp_reset(), IOW from soft IRQ, so we can't cancel_delayed_work_sync()
> in tls_sw_free_resources_tx(), no?

Well, the tcp_close() path has the lock held and can also call unhash(). Anyway,
dropping the sock lock in the middle of the block seems a bit suspect to me.
I think we can defer the free until after the sock is released; this is how it
was solved on the sockmap side.
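
To make the "defer the free" idea concrete, here is a rough userspace sketch of the ordering. It is only an analogy and not the actual patch: a pthread mutex stands in for the sock lock, and every demo_* name is hypothetical rather than a kernel API. With the lock held we only unlink the context; the teardown that may block runs after the lock is dropped.

```c
#include <pthread.h>
#include <stdlib.h>

/* Userspace analogy only; all demo_* names are hypothetical, not kernel APIs. */
struct demo_ctx {
	int pending_work;	/* stands in for queued tx work */
};

struct demo_sock {
	pthread_mutex_t lock;	/* stands in for the sock lock */
	struct demo_ctx *ctx;
};

/* Step 1: with the lock held, only unlink the context from the socket. */
static struct demo_ctx *demo_detach_locked(struct demo_sock *sk)
{
	struct demo_ctx *ctx = sk->ctx;

	sk->ctx = NULL;
	return ctx;
}

/* Step 2: the part that may block runs with no lock held. */
static void demo_free_unlocked(struct demo_ctx *ctx)
{
	if (!ctx)
		return;
	ctx->pending_work = 0;	/* stands in for cancelling work and waiting */
	free(ctx);
}

/* Close path: take the lock, detach, drop the lock, then free. */
static void demo_close(struct demo_sock *sk)
{
	struct demo_ctx *ctx;

	pthread_mutex_lock(&sk->lock);
	ctx = demo_detach_locked(sk);
	pthread_mutex_unlock(&sk->lock);

	demo_free_unlocked(ctx);
}
```

This only illustrates the lock ordering. It does not by itself address a caller that cannot sleep at all, such as the softirq path in the trace above.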