Message-ID: <20190514155857.1f10ef78@cakuba.netronome.com>
Date: Tue, 14 May 2019 15:58:57 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: John Fastabend <john.fastabend@...il.com>
Cc: ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
bpf@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [bpf PATCH v4 1/4] bpf: tls, implement unhash to avoid
 transition out of ESTABLISHED

On Tue, 14 May 2019 15:34:55 -0700, John Fastabend wrote:
> John Fastabend wrote:
> > Jakub Kicinski wrote:
> > > On Thu, 09 May 2019 21:57:49 -0700, John Fastabend wrote:
> > > > @@ -2042,12 +2060,14 @@ void tls_sw_free_resources_tx(struct sock *sk)
> > > >  	if (atomic_read(&ctx->encrypt_pending))
> > > >  		crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
> > > >
> > > > -	release_sock(sk);
> > > > +	if (locked)
> > > > +		release_sock(sk);
> > > >  	cancel_delayed_work_sync(&ctx->tx_work.work);
> > >
> > > So in the splat I got (on a slightly hacked up kernel) it seemed like
> > > unhash may be called in atomic context:
> > >
> > > [ 783.232150] tls_sk_proto_unhash+0x72/0x110 [tls]
> > > [ 783.237497] tcp_set_state+0x484/0x640
> > > [ 783.241776] ? __sk_mem_reduce_allocated+0x72/0x4a0
> > > [ 783.247317] ? tcp_recv_timestamp+0x5c0/0x5c0
> > > [ 783.252265] ? tcp_write_queue_purge+0xa6a/0x1180
> > > [ 783.257614] tcp_done+0xac/0x260
> > > [ 783.261309] tcp_reset+0xbe/0x350
> > > [ 783.265101] tcp_validate_incoming+0xd9d/0x1530
> > >
> > > I may have been unclear off-list, I only tested the patch no longer
> > > crashes the offload :(
> > >
> >
> > Yep, I misread and thought it was resolved here as well. OK I'll dig into
> > it. I'm not seeing it from selftests but I guess that means we are missing
> > a testcase. :( yet another version I guess.
> >
>
> Seems we need to call release_sock in the unhash case as well. Will
> send a new patch shortly.

My reading of the stack trace was that unhash gets called from
tcp_reset(), in other words from soft IRQ context. cancel_delayed_work_sync()
may sleep, so we can't call it from tls_sw_free_resources_tx() on that
path, no?
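
To make the concern concrete, here is a minimal userspace sketch of the
constraint. It is only a model: every name in it (exec_ctx, tx_model,
the model_* helpers) is hypothetical and none of it is kernel API. It
assumes a "sync cancel" that must not run in soft IRQ context, and shows
one possible direction, deferring the cancel to process context. That is
not necessarily what the next version of the patch should do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum exec_ctx { CTX_PROCESS, CTX_SOFTIRQ };

struct tx_model {
	bool work_pending;     /* stands in for ctx->tx_work.work being queued */
	bool cleanup_deferred; /* set when teardown was pushed to process context */
};

/*
 * Models a call that may sleep, as cancel_delayed_work_sync() can.
 * Returns -1 when invoked from a context that must not sleep.
 */
static int model_cancel_sync(struct tx_model *tx, enum exec_ctx c)
{
	assert(tx != NULL);
	if (c == CTX_SOFTIRQ)
		return -1;
	tx->work_pending = false;
	return 0;
}

/* Direct teardown: fails if reached from soft IRQ (the tcp_reset() path). */
static int model_free_tx_direct(struct tx_model *tx, enum exec_ctx c)
{
	return model_cancel_sync(tx, c);
}

/* Assumed alternative: in soft IRQ only record that cleanup is needed. */
static int model_free_tx_deferred(struct tx_model *tx, enum exec_ctx c)
{
	if (c == CTX_SOFTIRQ) {
		tx->cleanup_deferred = true;
		return 0;
	}
	return model_cancel_sync(tx, c);
}

/* Runs later from process context and performs the sleeping cancel. */
static int model_run_deferred(struct tx_model *tx)
{
	if (!tx->cleanup_deferred)
		return 0;
	tx->cleanup_deferred = false;
	return model_cancel_sync(tx, CTX_PROCESS);
}
```

In this model the direct path returns an error when entered with
CTX_SOFTIRQ, while the deferred path succeeds and leaves the work
pending until model_run_deferred() is called from process context.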