Message-ID: <20210525101429.5f80116b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Tue, 25 May 2021 10:14:29 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Maxim Mikityanskiy <maximmi@...dia.com>
Cc: Boris Pismenny <borisp@...dia.com>,
John Fastabend <john.fastabend@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
"David S. Miller" <davem@...emloft.net>,
Aviad Yehezkel <aviadye@...dia.com>,
"Tariq Toukan" <tariqt@...dia.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net 1/2] net/tls: Replace TLS_RX_SYNC_RUNNING with RCU

On Tue, 25 May 2021 11:52:20 +0300 Maxim Mikityanskiy wrote:
> On 2021-05-24 19:05, Jakub Kicinski wrote:
> > On Mon, 24 May 2021 15:12:19 +0300 Maxim Mikityanskiy wrote:
> >> RCU synchronization is guaranteed to finish in finite time, unlike a
> >> busy loop that polls a flag. This patch is a preparation for the bugfix
> >> in the next patch, where the same synchronize_net() call will also be
> >> used to sync with the TX datapath.
> >>
> >> Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
> >> Reviewed-by: Tariq Toukan <tariqt@...dia.com>
> >> ---
> >> include/net/tls.h | 1 -
> >> net/tls/tls_device.c | 10 +++-------
> >> 2 files changed, 3 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/include/net/tls.h b/include/net/tls.h
> >> index 3eccb525e8f7..6531ace2a68b 100644
> >> --- a/include/net/tls.h
> >> +++ b/include/net/tls.h
> >> @@ -193,7 +193,6 @@ struct tls_offload_context_tx {
> >>  	(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
> >>  
> >>  enum tls_context_flags {
> >> -	TLS_RX_SYNC_RUNNING = 0,
> >>  	/* Unlike RX where resync is driven entirely by the core in TX only
> >>  	 * the driver knows when things went out of sync, so we need the flag
> >>  	 * to be atomic.
> >> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> >> index 76a6f8c2eec4..171752cd6910 100644
> >> --- a/net/tls/tls_device.c
> >> +++ b/net/tls/tls_device.c
> >> @@ -680,15 +680,13 @@ static void tls_device_resync_rx(struct tls_context *tls_ctx,
> >>  	struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
> >>  	struct net_device *netdev;
> >>  
> >> -	if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
> >> -		return;
> >> -
> >>  	trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
> >> +	rcu_read_lock();
> >>  	netdev = READ_ONCE(tls_ctx->netdev);
> >>  	if (netdev)
> >>  		netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
> >>  						   TLS_OFFLOAD_CTX_DIR_RX);
> >
> > Now this can't sleep right? No bueno.
>
> No, it can't sleep under RCU. However, are you sure it was allowed to
> sleep before my change? I don't think so. Your commit e52972c11d6b
> ("net/tls: replace the sleeping lock around RX resync with a bit lock")
> mentions that "RX resync may get called from soft IRQ", which
> essentially means that it can't sleep.
>
> Furthermore, no implementations try to sleep in RX resync, as far as I
> see from reviewing the code. For example, nfp_net_tls_resync uses
> GFP_ATOMIC for RX resync and GFP_KERNEL for TX resync.
> mlx5_fpga_tls_resync_rx also uses GFP_ATOMIC.
>
> So, I don't think I'm breaking anything with my change.

You're right.