Message-ID: <aC0RlqfuilOj51kT@kernel.org>
Date: Tue, 20 May 2025 19:34:46 -0400
From: Mike Snitzer <snitzer@...nel.org>
To: cel@...nel.org
Cc: Thomas Haynes <loghyr@...merspace.com>, linux-nfs@...r.kernel.org,
netdev@...r.kernel.org, kernel-tls-handshake@...ts.linux.dev,
Chuck Lever <chuck.lever@...cle.com>,
Steve Sears <sjs@...merspace.com>, Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH v1] SUNRPC: Prevent hang on NFS mount with xprtsec=[m]tls
On Tue, May 20, 2025 at 03:59:16PM -0400, cel@...nel.org wrote:
> From: Chuck Lever <chuck.lever@...cle.com>
>
> Engineers at Hammerspace noticed that sometimes mounting with
> "xprtsec=tls" hangs for a minute or so, and then times out, even
> when the NFS server is reachable and responsive.
>
> kTLS shuts off data_ready callbacks if strp->msg_ready is set to
> mitigate data_ready callbacks when a full TLS record is not yet
> ready to be read from the socket.
>
> Normally msg_ready is clear when the first TLS record arrives on
> a socket. However, I observed that sometimes tls_setsockopt() sets
> strp->msg_ready, and that prevents forward progress because
> tls_data_ready() becomes a no-op.
>
> Moreover, Jakub says: "If there's a full record queued at the time
> when [tlshd] passes the socket back to the kernel, it's up to the
> reader to read the already queued data out." So SunRPC cannot
> expect a data_ready call when ingress data is already waiting.
>
> Add an explicit poll after SunRPC's upper transport is set up to
> pick up any data that arrived after the TLS handshake but before
> transport set-up is complete.
>
> Reported-by: Steve Sears <sjs@...merspace.com>
> Suggested-by: Jakub Kicinski <kuba@...nel.org>
> Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class")
> Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
> ---
> net/sunrpc/xprtsock.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> Mike, can you try this out?

Works well, thanks to you and Jakub for seeing this through!

Tested-by: Mike Snitzer <snitzer@...nel.org>
Reviewed-by: Mike Snitzer <snitzer@...nel.org>

>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 83cc095846d3..4b10ecf4c265 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -2740,6 +2740,11 @@ static void xs_tcp_tls_setup_socket(struct work_struct *work)
>  	}
>  	rpc_shutdown_client(lower_clnt);
>  
> +	/* Check for ingress data that arrived before the socket's
> +	 * ->data_ready callback was set up.
> +	 */
> +	xs_poll_check_readable(upper_transport);
> +
>  out_unlock:
>  	current_restore_flags(pflags, PF_MEMALLOC);
>  	upper_transport->clnt = NULL;
> --
> 2.49.0
>
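
For readers following the thread, the race described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name in it (toy_sock, toy_record_arrives, toy_poll_check_readable, run_scenario) is hypothetical and none of it is kernel code. It models only the general "notification fired before the callback was installed" problem, not the strp->msg_ready handling inside kTLS.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct toy_sock {
	int queued;                            /* records waiting to be read */
	void (*data_ready)(struct toy_sock *); /* edge notification, may be NULL */
	int consumed;                          /* records the upper layer has read */
};

/* Upper-layer reader: drain everything that is currently queued. */
static void toy_drain(struct toy_sock *sk)
{
	sk->consumed += sk->queued;
	sk->queued = 0;
}

/* A record arrives: queue it and notify whoever is listening right now. */
static void toy_record_arrives(struct toy_sock *sk)
{
	sk->queued++;
	if (sk->data_ready)
		sk->data_ready(sk);
}

/* Stand-in for the explicit check: read only if data is already queued. */
static void toy_poll_check_readable(struct toy_sock *sk)
{
	if (sk->queued > 0)
		toy_drain(sk);
}

/* Returns how many records the upper layer consumed in this scenario. */
int run_scenario(bool poll_after_setup)
{
	struct toy_sock sk = { 0, NULL, 0 };

	/* Record arrives after the handshake, before any callback exists. */
	toy_record_arrives(&sk);

	/* Upper transport set-up completes and installs its callback. */
	sk.data_ready = toy_drain;

	/* Without this check nothing ever tells the reader about the record. */
	if (poll_after_setup)
		toy_poll_check_readable(&sk);

	return sk.consumed;
}
```

In this model, run_scenario(false) leaves the queued record unread because no further notification arrives, while run_scenario(true) picks it up immediately. That mirrors the reasoning in the patch: the reader has to look for already-queued data itself once set-up is complete.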