Message-Id: <938c82cd-9760-42e5-b0ce-123c86710782@app.fastmail.com>
Date: Sat, 06 Dec 2025 10:12:02 -0500
From: "Chuck Lever" <cel@...nel.org>
To: "Scott Mayhew" <smayhew@...hat.com>,
"Chuck Lever" <chuck.lever@...cle.com>
Cc: kernel-tls-handshake@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [PATCH] net/handshake: a handshake can only be cancelled once
On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> When a handshake request is cancelled it is removed from the
> handshake_net->hn_requests list, but it is still present in the
> handshake_rhashtbl until it is destroyed.
>
> If a second cancellation request arrives for the same handshake request,
> then remove_pending() will return false... and assuming
> HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> processing through the out_true label, where we put another reference on
> the sock and a refcount underflow occurs.
>
> This can happen for example if a handshake times out - particularly if
> the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> follow it up with the ClientHello due to a problem with tlshd. When the
> timeout is hit on the server, the server will send a FIN, which triggers
> a cancellation request via xs_reset_transport(). When the timeout is
> hit on the client, another cancellation request happens via
> xs_tls_handshake_sync().
>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for
> handling handshake requests")
> Signed-off-by: Scott Mayhew <smayhew@...hat.com>
> ---
> net/handshake/request.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> index 274d2c89b6b2..c7b20d167a55 100644
> --- a/net/handshake/request.c
> +++ b/net/handshake/request.c
> @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
>  		return false;
>  	}
>  
> +	/* Duplicate cancellation request */
> +	trace_handshake_cancel_none(net, req, sk);
> +	return false;
> +
>  out_true:
>  	trace_handshake_cancel(net, req, sk);
>
> --
> 2.51.0

To help support engineers find this patch, I recommend using
"net/handshake: duplicate handshake cancellations leak socket" as
the short description.

The proposed solution might introduce a socket reference leak:

1. Request submitted: sock_hold() called (line 271)
2. Request accepted by daemon via handshake_req_next()
   (removes from pending list)
3. Cancel called:
   - remove_pending() returns FALSE (not in pending list)
   - test_and_set_bit() returns FALSE (sets the bit now)
   - With patch: returns FALSE, sock_put() NOT called
4. handshake_complete() called: bit already set, skips sock_put()
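
The four steps above can be traced with a small user-space model. This is only a sketch, not kernel code: every model_* name is made up for illustration, a plain int stands in for the socket refcount, and a bool stands in for HANDSHAKE_F_REQ_COMPLETED. The cancel function assumes the posted patch behaves as described in step 3.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; all names here are hypothetical. */
struct model_req {
	int  sock_refs;  /* references the request holds on the socket */
	bool pending;    /* request is still on the pending list */
	bool completed;  /* stands in for HANDSHAKE_F_REQ_COMPLETED */
};

/* Mimics test_and_set_bit(): set the flag, return its old value. */
bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

/* Step 1: submitting a request takes one socket reference. */
void model_submit(struct model_req *req)
{
	req->sock_refs = 1;	/* models sock_hold() */
	req->pending = true;
	req->completed = false;
}

/* Step 2: the daemon accepts the request, removing it from pending. */
void model_accept(struct model_req *req)
{
	req->pending = false;
}

/* Step 3: cancel path as it would behave with the posted patch. */
bool model_cancel_patched(struct model_req *req)
{
	if (req->pending) {
		req->pending = false;
		req->sock_refs--;	/* models sock_put() */
		return true;
	}
	if (model_test_and_set(&req->completed))
		return false;		/* already completed */
	/* Patched path: treated as a duplicate, no sock_put(). */
	return false;
}

/* Step 4: completion drops the reference only if the flag was clear. */
void model_complete(struct model_req *req)
{
	if (!model_test_and_set(&req->completed))
		req->sock_refs--;	/* models sock_put() */
}

/* Runs steps 1-4 and returns the socket references still held. */
int model_leak_scenario(void)
{
	struct model_req req;

	model_submit(&req);
	model_accept(&req);
	model_cancel_patched(&req);
	model_complete(&req);
	assert(req.sock_refs >= 0);
	return req.sock_refs;
}
```

In this model, model_leak_scenario() returns 1: the reference taken at submission is never dropped, which is the leak described in the list.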

What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
pending cancel path so duplicate cancels can be detected?

Instead of:

	if (hn && remove_pending(hn, req)) {
		/* Request hadn't been accepted */
		goto out_true;
	}

go with this bit of untested code:

	if (hn && remove_pending(hn, req)) {
		/* Request hadn't been accepted - mark cancelled */
		if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
			trace_handshake_cancel_busy(net, req, sk);
			return false;
		}
		goto out_true;
	}
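
To sanity-check the reference counting of that suggestion, here is a rough user-space model. It is a sketch under assumptions, not kernel code: the model_* names are invented for illustration, an int replaces the socket refcount, a bool replaces HANDSHAKE_F_REQ_COMPLETED, and the rest of the cancel path is assumed to keep its existing "already completed" check.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; all names here are hypothetical. */
struct model_req {
	int  sock_refs;  /* references the request holds on the socket */
	bool pending;    /* request is still on the pending list */
	bool completed;  /* stands in for HANDSHAKE_F_REQ_COMPLETED */
};

/* Mimics test_and_set_bit(): set the flag, return its old value. */
bool model_test_and_set(bool *flag)
{
	bool old = *flag;

	*flag = true;
	return old;
}

void model_submit(struct model_req *req)
{
	req->sock_refs = 1;	/* models sock_hold() */
	req->pending = true;
	req->completed = false;
}

void model_accept(struct model_req *req)
{
	req->pending = false;
}

/* Cancel path with the flag also set when a pending request is cancelled. */
bool model_cancel(struct model_req *req)
{
	if (req->pending) {
		req->pending = false;	/* remove_pending() succeeded */
		if (model_test_and_set(&req->completed))
			return false;
		goto out_true;
	}
	if (model_test_and_set(&req->completed))
		return false;		/* completed or already cancelled */

out_true:
	req->sock_refs--;		/* models sock_put() */
	return true;
}

void model_complete(struct model_req *req)
{
	if (!model_test_and_set(&req->completed))
		req->sock_refs--;	/* models sock_put() */
}

/* Cancel twice while pending; reports the second cancel's result. */
int model_double_cancel(bool *second_result)
{
	struct model_req req;

	model_submit(&req);
	model_cancel(&req);
	*second_result = model_cancel(&req);
	assert(req.sock_refs >= 0);
	return req.sock_refs;
}

/* Accept, cancel, then complete; returns the references still held. */
int model_cancel_after_accept(void)
{
	struct model_req req;

	model_submit(&req);
	model_accept(&req);
	model_cancel(&req);
	model_complete(&req);
	return req.sock_refs;
}
```

In the model, the second cancel in model_double_cancel() returns false and the count ends at 0 rather than underflowing, and model_cancel_after_accept() also ends at 0, so neither the underflow nor the leak shows up. The real code paths would still need testing.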
--
Chuck Lever