netdev - Re: [PATCH] net/handshake: a handshake can only be cancelled once

Message-Id: <938c82cd-9760-42e5-b0ce-123c86710782@app.fastmail.com>
Date: Sat, 06 Dec 2025 10:12:02 -0500
From: "Chuck Lever" <cel@...nel.org>
To: "Scott Mayhew" <smayhew@...hat.com>,
 "Chuck Lever" <chuck.lever@...cle.com>
Cc: kernel-tls-handshake@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [PATCH] net/handshake: a handshake can only be cancelled once



On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> When a handshake request is cancelled it is removed from the
> handshake_net->hn_requests list, but it is still present in the
> handshake_rhashtbl until it is destroyed.
>
> If a second cancellation request arrives for the same handshake request,
> then remove_pending() will return false... and assuming
> HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> processing through the out_true label, where we put another reference on
> the sock and a refcount underflow occurs.
>
> This can happen for example if a handshake times out - particularly if
> the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> follow it up with the ClientHello due to a problem with tlshd.  When the
> timeout is hit on the server, the server will send a FIN, which triggers
> a cancellation request via xs_reset_transport().  When the timeout is
> hit on the client, another cancellation request happens via
> xs_tls_handshake_sync().
>
> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> Signed-off-by: Scott Mayhew <smayhew@...hat.com>
> ---
>  net/handshake/request.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/handshake/request.c b/net/handshake/request.c
> index 274d2c89b6b2..c7b20d167a55 100644
> --- a/net/handshake/request.c
> +++ b/net/handshake/request.c
> @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
>  		return false;
>  	}
> 
> +	/* Duplicate cancellation request */
> +	trace_handshake_cancel_none(net, req, sk);
> +	return false;
> +
>  out_true:
>  	trace_handshake_cancel(net, req, sk);
> 
> -- 
> 2.51.0

To help support engineers find this patch, I recommend using
"net/handshake: duplicate handshake cancellations leak socket" as
the short description.

The proposed solution might introduce a socket reference leak:

1. Request submitted: sock_hold() called (line 271)
2. Request accepted by daemon via handshake_req_next()
   (removes from pending list)
3. Cancel called:
   - remove_pending() returns FALSE (not in pending list)
   - test_and_set_bit() returns FALSE (sets the bit now)
   - With patch: returns FALSE, sock_put() NOT called
4. handshake_complete() called: bit already set, skips sock_put()

The reference taken in step 1 is then never released.
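
The four steps above can be walked through in a small userspace model.
This is only a sketch, not kernel code: struct model_req and every
model_* function are hypothetical stand-ins. An integer takes the place
of the socket refcount, and two booleans stand for pending-list
membership and HANDSHAKE_F_REQ_COMPLETED.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: all names here are hypothetical stand-ins. */
struct model_req {
	int  sk_refs;    /* stands in for the socket reference count */
	bool pending;    /* true while on the pending list */
	bool completed;  /* stands in for HANDSHAKE_F_REQ_COMPLETED */
};

/* Sets the flag and returns its old value, like test_and_set_bit(). */
static bool model_test_and_set_completed(struct model_req *req)
{
	bool old = req->completed;

	req->completed = true;
	return old;
}

/* Cancel path with the proposed patch applied. */
static bool model_cancel_patched(struct model_req *req)
{
	assert(req);
	if (req->pending) {		/* remove_pending() succeeded */
		req->pending = false;
		req->sk_refs--;		/* out_true: sock_put() */
		return true;
	}
	if (model_test_and_set_completed(req))
		return false;		/* request already completed */
	return false;			/* patch: duplicate cancel, no sock_put() */
}

/* Completion drops the reference only if the flag was still clear. */
static void model_complete(struct model_req *req)
{
	if (!model_test_and_set_completed(req))
		req->sk_refs--;
}

/* Walks steps 1-4 and returns the references still held afterwards. */
static int model_leak_scenario(void)
{
	struct model_req req = { .sk_refs = 0, .pending = false, .completed = false };

	req.sk_refs++;			/* step 1: submit, sock_hold() */
	req.pending = true;
	req.pending = false;		/* step 2: accepted by the daemon */
	model_cancel_patched(&req);	/* step 3: returns false, no put */
	model_complete(&req);		/* step 4: flag already set, no put */
	return req.sk_refs;
}
```

In this model, model_leak_scenario() returns 1: the reference taken at
submit time is still held after both cancel and completion have run.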

What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
pending cancel path so duplicate cancels can be detected?

Instead of:

        if (hn && remove_pending(hn, req)) {
                /* Request hadn't been accepted */
                goto out_true;
        }

go with this bit of untested code:

        if (hn && remove_pending(hn, req)) {
                /* Request hadn't been accepted - mark cancelled */
                if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
                        trace_handshake_cancel_busy(net, req, sk);
                        return false;
                }
                goto out_true;
        }
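
To sanity-check the idea without a kernel build, the same logic can be
modeled in userspace. Again this is only a sketch under stated
assumptions: struct sketch_req and the sketch_* functions are
hypothetical. It assumes the existing test_and_set_bit() check still
follows the pending branch, and that out_true drops one socket
reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: all names here are hypothetical stand-ins. */
struct sketch_req {
	int  sk_refs;    /* stands in for the socket reference count */
	bool pending;    /* true while on the pending list */
	bool completed;  /* stands in for HANDSHAKE_F_REQ_COMPLETED */
};

/* Sets the flag and returns its old value, like test_and_set_bit(). */
static bool sketch_test_and_set_completed(struct sketch_req *req)
{
	bool old = req->completed;

	req->completed = true;
	return old;
}

/* Cancel path with the flag also set in the pending branch. */
static bool sketch_cancel(struct sketch_req *req)
{
	assert(req);
	if (req->pending) {		/* remove_pending() succeeded */
		req->pending = false;
		if (sketch_test_and_set_completed(req))
			return false;	/* cancel_busy */
		goto out_true;
	}
	if (sketch_test_and_set_completed(req))
		return false;		/* completed or already cancelled */
out_true:
	req->sk_refs--;			/* sock_put() */
	return true;
}

/* Two cancels of a still-pending request; returns remaining references. */
static int sketch_double_cancel(void)
{
	struct sketch_req req = { .sk_refs = 1, .pending = true, .completed = false };
	bool first = sketch_cancel(&req);
	bool second = sketch_cancel(&req);

	return (first && !second) ? req.sk_refs : -1;
}

/* Accepted request, then cancel, then completion. */
static int sketch_accept_then_cancel(void)
{
	struct sketch_req req = { .sk_refs = 1, .pending = false, .completed = false };

	sketch_cancel(&req);		/* flag was clear: sets it, drops the ref */
	if (!sketch_test_and_set_completed(&req))
		req.sk_refs--;		/* completion: skipped, flag already set */
	return req.sk_refs;
}
```

Both helpers return 0 in this model: the second cancel is rejected
without touching the refcount, and the accepted-then-cancelled case
still drops its reference exactly once.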

-- 
Chuck Lever