Message-ID: <aTh4NGPQfWl-uurT@aion>
Date: Tue, 9 Dec 2025 14:27:48 -0500
From: Scott Mayhew <smayhew@...hat.com>
To: Chuck Lever <cel@...nel.org>
Cc: Chuck Lever <chuck.lever@...cle.com>,
kernel-tls-handshake@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [PATCH] net/handshake: a handshake can only be cancelled once
On Sat, 06 Dec 2025, Chuck Lever wrote:
>
>
> On Sat, Dec 6, 2025, at 9:30 AM, Scott Mayhew wrote:
> > When a handshake request is cancelled it is removed from the
> > handshake_net->hn_requests list, but it is still present in the
> > handshake_rhashtbl until it is destroyed.
> >
> > If a second cancellation request arrives for the same handshake request,
> > then remove_pending() will return false... and assuming
> > HANDSHAKE_F_REQ_COMPLETED isn't set in req->hr_flags, we'll continue
> > processing through the out_true label, where we put another reference on
> > the sock and a refcount underflow occurs.
> >
> > This can happen for example if a handshake times out - particularly if
> > the SUNRPC client sends the AUTH_TLS probe to the server but doesn't
> > follow it up with the ClientHello due to a problem with tlshd. When the
> > timeout is hit on the server, the server will send a FIN, which triggers
> > a cancellation request via xs_reset_transport(). When the timeout is
> > hit on the client, another cancellation request happens via
> > xs_tls_handshake_sync().
> >
> > Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
> > Signed-off-by: Scott Mayhew <smayhew@...hat.com>
> > ---
> > net/handshake/request.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/net/handshake/request.c b/net/handshake/request.c
> > index 274d2c89b6b2..c7b20d167a55 100644
> > --- a/net/handshake/request.c
> > +++ b/net/handshake/request.c
> > @@ -333,6 +333,10 @@ bool handshake_req_cancel(struct sock *sk)
> >  		return false;
> >  	}
> >
> > +	/* Duplicate cancellation request */
> > +	trace_handshake_cancel_none(net, req, sk);
> > +	return false;
> > +
> >  out_true:
> >  	trace_handshake_cancel(net, req, sk);
> >
> > --
> > 2.51.0
>
> To help support engineers find this patch, I recommend using
> "net/handshake: duplicate handshake cancellations leak socket" as
> the short description.
>
> The proposed solution might introduce a socket reference leak:
>
> 1. Request submitted: sock_hold() called (line 271)
> 2. Request accepted by daemon via handshake_req_next()
>    (removes from pending list)
> 3. Cancel called:
>    - remove_pending() returns FALSE (not in pending list)
>    - test_and_set_bit() returns FALSE (sets the bit now)
>    - With patch: returns FALSE, sock_put() NOT called
> 4. handshake_complete() called: bit already set, skips sock_put()
>
> What if we use test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED) in the
> pending cancel path so duplicate cancels can be detected?
>
> Instead of:
>
> 	if (hn && remove_pending(hn, req)) {
> 		/* Request hadn't been accepted */
> 		goto out_true;
> 	}
>
> go with this bit of untested code:
>
> 	if (hn && remove_pending(hn, req)) {
> 		/* Request hadn't been accepted - mark cancelled */
> 		if (test_and_set_bit(HANDSHAKE_F_REQ_COMPLETED, &req->hr_flags)) {
> 			trace_handshake_cancel_busy(net, req, sk);
> 			return false;
> 		}
> 		goto out_true;
> 	}
Thanks, Chuck. That works.
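
For anyone following the refcount reasoning, the three variants discussed
above can be compared with a small userspace model. This is only a sketch
and not kernel code: struct model_req, model_cancel(), the enum values and
the run_*() helpers are all hypothetical names made up for illustration.
It also assumes that handshake_complete() drops the socket reference only
when it is the one that sets HANDSHAKE_F_REQ_COMPLETED (step 4 above), and
that the socket starts with one reference held by its owner.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum cancel_variant {
	CANCEL_ORIGINAL,      /* handshake_req_cancel() before any patch */
	CANCEL_EARLY_RETURN,  /* the patch as posted: unconditional return false */
	CANCEL_MARK_PENDING,  /* test_and_set_bit() in the pending path */
};

struct model_req {
	int  sk_refs;    /* stands in for the socket refcount */
	bool pending;    /* request is on the hn_requests list */
	bool completed;  /* HANDSHAKE_F_REQ_COMPLETED */
};

static bool model_test_and_set(bool *bit)
{
	bool old = *bit;

	*bit = true;
	return old;
}

static bool model_remove_pending(struct model_req *req)
{
	bool was_pending = req->pending;

	req->pending = false;
	return was_pending;
}

static void model_submit(struct model_req *req)
{
	req->sk_refs = 1;       /* assumed: the socket owner's own reference */
	req->sk_refs++;         /* sock_hold() taken at submit time */
	req->pending = true;
	req->completed = false;
}

static bool model_cancel(struct model_req *req, enum cancel_variant v)
{
	if (model_remove_pending(req)) {
		/* Request hadn't been accepted */
		if (v == CANCEL_MARK_PENDING &&
		    model_test_and_set(&req->completed))
			return false;
		goto out_true;
	}
	if (model_test_and_set(&req->completed))
		return false;   /* already completed or cancelled */
	if (v == CANCEL_EARLY_RETURN)
		return false;   /* treated as a duplicate cancellation */
out_true:
	req->sk_refs--;         /* sock_put() */
	return true;
}

/* Assumed behaviour of handshake_complete(), per step 4 above. */
static void model_complete(struct model_req *req)
{
	if (!model_test_and_set(&req->completed))
		req->sk_refs--;
}

/* Submit, then cancel twice. Returns the final refcount; 1 is balanced. */
int run_duplicate_cancel(enum cancel_variant v)
{
	struct model_req req;

	model_submit(&req);
	model_cancel(&req, v);
	model_cancel(&req, v);
	return req.sk_refs;
}

/* Submit, accept, cancel, complete. Returns the final refcount. */
int run_cancel_after_accept(enum cancel_variant v)
{
	struct model_req req;

	model_submit(&req);
	model_remove_pending(&req);  /* handshake_req_next() accepts it */
	model_cancel(&req, v);
	model_complete(&req);
	return req.sk_refs;
}
```

In this model a final count of 1 means the references are balanced.
run_duplicate_cancel() ends at 0 for the unpatched code (one put too many)
and at 1 for both fixes. run_cancel_after_accept() ends at 2 for the patch
as posted (the leak described above) and at 1 for the other two variants.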
>
> --
> Chuck Lever
>