Message-ID: <27F02B22-1673-4833-B83E-D2BA5E793004@redhat.com>
Date: Tue, 07 Jan 2025 07:28:46 -0500
From: Benjamin Coddington <bcodding@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Boris Pismenny <borisp@...dia.com>,
 John Fastabend <john.fastabend@...il.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
 netdev@...r.kernel.org, linux-nfs@...r.kernel.org,
 Vakul Garg <vakul.garg@....com>
Subject: Re: [PATCH] tls: Fix tls_sw_sendmsg error handling

On 6 Jan 2025, at 21:36, Jakub Kicinski wrote:

> On Sat,  4 Jan 2025 10:29:45 -0500 Benjamin Coddington wrote:
>> We've noticed that NFS can hang when using RPC over TLS on an unstable
>> connection, and investigation shows that the RPC layer is stuck in a tight
>> loop attempting to transmit, but forever getting -EBADMSG back from the
>> underlying network.  The loop begins when tcp_sendmsg_locked() returns
>> -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
>> calling the socket's error reporting handler.
>>
>> Instead of converting errors from tcp_sendmsg_locked(), let's pass them
>> along in this path.  The RPC layer handles -EPIPE by reconnecting the
>> transport, which prevents the endless attempts to transmit on a broken
>> connection.
>
> LGTM, only question in my mind is whether we should send this to stable.
> Any preference?

Yes, I think it can go, though not a strong preference.  This code well
predates RPC over TLS, which landed in v6.5.  I haven't investigated other
users - they may not have the same problem, since RPC over TLS has very
precise error handling, so perhaps it makes sense to show the Fixes tag but
limit the stable backport to kernels that have RPC over TLS.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Cc: <stable@...r.kernel.org> # 6.5.x

Thanks for the look, Jakub.
Ben
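
For readers following the thread, the failure mode described in the quoted
patch description can be modelled in a few lines of userspace C. This is only
a sketch under stated assumptions, not the kernel code: every name here
(model_xprt, model_tcp_send, model_tls_send, model_rpc_transmit) is
hypothetical, and the max_attempts bound exists only so the model terminates,
whereas the loop described above does not.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transport whose connection may have broken. */
struct model_xprt {
	bool connected;   /* false while the connection is broken */
	int reconnects;   /* number of reconnects the caller performed */
};

/* Models the lower layer: sending on a broken connection gives -EPIPE. */
static int model_tcp_send(const struct model_xprt *x)
{
	return x->connected ? 0 : -EPIPE;
}

/*
 * Models the TLS layer. With convert_errors set, a lower-layer failure is
 * reported as -EBADMSG (the behaviour the patch description complains
 * about); otherwise the original error is passed along unchanged.
 */
static int model_tls_send(const struct model_xprt *x, bool convert_errors)
{
	int rc = model_tcp_send(x);

	if (rc < 0 && convert_errors)
		return -EBADMSG;
	return rc;
}

/*
 * Models the caller: it reconnects on -EPIPE and simply retries on any
 * other error. Returns the attempt number that succeeded, or -1 if all
 * max_attempts attempts failed.
 */
static int model_rpc_transmit(struct model_xprt *x, bool convert_errors,
			      int max_attempts)
{
	assert(max_attempts > 0);

	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		int rc = model_tls_send(x, convert_errors);

		if (rc == 0)
			return attempt;
		if (rc == -EPIPE) {
			/* Reconnecting repairs the transport in this model. */
			x->connected = true;
			x->reconnects++;
		}
		/* Any other error: retry without reconnecting. */
	}
	return -1;
}
```

In this model, calling model_rpc_transmit() on a disconnected transport with
convert_errors set to false reconnects once and succeeds on the second
attempt, while setting it to true exhausts every attempt without ever
reconnecting.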