netdev - TCP stall issue


Message-Id: <35A4DDAA-7E8D-43CB-A1F5-D1E46A4ED42E@gmail.com>
Date:   Tue, 23 Feb 2021 11:09:19 +0100
From:   Gil Pedersen <kanongil@...il.com>
To:     davem@...emloft.net, yoshfuji@...ux-ipv6.org, dsahern@...nel.org
Cc:     netdev@...r.kernel.org
Subject: TCP stall issue

Hi,

I am investigating a TCP stall that can occur when sending to an Android device (kernel 4.9.148) from an Ubuntu server running kernel 5.11.0.

The issue seems to be that RACK is not applied when a D-SACK (with SACK) is received on the server after an RTO re-transmission (CA_Loss state). Here the re-transmitted segment is considered to be already delivered, and the loss undo logic is applied. Nothing is then re-transmitted until the next RTO, at which point the next segment is sent and the same thing happens again. This causes the re-transmitted segments to be delivered at a rate of ~1 per second, so a burst loss of e.g. 20 segments causes a 20+ second stall. I would expect RACK to kick in long before this happens.

Note the D-SACK should not be considered spurious, as the TSecr value matches the re-transmission TSval.
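
To make the timestamp argument concrete, here is a minimal sketch of an Eifel-style check (in the spirit of RFC 3522) written in Python. It is a simplified illustration and not the kernel's code; the names ts_before and ack_is_for_original are made up for this example. The idea is that an ACK only points to a spurious re-transmission if its TSecr is older than the TSval of the first re-transmission, using 32-bit wraparound comparison.

```python
# Simplified illustration only; function names are hypothetical.

def ts_before(a: int, b: int) -> bool:
    """True if 32-bit timestamp a is earlier than b, allowing for wraparound."""
    return ((a - b) & 0xFFFFFFFF) > 0x7FFFFFFF


def ack_is_for_original(tsecr: int, retrans_tsval: int) -> bool:
    """True if the ACK echoes a timestamp older than the re-transmission.

    Only in that case does the ACK suggest the original transmission was
    delivered and the re-transmission was spurious. A TSecr of 0 is treated
    as "no echo available", which is an assumption of this sketch.
    """
    return tsecr != 0 and ts_before(tsecr, retrans_tsval)
```

With TSecr equal to the re-transmission TSval, ack_is_for_original() returns False, which is the case I am seeing: the D-SACK was triggered by the re-transmission itself.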

Also, the Android receiver is definitely sending strange D-SACKs that do not properly advance the ACK number to include received segments. However, I can't control it and need to fix it on the server by quickly re-transmitting the segments. The connection itself is functional. If the client makes a request to the server in this state, it can respond and the client will receive any segments sent in reply.

I can see from counters that TcpExtTCPLossUndo and TcpExtTCPSackFailures are incremented on the server when this happens.
The issue appears with F-RTO both enabled and disabled, and with both BBR and Reno.
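
For anyone who wants to watch the same counters, the following is a rough Python sketch that reads them from /proc/net/netstat. It assumes the usual layout of that file (a line of field names followed by a line of values, both starting with the same "TcpExt:" style prefix) and builds names the way nstat prints them. The helper names parse_netstat, read_counters and counter_delta are only for illustration.

```python
# Sketch only; helper names are hypothetical. Assumes /proc/net/netstat
# consists of header/value line pairs sharing a "Prefix:" label.

def parse_netstat(text: str) -> dict:
    """Parse netstat-style text into {"TcpExtTCPLossUndo": 3, ...}."""
    counters = {}
    lines = text.strip().splitlines()
    # Lines come in pairs: field names first, then the matching values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            continue  # unexpected layout, skip this pair
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + name] = int(number)
    return counters


def read_counters(path: str = "/proc/net/netstat") -> dict:
    """Read and parse the counters from the given file."""
    with open(path) as f:
        return parse_netstat(f.read())


def counter_delta(before: dict, after: dict, names: list) -> dict:
    """Return how much each named counter grew between two snapshots."""
    return {n: after.get(n, 0) - before.get(n, 0) for n in names}
```

Taking one read_counters() snapshot before reproducing the stall and one after, then passing both to counter_delta() with ["TcpExtTCPLossUndo", "TcpExtTCPSackFailures"], shows how often each event fired during the stall.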

Any idea of why this happens, or suggestions on how to debug the issue further?

/Gil