Date:   Thu, 31 Mar 2022 11:41:03 -0400
From:   Neal Cardwell <ncardwell@...gle.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Jaco Kroon <jaco@....co.za>, LKML <linux-kernel@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections

On Wed, Mar 30, 2022 at 9:04 AM Jaco Kroon <jaco@....co.za> wrote:
...
> When you state sane/normal, do you mean there is fault with the other
> frames that could not be explained by packet loss in one or both of the
> directions?

Yes.

(1) If you look at the attached trace time/sequence plots (from
tcptrace and xplot.org) there are several behaviors that do not look
like normal congestive packet loss:

  (a) Literally *all* original transmissions (white segments in the
plot) of packets after client sequence 66263 appear lost (are not
ACKed). Congestion generally does not behave like that. But broken
firewalls/middleboxes do.
       (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png )

  (b) When the client is retransmitting packets, only packets at
exactly snd_una are ACKed. The packets beyond that point are always
un-ACKed. Again, this sounds like a broken firewall/middlebox.
       (See netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png )

  (c) After the client receives the server's "ack 73403", the client
ignores/drops all other incoming packets that show up in the trace.

       As Eric notes, this doesn't look like a PAWS issue. And it
doesn't look like a checksum or sequence/ACK validation issue. The
client starts ignoring ACKs at a point between two ACKs that both have
correct checksums, valid ACK numbers, and identical sequence numbers
and TS val and ecr values (absolute sequence/ACK numbers shown
here):

    (i) The client processes this ACK and uses it to advance snd_una:
    17:46:49.889911 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
(6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x7005 (correct)
2699968514:2699968514(0) ack 3451415932 win 830 <nop,nop,TS val
1206546583 ecr 331191428>

    (ii) The client ignores this ACK and all later ACKs:
    17:46:49.889912 IP6 (flowlabel 0x97427, hlim 61, next-header TCP
(6) payload length: 32) 2a00:1450:4013:c16::1a.25 >
2c0f:f720:0:3:d6ae:52ff:feb8:f27b.48590: . cksum 0x6a66 (correct)
2699968514:2699968514(0) ack 3451417360 win 841 <nop,nop,TS val
1206546583 ecr 331191428>
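
To make the comparison between (i) and (ii) concrete, here is a
minimal Python sketch using only the field values from the two tcpdump
lines above. The helper names (s32, paws_ok) are made up for
illustration, and paws_ok is a simplified version of the PAWS
timestamp comparison (it ignores the idle-connection exception), not
the kernel's actual code:

```python
MOD = 1 << 32

def s32(x):
    """Interpret x modulo 2**32 as a signed 32-bit value (wraparound-safe diff)."""
    x %= MOD
    return x - MOD if x >= (1 << 31) else x

def paws_ok(ts_recent, tsval):
    """Simplified PAWS test: accept unless tsval is older than ts_recent."""
    return s32(ts_recent - tsval) <= 0

# Field values copied from the two ACKs in the trace above.
ack_i = dict(seq=2699968514, ack=3451415932, win=830,
             tsval=1206546583, tsecr=331191428)
ack_ii = dict(seq=2699968514, ack=3451417360, win=841,
              tsval=1206546583, tsecr=331191428)

# Bytes newly acknowledged by (ii) relative to (i).
ack_advance = s32(ack_ii["ack"] - ack_i["ack"])

# Both ACKs carry the same sequence number and the same timestamp echo.
same_seq = ack_i["seq"] == ack_ii["seq"]
same_tsecr = ack_i["tsecr"] == ack_ii["tsecr"]

# If (i) set ts_recent, (ii) carries the same TS val, so this simplified
# PAWS comparison would not reject it.
paws_pass = paws_ok(ts_recent=ack_i["tsval"], tsval=ack_ii["tsval"])

print(ack_advance, same_seq, same_tsecr, paws_pass)
```

Under these assumptions (ii) acknowledges 1428 bytes beyond (i) while
carrying the same sequence number, TS val and ecr, which is why
neither PAWS nor sequence validation looks like a plausible reason for
dropping it. Whether (ii) is also at or below the client's snd_nxt
cannot be checked from these two lines alone.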

neal

Download attachment "netdev-2022-03-29-tcp-disregarded-acks-zoomed-out.png" of type "image/png" (131216 bytes)

Download attachment "netdev-2022-03-29-tcp-disregarded-acks-zoomed-in.png" of type "image/png" (128102 bytes)
