netdev - Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections

Message-ID: <CANn89iKHbmVYoBdo2pCQWTzB4eFBjqAMdFbqL5EKSFqgg3uAJQ@mail.gmail.com>
Date:   Tue, 29 Mar 2022 20:48:02 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Jaco Kroon <jaco@....co.za>
Cc:     Neal Cardwell <ncardwell@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections

On Tue, Mar 29, 2022 at 7:58 PM Jaco Kroon <jaco@....co.za> wrote:
>
> Hi Neal,
>
> > Thanks for the report!  I have CC-ed the netdev list, since it is
> > probably a better forum for this discussion.
> Awesome thank you.
> >
> > Can you please attach (or link to) a tcpdump raw .pcap file  (produced
> > with the -w flag)? There are a number of tools that will make this
> > easier to visualize and analyze if we can see the raw .pcap file. You
> > may want to anonymize the trace and/or capture just headers, etc (for
> > example, the -s flag can control how much of each packet tcpdump
> > grabs).
>
> Attached.
>
> The traffic itself should be mostly encrypted but stripped with -s100
> anyway.  At this point SACK was still on.
>
> I don't know how, or why, but this relates to TFO.  After sending the
> report I followed a hunch and compared the exim logs of a successful
> delivery with those of an unsuccessful one; the only difference was
> that the non-working one stated:
>
> TFO mode sendto, no data: EINPROGRESS
>
> and then specifically:
>
> TCP_FASTOPEN tcpi_unacked 2
>
> The working connections never had the latter line in the output.
>
> The moment I set sysctl -w net.ipv4.tcp_fastopen=0 (default is 1) I
> managed to flood out about 1200 emails to google in no more than
> 15 minutes.
>
> In the kernel sources:  git log v5.8..v5.17 net/
>
> And searching for TFO gives only a limited number of commits that
> could have broken this; just from the changelogs I'm not sure whether
> any of them are relevant.  I'm guessing the issue possibly relates to
> congestion control, so this is probably the most relevant:
>
> commit be5d1b61a2ad28c7e57fe8bfa277373e8ecffcdc
> Author: Nguyen Dinh Phi <phind.uet@...il.com>
> Date:   Tue Jul 6 07:19:12 2021 +0800
>
>     tcp: fix tcp_init_transfer() to not reset icsk_ca_initialized
>
> Just looking at the diff, it removes an icsk->icsk_ca_initialized = 0;
> the only other place this gets set to 0 is in tcp_disconnect(), and it
> is set to 1 in tcp_init_congestion_control(), so I think we might have
> an uninitialized variable here.  Then again, tcp_init_sock() mentions
> explicitly that sk_alloc() sets lots of stuff to 0.  It still bugs me
> that the original commit (8919a9b31eb4) felt the need to set an
> explicit 0 in tcp_init_transfer().

I do not think this commit is related to the issue you have.

I guess you could try a revert?

Then, if you think older Linux versions were OK, start a bisection?

Thank you.

(I do not see why a successful TFO would lead to a freeze after ~70 KB
of data has been sent)
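
For anyone reproducing this, the value passed to net.ipv4.tcp_fastopen in
the quoted text is a bitmask, so "1" enables only client-side TFO and "0"
disables it entirely. Below is a minimal sketch that decodes the value,
assuming the bit meanings given in the kernel's ip-sysctl documentation.
The helper names (decode_tfo, read_tfo) are hypothetical and not part of
any existing tool.

```python
from pathlib import Path

# Bit meanings of net.ipv4.tcp_fastopen, per the kernel's ip-sysctl docs.
TFO_FLAGS = {
    0x1: "client: send data in the opening SYN",
    0x2: "server: accept data in the opening SYN",
    0x4: "client: send SYN data even without a cookie",
    0x200: "server: accept SYN data without a cookie",
    0x400: "server: enable TFO on all listeners by default",
}


def decode_tfo(value):
    """Return descriptions of the TFO bits set in `value`."""
    return [desc for bit, desc in TFO_FLAGS.items() if value & bit]


def read_tfo(path="/proc/sys/net/ipv4/tcp_fastopen"):
    """Read the current sysctl value, or None if the file is absent."""
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None


if __name__ == "__main__":
    current = read_tfo()
    if current is None:
        print("tcp_fastopen sysctl not available on this system")
    else:
        print(f"net.ipv4.tcp_fastopen = {current}")
        for line in decode_tfo(current) or ["TFO disabled"]:
            print("  " + line)
```

An empty list from decode_tfo() corresponds to the workaround described in
the quoted text, where TFO is switched off.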

>
> >
> > Can you please share the exact kernel version of the client machine?
> Our side (the client, i.e. the side that initiates the TCP connection)
> is 5.17.1; I obviously can't speak for the Google side (server).
> > Also, can you please summarize/clarify whether you think the client,
> > server, or both are misbehaving?
>
> The client is retransmitting frames for which it has already received
> an ACK from the server.  In the pcap, retransmits start from frame 105
> onwards, and the first "spurious retransmission" (as Wireshark labels
> it) appears from frame 122 onwards.
>
> Kind Regards,
> Jaco
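
The symptom described above (the client resending data the server has
already acknowledged) can be checked mechanically once the relevant fields
are extracted from a capture. The following is a rough sketch, not the
heuristic Wireshark itself uses: it assumes relative sequence numbers,
ignores sequence wraparound and SACK, and the Segment type and the sample
trace are made up for illustration rather than taken from the attached pcap.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    frame: int         # frame number in the capture
    from_client: bool  # True if sent by the connection initiator
    seq: int           # relative sequence number of the first payload byte
    length: int        # payload length in bytes
    ack: int           # cumulative ACK number carried by this segment


def find_spurious_retransmits(segments):
    """Return frame numbers of client data segments whose whole payload
    was already covered by an earlier cumulative ACK from the server."""
    highest_ack = 0  # highest cumulative ACK seen from the server so far
    spurious = []
    for s in segments:
        if s.from_client:
            if s.length > 0 and s.seq + s.length <= highest_ack:
                spurious.append(s.frame)
        else:
            highest_ack = max(highest_ack, s.ack)
    return spurious


if __name__ == "__main__":
    # Synthetic trace: frame 3 resends bytes 1..100 after they were ACKed.
    trace = [
        Segment(1, True, 1, 100, 1),
        Segment(2, False, 1, 0, 101),
        Segment(3, True, 1, 100, 1),
        Segment(4, True, 101, 100, 1),
    ]
    print(find_spurious_retransmits(trace))
```

A segment is flagged only when all of its bytes lie below the highest ACK
seen so far, so partially acknowledged retransmissions are deliberately not
reported.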