netdev - Re: TCP fast retransmit issues

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CADVnQykQszMkHq+KvebHcWinc319H=NMjnP+bX5miFpN8tPPzw@mail.gmail.com>
Date:   Mon, 31 Jul 2017 23:17:01 -0400
From:   Neal Cardwell <ncardwell@...gle.com>
To:     Willy Tarreau <w@....eu>
Cc:     Eric Dumazet <eric.dumazet@...il.com>, Klavs Klavsen <kl@...n.dk>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>,
        Nandita Dukkipati <nanditad@...gle.com>
Subject: Re: TCP fast retransmit issues

On Fri, Jul 28, 2017 at 6:54 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Wed, Jul 26, 2017 at 3:02 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
>> On Wed, Jul 26, 2017 at 2:38 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
>>> Yeah, it looks like I can reproduce this issue with (1) bad sacks
>>> causing repeated TLPs, and (2) TLPs timers being pushed out to later
>>> times due to incoming data. Scripts are attached.
>>
>> I'm testing a fix of only scheduling a TLP if (flag & FLAG_DATA_ACKED)
>> is true...
>
> An update for the TLP aspect of this thread: our team has a proposed
> fix for this RTO/TLP reschedule issue that we have reviewed internally
> and tested with our packetdrill test suite, including some new tests.
> The basic approach in the fix is as follows:
>
> a) only reschedule the xmit timer once per ACK
>
> b) only reschedule the xmit timer if tcp_clean_rtx_queue() deems this
> is safe (a packet was cumulatively ACKed, or we got a SACK for a
> packet that was sent before the most recent retransmit of the write
> queue head).
>
> After further review and testing we will post it. Hopefully next week.

The timer patches are upstream for review for the "net" branch:

  https://patchwork.ozlabs.org/patch/796057/
  https://patchwork.ozlabs.org/patch/796058/
  https://patchwork.ozlabs.org/patch/796059/

Again, thank you for reporting this, and thanks for the packet trace!

neal