Message-ID: <CADVnQym9JpC+vDVPVjP0ibhPQu3NhxsynRYA--FNzAgKJUJQSg@mail.gmail.com>
Date: Fri, 28 Jul 2017 18:54:41 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Willy Tarreau <w@....eu>
Cc: Eric Dumazet <eric.dumazet@...il.com>, Klavs Klavsen <kl@...n.dk>,
Netdev <netdev@...r.kernel.org>,
Yuchung Cheng <ycheng@...gle.com>,
Nandita Dukkipati <nanditad@...gle.com>
Subject: Re: TCP fast retransmit issues
On Wed, Jul 26, 2017 at 3:02 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Wed, Jul 26, 2017 at 2:38 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
>> Yeah, it looks like I can reproduce this issue with (1) bad sacks
>> causing repeated TLPs, and (2) TLPs timers being pushed out to later
>> times due to incoming data. Scripts are attached.
>
> I'm testing a fix of only scheduling a TLP if (flag & FLAG_DATA_ACKED)
> is true...
An update for the TLP aspect of this thread: our team has a proposed
fix for this RTO/TLP reschedule issue that we have reviewed internally
and tested with our packetdrill test suite, including some new tests.
The basic approach in the fix is as follows:
a) only reschedule the xmit timer once per ACK
b) only reschedule the xmit timer if tcp_clean_rtx_queue() deems it
safe (a packet was cumulatively ACKed, or we got a SACK for a
packet that was sent before the most recent retransmit of the write
queue head).
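
To make (a) and (b) concrete, here is a rough user-space sketch of the
decision logic. It is an illustration only, not the patch: every name
in it (struct ack_event, struct sock_model, the SK_FLAG_* bits and the
*_model() functions) is hypothetical, and the timestamps are plain
integers in arbitrary units. The idea is that the queue-cleaning step
only reports whether re-arming is safe, and the caller re-arms at most
once per ACK based on that single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the kernel's definitions. */
#define SK_FLAG_DATA_ACKED     0x1u /* ACK cumulatively acknowledged a packet */
#define SK_FLAG_SET_XMIT_TIMER 0x2u /* re-arming the xmit timer is safe */

/* What one incoming ACK tells us (simplified, hypothetical model). */
struct ack_event {
	bool cum_acked;                 /* a packet was cumulatively ACKed */
	bool has_sack;                  /* the ACK newly SACKed a packet */
	uint64_t sacked_pkt_sent_time;  /* when that SACKed packet was sent */
	uint64_t head_last_rexmit_time; /* latest retransmit of the queue head */
};

/* Minimal stand-in for a socket: just counts timer re-arms. */
struct sock_model {
	unsigned int timer_rearm_count;
};

/* Rule (b): decide whether this ACK makes a timer reschedule safe. */
static unsigned int clean_rtx_queue_model(const struct ack_event *ev)
{
	unsigned int flag = 0;

	assert(ev != NULL);

	if (ev->cum_acked)
		flag |= SK_FLAG_DATA_ACKED | SK_FLAG_SET_XMIT_TIMER;

	/* A SACK only counts if the SACKed packet predates the most
	 * recent retransmit of the write queue head.
	 */
	if (ev->has_sack &&
	    ev->sacked_pkt_sent_time < ev->head_last_rexmit_time)
		flag |= SK_FLAG_SET_XMIT_TIMER;

	return flag;
}

/* Rule (a): a single decision point, so at most one re-arm per ACK. */
static unsigned int process_ack_model(struct sock_model *sk,
				      const struct ack_event *ev)
{
	unsigned int flag = clean_rtx_queue_model(ev);

	if (flag & SK_FLAG_SET_XMIT_TIMER)
		sk->timer_rearm_count++; /* stands in for re-arming RTO/TLP */

	return flag;
}
```

In this model an ACK that carries only a SACK for a packet sent after
the head's most recent retransmit leaves the timer alone, so a stream
of such ACKs cannot keep pushing the timer out to later times.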
After further review and testing we will post it. Hopefully next week.
thanks,
neal