[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6f9c7fb9-4996-ff23-8ead-ef6d09d958e8@gmail.com>
Date: Fri, 9 Mar 2018 06:32:21 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>,
Yuchung Cheng <ycheng@...gle.com>
Cc: Neal Cardwell <ncardwell@...gle.com>,
Netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH net 4/5] tcp: prevent bogus undos when SACK is not enabled
On 03/09/2018 06:11 AM, Ilpo Järvinen wrote:
> On Wed, 7 Mar 2018, Yuchung Cheng wrote:
>
>> On Wed, Mar 7, 2018 at 12:19 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
>>>
>>> On Wed, Mar 7, 2018 at 7:59 AM, Ilpo Järvinen <ilpo.jarvinen@...sinki.fi> wrote:
>>>> A bogus undo may/will trigger when the loss recovery state is
>>>> kept until snd_una is above high_seq. If tcp_any_retrans_done
>>>> is zero, retrans_stamp is cleared in this transient state. On
>>>> the next ACK, tcp_try_undo_recovery again executes and
>>>> tcp_may_undo will always return true because tcp_packet_delayed
>>>> has this condition:
>>>> return !tp->retrans_stamp || ...
>>>>
>>>> Check for the false fast retransmit transient condition in
>>>> tcp_packet_delayed to avoid bogus undos. Since snd_una may have
>>>> advanced on this ACK but CA state still remains unchanged,
>>>> prior_snd_una needs to be passed instead of tp->snd_una.
>>>
>>> This one also seems like a case where it would be nice to have a
>>> specific packet-by-packet example, or trace, or packetdrill scenario.
>>> Something that we might be able to translate into a test, or at least
>>> to document the issue more explicitly.
>>
>> I am hesitate for further logic to make undo "perfect" on non-sack
>> cases b/c undo is very complicated and SACK is extremely
>> well-supported today. so a trace to demonstrate how severe this issue
>> is appreciated.
>
> This is not just some remote corner cases to which I perhaps would
> understand your "making undo perfect" comment. Those undos result in
> a burst that, at worst, triggers additional buffer overflow because the
> correct CC action is cancelled. Unfortunately I don't have now permission
> to publish the time-seq graph about it but I've tried to improve the
> changelog messages so that you can better understand under which
> conditions the problem occurs.
>
> SACK case remains the same even after this change. I did rework the
> logic a bit though (pass prior_snd_una rather than flag around) to
> make it more obvious but the change is unfortunately lengthy (no matter
> what I pass through the call-chain).
>
>
Can you provide packetdrill test(s) then if you can not provide traces ?
If we can not have tests, we will likely have future regressions, since
most developments are using SACK in mind.
Fact that these bugs have been unnoticed for years is concerning.
We have the goal of updating packetdrill and publish soon our regression
suite (about 1500 TCP tests)
CC Willem who is working on that.
Powered by blists - more mailing lists