Message-ID: <CACJspmLaxdHoa63jCuD-mKJS35BZ69b9qw3tEZjFxbUNb3PSHg@mail.gmail.com>
Date: Tue, 5 Dec 2017 11:36:44 -0800
From: Steve Ibanez <sibanez@...nford.edu>
To: Neal Cardwell <ncardwell@...gle.com>
Cc: Eric Dumazet <edumazet@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Daniel Borkmann <daniel@...earbox.net>,
Netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>,
Mohammad Alizadeh <alizadeh@...il.mit.edu>,
Lawrence Brakmo <brakmo@...com>
Subject: Re: Linux ECN Handling
Hi Neal,
I've included a link to a small trace of 13 packets. It is a different
capture from the screenshot I attached in my last email, but it shows
the same sequence of events. The tcptrace plot is a bit hard to read
because of the 300ms timeout, so I figured sharing the raw pcap was the
best approach.
slice.pcap: https://drive.google.com/open?id=1hYXbUClHGbQv1hWG1HZWDO2WYf30N6G8
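For anyone scanning a longer capture for the same pattern, here is a minimal sketch of the three-step sequence described in the quoted message below (a CWR data packet that goes unACKed, a shorter retransmission at the same seqNo, then an ACK that covers the original packet). It works on already-parsed packet records rather than on the pcap itself. The `Pkt` record, its field names, and `find_unacked_cwr` are hypothetical names made up for illustration, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pkt:
    """Hypothetical, simplified view of one TCP packet in time order."""
    from_src: bool      # True: data sender -> receiver, False: receiver -> sender
    seq: int = 0        # sequence number (data packets)
    length: int = 0     # payload length in bytes
    cwr: bool = False   # CWR flag set
    ack: int = 0        # acknowledgment number (packets from the receiver)


def find_unacked_cwr(pkts: List[Pkt]) -> List[Tuple[int, int]]:
    """Return (seqNo, length) of CWR data packets that were only ACKed
    after a shorter retransmission starting at the same seqNo.

    Assumption: sequence numbers do not wrap within the trace.
    """
    events: List[Tuple[int, int]] = []
    pending: Optional[Tuple[int, int]] = None  # latest CWR data packet not yet ACKed
    retransmitted = False

    for p in pkts:
        if p.from_src and p.length > 0:
            if pending and p.seq == pending[0] and p.length < pending[1]:
                # Step 2: shorter retransmission (Z < X) at the same seqNo Y.
                retransmitted = True
            elif p.cwr:
                # Step 1: remember the CWR data packet (seqNo Y, length X).
                pending = (p.seq, p.length)
                retransmitted = False
        elif not p.from_src and pending:
            end = pending[0] + pending[1]
            if p.ack >= end:
                # Step 3: an ACK with AckNo = Y + X following a retransmission.
                if retransmitted and p.ack == end:
                    events.append(pending)
                pending = None
                retransmitted = False
    return events


# Made-up sample values, not taken from slice.pcap.
sample = [
    Pkt(from_src=True, seq=1000, length=3000, cwr=True),
    Pkt(from_src=True, seq=1000, length=1448),
    Pkt(from_src=False, ack=4000),
]
print(find_unacked_cwr(sample))
```

A CWR packet that is ACKed without an intervening retransmission is not reported, so the normal case mentioned in the quoted message produces an empty list.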
Thanks for the help!
-Steve
On Tue, Dec 5, 2017 at 7:23 AM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Tue, Dec 5, 2017 at 12:22 AM, Steve Ibanez <sibanez@...nford.edu> wrote:
>> Hi Neal,
>>
>> Happy to help out :) And thanks for the tip!
>>
>> I was able to track down where the missing bytes that you pointed out
>> are being lost. It turns out the destination host seems to be
>> misbehaving. I performed a packet capture at the destination host
>> interface (a snapshot of the trace is attached). I see the following
>> sequence of events when a timeout occurs (note that I have NIC
>> offloading enabled so wireshark captures packets larger than the MTU):
>>
>> 1. The destination receives a data packet of length X with seqNo = Y
>> from the src with the CWR bit set and does not send back a
>> corresponding ACK.
>> 2. The source times out and sends a retransmission packet of length Z
>> (where Z < X) with seqNo = Y
>> 3. The destination sends back an ACK with AckNo = Y + X
>>
>> So in other words, the packet which the destination host does not
>> initially ACK (causing the timeout) does not actually get lost because
>> after receiving the retransmission the AckNo moves forward all the way
>> past the bytes in the initial unACKed CWR packet. In the attached
>> screenshot, I've marked the unACKed CWR packet with a red box.
>>
>> Have you seen this behavior before? And do you know what might be
>> causing the destination host not to ACK the CWR packet? In most cases
>> the CWR-marked packets are ACKed properly; only occasionally are they
>> not.
>
> Thanks for the detailed report!
>
> I have not heard of an incoming CWR causing the receiver to fail to
> ACK. And in re-reading the code, I don't see an obvious way in which a
> CWR bit should cause the receiver to fail to ACK.
>
> That screen shot is a bit hard to parse. Would you be able to post a
> tcpdump .pcap of that particular section, or post a screen shot of a
> time-sequence plot of that section?
>
> To extract that segment and take a screenshot, you could use something like:
>
> editcap -A "2017-12-04 11:22:27" -B "2017-12-04 11:22:30" all.pcap slice.pcap
> tcptrace -S -xy -zy slice.pcap
> xplot.org a2b_tsg.xpl &
> # take screenshot
>
> Or, alternatively, would you be able to post the slice.pcap on a web
> server or public drive?
>
> thanks,
> neal