netdev - Re: Linux ECN Handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CACJspmL3SKhsAa9_2uEFNGADd0Q3e8S1hrgRAmvXjbnuevasyg@mail.gmail.com>
Date:   Mon, 6 Nov 2017 15:31:09 -0800
From:   Steve Ibanez <sibanez@...nford.edu>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     Neal Cardwell <ncardwell@...gle.com>,
        Netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>,
        Mohammad Alizadeh <alizadeh@...il.mit.edu>,
        Lawrence Brakmo <brakmo@...com>
Subject: Re: Linux ECN Handling

Hi Daniel,

Apologies for the delay. I tried out Neal's suggestion to printk the
cwnd and ss-thresh in the tcp_init_cwnd_reduction(),
tcp_cwnd_reduction(), and tcp_end_cwnd_reduction() functions in
tcp_input.c. From what I can tell, none of these functions are setting
the cwnd to 0.

Here is the kernel log with the cwnd and ss-thresh print statements:
https://drive.google.com/open?id=1LEWIkz64NuZN3yuDpBOAXbUfJfiju55O
And here is the corresponding packet trace at this end host:
https://drive.google.com/open?id=1qf4cSW3wzsiwPngcYpZY-AoBspuqONLH
(The kernel log buffer was not large enough to capture the full
3-second experiment, so there's only about a second of data for the
log buffer and about 3 seconds for the packet trace.)

Here are is a plot of the cwnd and ss-thresh from each of the three functions:
    - tcp_init_cwnd_reduction:
https://drive.google.com/open?id=1KOEXG2ISJQMi9c6KyPOQ6rpVUVsQwtWU
    - tcp_cwnd_reduction:
https://drive.google.com/open?id=1awoPWC3hi4CGZt7HyuI4aAaLG1LPLwJE
    - tcp_end_cwnd_reduction:
https://drive.google.com/open?id=1G7XUSnkX8tP7Z5XdY2O97OWj6jguQHO5

Here is a plot of the measured flow rates:
https://drive.google.com/open?id=1XwmGve10J4qa1nPE3LustK8NbvhZscac

The kernel log and packet trace data was collected on the 10.0.0.3
host. The cwnd and ss-thresh plots are from the final second or so of
the experiment and they show two timeout events. In the first event,
the 10.0.0.1 host times out allowing 10.0.0.3 to increase it's cwnd.
And in the second event, the 10.0.0.3 host times out causing the cwnd
to decrease from ~100 to about ~10. The cwnd samples from tcp_probe (
https://drive.google.com/open?id=1QCuPspLqbGoA68MKTaAh7rx2wCv3Cr_e )
indicate that the cwnd is 1 MSS just before the timeout event, but I
don't see that in the data collected from the tcp_*_cwnd_reduction
functions.

Here is a diff of the changes that I applied to the tcp_input.c file:
https://drive.google.com/open?id=1k5x3AkfTr3tJhohSIcmQp-3g2yTVNMWm

Are there other places in the code that you would suggest I check for
how the cwnd and ss-thresh are changing?

Thanks,
-Steve


On Mon, Nov 6, 2017 at 6:08 AM, Daniel Borkmann <daniel@...earbox.net> wrote:
> On 10/24/2017 03:11 AM, Neal Cardwell wrote:
>>
>> On Mon, Oct 23, 2017 at 6:15 PM, Steve Ibanez <sibanez@...nford.edu>
>> wrote:
>>>
>>> Hi All,
>>>
>>> I upgraded the kernel on all of our machines to Linux
>>> 4.13.8-041308-lowlatency. However, I'm still observing the same
>>> behavior where the source enters a timeout when the CWND=1MSS and it
>>> receives ECN marks.
>>>
>>> Here are the measured flow rates:
>>>
>>> <https://drive.google.com/file/d/0B-bt9QS-C3ONT0VXMUt6WHhKREE/view?usp=sharing>
>>>
>>> Here are snapshots of the packet traces at the sources when they both
>>> enter a timeout at t=1.6sec:
>>>
>>> 10.0.0.1 timeout event:
>>>
>>> <https://drive.google.com/file/d/0B-bt9QS-C3ONcl9WRnRPazg2ems/view?usp=sharing>
>>>
>>> 10.0.0.3 timeout event:
>>>
>>> <https://drive.google.com/file/d/0B-bt9QS-C3ONeDlxRjNXa0VzWm8/view?usp=sharing>
>>>
>>> Both still essentially follow the same sequence of events that I
>>> mentioned earlier:
>>> (1) receives an ACK for byte XYZ with the ECN flag set
>>> (2) stops sending for RTO_min=300ms
>>> (3) sends a retransmission for byte XYZ
>>>
>>> The cwnd samples reported by tcp_probe still indicate that the sources
>>> are reacting to the ECN marks more than once per window. Here are the
>>> cwnd samples at the same timeout event mentioned above:
>>>
>>> <https://drive.google.com/file/d/0B-bt9QS-C3ONdEZQdktpaW5JUm8/view?usp=sharing>
>>>
>>> Let me know if there is anything else you think I should try.
>>
>>
>> Sounds like perhaps cwnd is being set to 0 somewhere in this DCTCP
>> scenario. Would you be able to add printk statements in
>> tcp_init_cwnd_reduction(), tcp_cwnd_reduction(), and
>> tcp_end_cwnd_reduction(), printing the IP:port, tp->snd_cwnd, and
>> tp->snd_ssthresh?
>>
>> Based on the output you may be able to figure out where cwnd is being
>> set to zero. If not, could you please post the printk output and
>> tcpdump traces (.pcap, headers-only is fine) from your tests?
>
>
> Hi Steve, do you have any updates on your debugging?
>
>> thanks,
>> neal
>>
>