Message-ID: <05E336BF-BAF7-432D-85B5-4B06CD02D34C@amazon.com>
Date:   Mon, 7 Dec 2020 16:34:57 +0000
From:   "Mohamed Abuelfotoh, Hazem" <abuehaze@...zon.com>
To:     Eric Dumazet <edumazet@...gle.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "ycheng@...gle.com" <ycheng@...gle.com>,
        "ncardwell@...gle.com" <ncardwell@...gle.com>,
        "weiwan@...gle.com" <weiwan@...gle.com>,
        "Strohman, Andy" <astroh@...zon.com>,
        "Herrenschmidt, Benjamin" <benh@...zon.com>
Subject: Re: [PATCH net-next] tcp: optimise receiver buffer autotuning initialisation
 for high latency connections

>100ms RTT

>Which exact version of linux kernel are you using ?
On the receiver side I can reproduce the issue with any mainline kernel version >= 4.19.86, which is the first kernel version that includes patches [1] and [2].
On the sender I am using kernel 5.4.0-rc6.
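
For context on why the RTT values in this thread matter: the amount of data that must be in flight to keep a path full (the bandwidth-delay product) grows linearly with RTT, so the receive window the receiver needs to advertise grows with it. The following is only a rough sketch of that arithmetic. The 1 Gbit/s rate is an assumed example value (no link rate is stated in this thread), and `bdp_bytes` is a hypothetical helper name used for illustration.

```python
def bdp_bytes(rate_bits_per_s: int, rtt_us: int) -> int:
    """Bandwidth-delay product in bytes.

    rate_bits_per_s: path rate in bits per second (assumed value, see below).
    rtt_us: round-trip time in microseconds (integer, to avoid float rounding).
    """
    return rate_bits_per_s * rtt_us // (8 * 1_000_000)


# Assumption: 1 Gbit/s is an illustrative rate, not a figure from this thread.
ASSUMED_RATE_BPS = 1_000_000_000

# RTTs mentioned in this thread: 16.7 ms, 100 ms and 162 ms.
for rtt_us in (16_700, 100_000, 162_000):
    window = bdp_bytes(ASSUMED_RATE_BPS, rtt_us)
    print(f"RTT {rtt_us / 1000:.1f} ms -> about {window} bytes in flight")
```

Under that assumed rate, the 162 ms path needs roughly ten times the in-flight data of the 16.7 ms path, which is the gap receive buffer autotuning has to close.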

Links:

[1] https://lore.kernel.org/patchwork/patch/1157936/
[2] https://lore.kernel.org/patchwork/patch/1157883/

Thank you.

Hazem



On 07/12/2020, 16:24, "Eric Dumazet" <edumazet@...gle.com> wrote:


    On Mon, Dec 7, 2020 at 5:09 PM Mohamed Abuelfotoh, Hazem
    <abuehaze@...zon.com> wrote:
    >
    >     >Since I can not reproduce this problem with another NIC on x86, I
    >     >really wonder if this is not an issue with ENA driver on PowerPC
    >     >perhaps ?
    >
    >
    > I am able to reproduce it on x86-based EC2 instances using the ENA, Xen netfront, or Intel ixgbevf driver on the receiver, so it is not specific to ENA. We were also able to reproduce it easily between two VMs running in VirtualBox on the same physical host, given the environment requirements I mentioned in my first e-mail.
    >
    > What's the RTT between the sender & receiver in your reproduction? Are you using bbr on the sender side?


    100ms RTT

    Which exact version of linux kernel are you using ?



    >
    > Thank you.
    >
    > Hazem
    >
    > On 07/12/2020, 15:26, "Eric Dumazet" <edumazet@...gle.com> wrote:
    >
    >
    >     On Sat, Dec 5, 2020 at 1:03 PM Mohamed Abuelfotoh, Hazem
    >     <abuehaze@...zon.com> wrote:
    >     >
    >     > Unfortunately a few things are missing in this report.
    >     >
    >     >     What is the RTT between hosts in your test ?
    >     >      >>>>>RTT in my test is 162 msec, but I am able to reproduce it with lower RTTs. For example, I could see the issue when downloading from a Google endpoint with an RTT of 16.7 msec. As mentioned in my previous e-mail, the issue is reproducible whenever the RTT exceeds 12 msec, given that the sender is using bbr.
    >     >
    >     >         RTT between hosts where I run the iperf test.
    >     >         # ping 54.199.163.187
    >     >         PING 54.199.163.187 (54.199.163.187) 56(84) bytes of data.
    >     >         64 bytes from 54.199.163.187: icmp_seq=1 ttl=33 time=162 ms
    >     >         64 bytes from 54.199.163.187: icmp_seq=2 ttl=33 time=162 ms
    >     >         64 bytes from 54.199.163.187: icmp_seq=3 ttl=33 time=162 ms
    >     >         64 bytes from 54.199.163.187: icmp_seq=4 ttl=33 time=162 ms
    >     >
    >     >         RTT between my EC2 instances and google endpoint.
    >     >         # ping 172.217.4.240
    >     >         PING 172.217.4.240 (172.217.4.240) 56(84) bytes of data.
    >     >         64 bytes from 172.217.4.240: icmp_seq=1 ttl=101 time=16.7 ms
    >     >         64 bytes from 172.217.4.240: icmp_seq=2 ttl=101 time=16.7 ms
    >     >         64 bytes from 172.217.4.240: icmp_seq=3 ttl=101 time=16.7 ms
    >     >         64 bytes from 172.217.4.240: icmp_seq=4 ttl=101 time=16.7 ms
    >     >
    >     >     What driver is used at the receiving side ?
    >     >       >>>>>>I am using ENA driver version 2.2.10g on the receiver, with scatter-gather enabled.
    >     >
    >     >         # ethtool -k eth0 | grep scatter-gather
    >     >         scatter-gather: on
    >     >                 tx-scatter-gather: on
    >     >                 tx-scatter-gather-fraglist: off [fixed]
    >
    >     This ethtool output refers to TX scatter gather, which is not relevant
    >     for this bug.
    >
    >     I see ENA driver might use 16 KB per incoming packet (if ENA_PAGE_SIZE is 16 KB)
    >
    >     Since I can not reproduce this problem with another NIC on x86, I
    >     really wonder if this is not an issue with ENA driver on PowerPC
    >     perhaps ?
    >
    >
    >
    >
    > Amazon Web Services EMEA SARL, 38 avenue John F. Kennedy, L-1855 Luxembourg, R.C.S. Luxembourg B186284
    >
    > Amazon Web Services EMEA SARL, Irish Branch, One Burlington Plaza, Burlington Road, Dublin 4, Ireland, branch registration number 908705
    >
    >




