Message-ID: <02c75fe8-1792-d7ab-c6be-01799f3d50b0@applied-asynchrony.com>
Date: Fri, 16 Feb 2018 18:13:21 +0100
From: Holger Hoffstätte <holger@...lied-asynchrony.com>
To: Neal Cardwell <ncardwell@...gle.com>
Cc: Oleksandr Natalenko <oleksandr@...alenko.name>,
"David S. Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Soheil Hassas Yeganeh <soheil@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Van Jacobson <vanj@...gle.com>, Jerry Chu <hkchu@...gle.com>
Subject: Re: TCP and BBR: reproducibly low cwnd and bandwidth
On 02/16/18 17:56, Neal Cardwell wrote:
> On Fri, Feb 16, 2018 at 11:26 AM, Holger Hoffstätte
> <holger@...lied-asynchrony.com> wrote:
>>
>> BBR in general will run with lower cwnd than e.g. Cubic or others.
>> That's a feature and necessary for WAN transfers.
>
> Please note that there's no general rule about whether BBR will run
> with a lower or higher cwnd than CUBIC, Reno, or other loss-based
> congestion control algorithms. Whether BBR's cwnd will be lower or
> higher depends on the BDP of the path, the amount of buffering in the
> bottleneck, and the number of flows. BBR tries to match the amount of
> in-flight data to the BDP based on the available bandwidth and the
> two-way propagation delay. This will usually produce an amount of data
> in flight that is smaller than CUBIC/Reno (yielding lower latency) if
> the path has deep buffers (bufferbloat), but can be larger than
> CUBIC/Reno (yielding higher throughput) if the buffers are shallow and
> the traffic is suffering burst losses.
In all my tests I've never seen it larger, but OK. Thanks for the
explanation. :)
On second reading, my remark that it is "necessary for WAN transfers" was
phrased a bit unfortunately, but it likely doesn't matter for Oleksandr's
case anyway.
(snip)
>> Something seems really wrong with your setup. I get completely
>> expected throughput on wired 1Gb between two hosts:
>>
>> Connecting to host tux, port 5201
>> [ 5] local 192.168.100.223 port 48718 connected to 192.168.100.222 port 5201
>> [ ID] Interval Transfer Bitrate Retr Cwnd
>> [ 5] 0.00-1.00 sec 113 MBytes 948 Mbits/sec 0 204 KBytes
>> [ 5] 1.00-2.00 sec 112 MBytes 941 Mbits/sec 0 204 KBytes
>> [ 5] 2.00-3.00 sec 112 MBytes 941 Mbits/sec 0 204 KBytes
>> [...]
>>
>> Running it locally gives the more or less expected results as well:
>>
>> Connecting to host ragnarok, port 5201
>> [ 5] local 192.168.100.223 port 54090 connected to 192.168.100.223 port 5201
>> [ ID] Interval Transfer Bitrate Retr Cwnd
>> [ 5] 0.00-1.00 sec 8.09 GBytes 69.5 Gbits/sec 0 512 KBytes
>> [ 5] 1.00-2.00 sec 8.14 GBytes 69.9 Gbits/sec 0 512 KBytes
>> [ 5] 2.00-3.00 sec 8.43 GBytes 72.4 Gbits/sec 0 512 KBytes
>> [...]
>>
>> Both hosts running 4.14.x with bbr and fq_codel (default qdisc everywhere).
>
> Can you please clarify if this is over bare metal or between VM
> guests? It sounds like Oleksandr's initial report was between KVM VMs,
> so the virtualization may be an ingredient here.
These are real hosts, not VMs, connected by 1Gbit Ethernet (home office).
Like Eric said, it's probably a weird HZ setting, a slow host, an iffy
high-res timer (bad for both fq and fq_codel), the overhead of retpoline
in a VM, or whatnot.
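For completeness, the only TCP-related tuning on either host is the
congestion control and the default qdisc; a quick way to double-check
both ends is to read the standard procfs knobs. This is just a sketch;
the expected values in the comments are simply what these hosts are set
to, not a recommendation:

    # Read the two relevant sysctls via procfs; the paths are standard,
    # the "expect" comments reflect this particular setup only.
    def read_sysctl(path):
        with open(path) as f:
            return f.read().strip()

    print(read_sysctl("/proc/sys/net/ipv4/tcp_congestion_control"))  # expect: bbr
    print(read_sysctl("/proc/sys/net/core/default_qdisc"))           # expect: fq_codel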
cheers
Holger