Date:	Wed, 11 Feb 2015 09:33:34 +0100
From:	Michal Kazior <michal.kazior@...to.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Neal Cardwell <ncardwell@...gle.com>,
	linux-wireless <linux-wireless@...r.kernel.org>,
	Network Development <netdev@...r.kernel.org>,
	Eyal Perry <eyalpe@....mellanox.co.il>
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`

On 10 February 2015 at 14:14, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Tue, 2015-02-10 at 11:33 +0100, Michal Kazior wrote:
>>                            ath10k_core_napi_dummy_poll, 64);
>> +       ewma_init(&ar->tx_delay_us, 16384, 8);
>
>
> 1) 16384 factor might be too big.
>
> 2) a weight of 8 seems too low given aggregation values used in wifi.
>
> On 32bit arches, the max range for ewma value would be 262144 usec,
> a quarter of a second...
>
> You could use a factor of 64 instead, and a weight of 16.

64/16 seems to work fine as well.
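
For the archives, a minimal userspace sketch of the EWMA semantics as
I understand them from <linux/average.h> (the struct and helpers are
re-implemented here for illustration only, and the 12ms sample is
made up; this is not the actual ath10k patch):

    #include <stdio.h>

    /* Toy re-implementation of the <linux/average.h> EWMA: the internal
     * state holds value * factor in an unsigned long, so on a 32-bit
     * arch the largest representable average is ULONG_MAX / factor --
     * Eric's range argument above.
     */
    struct ewma {
        unsigned long internal;
        unsigned long factor;   /* fixed-point scaling, power of 2 */
        unsigned long weight;   /* decay weight, power of 2 */
    };

    static void ewma_init(struct ewma *avg, unsigned long factor,
                          unsigned long weight)
    {
        avg->internal = 0;
        avg->factor = factor;
        avg->weight = weight;
    }

    static void ewma_add(struct ewma *avg, unsigned long val)
    {
        avg->internal = avg->internal ?
            (avg->internal * (avg->weight - 1) +
             val * avg->factor) / avg->weight :
            val * avg->factor;
    }

    static unsigned long ewma_read(const struct ewma *avg)
    {
        return avg->internal / avg->factor;
    }

    int main(void)
    {
        struct ewma tx_delay_us;

        ewma_init(&tx_delay_us, 64, 16);   /* the suggested 64/16 */
        ewma_add(&tx_delay_us, 12000);     /* made-up 12ms tx delay sample */
        printf("avg tx delay: %lu us\n", ewma_read(&tx_delay_us));
        printf("32-bit max with factor 16384: ~%lu us\n", 0xffffffffUL / 16384);
        printf("32-bit max with factor 64:    ~%lu us\n", 0xffffffffUL / 64);
        return 0;
    }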

On a related note: I still wonder how to get a single TCP flow to
reach line rate with ath10k (it still doesn't; I reach line rate with
multiple flows only). Isn't tcp_limit_output_bytes simply too small
for devices like Wi-Fi, where a single shot can carry an aggregate as
large as 64*3*1500 = 288,000 bytes and you can't expect even a single
tx-completion for it to come in before the whole aggregate is
transmitted? You effectively operate with bursts of traffic.
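
My rough model of the limit computation after `tcp: refine TSO
autosizing`, as I read tcp_write_xmit() (a toy userspace rendition;
the sysctl value and skb truesize below are assumptions for
illustration, not quoted from the tree):

    #include <stdio.h>

    int main(void)
    {
        unsigned long pacing_rate = 650000000UL / 8; /* 650Mbit/s in bytes/s */
        unsigned long truesize = 66616;              /* roughly one 64KB TSO skb */
        unsigned long limit_output_bytes = 131072;   /* assumed sysctl value */
        unsigned long limit;

        /* allow ~1ms worth of bytes at the pacing rate, at least 2 skbs... */
        limit = 2 * truesize;
        if (pacing_rate >> 10 > limit)
            limit = pacing_rate >> 10;
        /* ...but never more than the static sysctl cap */
        if (limit > limit_output_bytes)
            limit = limit_output_bytes;

        printf("TSQ limit: %lu bytes\n", limit);
        printf("one full ath10k aggregate: %u bytes\n", 64 * 3 * 1500);
        return 0;
    }

With those assumptions the cap ends up well below one full
288,000-byte aggregate, which would explain why a single flow can't
keep the device busy.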

Some numbers:
 ath10k w/o cushion, w/o aggregation, 1 flow:  UDP 65 Mbps,  TCP 30 Mbps
 ath10k w/  cushion, w/o aggregation, 1 flow:  UDP 65 Mbps,  TCP 59 Mbps
 ath10k w/o cushion, w/  aggregation, 1 flow:  UDP 650 Mbps, TCP 250 Mbps
 ath10k w/  cushion, w/  aggregation, 1 flow:  UDP 650 Mbps, TCP 250 Mbps
 ath10k w/o cushion, w/  aggregation, 5 flows: UDP 650 Mbps, TCP 250 Mbps
 ath10k w/  cushion, w/  aggregation, 5 flows: UDP 650 Mbps, TCP 600 Mbps

"w/o aggregation" means forcing ath10k to use 1 A-MSDU and 1 A-MPDU
per aggregate so latencies due to aggregation itself should be pretty
much nil.

If I set tcp_limit_output_bytes to 700K+, I can get ath10k w/ cushion
w/ aggregation to reach 600 Mbps on a single flow.
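
In C that bump is just the following (a trivial sketch, assuming the
standard procfs path for the sysctl and taking "700K+" as 716800
bytes):

    #include <stdio.h>

    int main(void)
    {
        /* same as: sysctl -w net.ipv4.tcp_limit_output_bytes=716800 */
        FILE *f = fopen("/proc/sys/net/ipv4/tcp_limit_output_bytes", "w");

        if (!f) {
            perror("fopen");
            return 1;
        }
        fprintf(f, "%d\n", 716800);
        fclose(f);
        return 0;
    }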


Michał