Message-ID: <1423230001.31870.128.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Fri, 06 Feb 2015 05:40:01 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Michal Kazior <michal.kazior@...to.com>
Cc: Neal Cardwell <ncardwell@...gle.com>,
linux-wireless <linux-wireless@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
eyalpe@....mellanox.co.il
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
On Fri, 2015-02-06 at 10:42 +0100, Michal Kazior wrote:
> The above brings back previous behaviour, i.e. I can get 600mbps TCP
> on 5 flows again. Single flow is still (as it was before TSO
> autosizing) limited to roughly ~280mbps.
>
> I never really bothered before to understand why I need to push a few
> flows through ath10k to max it out, i.e. with a single UDP flow I
> get ~300mbps, while with, e.g., 5 flows I easily get 670mbps.
>
For a single UDP flow, tweaking /proc/sys/net/core/wmem_default might be
enough: UDP has no callback from TX completion to feed the following frames
(no write queue like TCP has).
# cat /proc/sys/net/core/wmem_default
212992
# ethtool -C eth1 tx-usecs 1024 tx-frames 120
# ./netperf -H remote -t UDP_STREAM -- -m 1450
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992    1450   10.00      697705      0     809.27
212992           10.00      673412            781.09
# echo 800000 >/proc/sys/net/core/wmem_default
# ./netperf -H remote -t UDP_STREAM -- -m 1450
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

800000    1450   10.00     7329221      0    8501.84
212992           10.00     7284051           8449.44
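The same effect can also be had per socket with SO_SNDBUF instead of the
global sysctl. A minimal userspace sketch (the 800000 value mirrors the
experiment above; note the kernel caps the request at
/proc/sys/net/core/wmem_max and doubles it internally for bookkeeping
overhead, so check the effective value with getsockopt()):

#include <stdio.h>
#include <sys/socket.h>

int main(void)
{
	int sndbuf = 800000;           /* mirrors the sysctl experiment */
	socklen_t len = sizeof(sndbuf);
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0 || setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
				 &sndbuf, sizeof(sndbuf)) < 0) {
		perror("socket/setsockopt");
		return 1;
	}

	/* The kernel reports the doubled, possibly clamped value. */
	getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);
	printf("effective send buffer: %d bytes\n", sndbuf);
	return 0;
}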
> I guess it was the tx completion latency all along.
>
> I just added some extra debugging to ath10k to see the latency between
> submission and completion. Here's a log
> (http://www.filedropper.com/complete-log) of a 2s run of UDP iperf
> trying to push 1gbps but managing only 300mbps.
>
> I've made sure not to hold any locks nor introduce any ath10k-internal
> delays. Frames get completed in 2-4ms on average during load.
tcp_wfree() could maintain an EWMA of the TX completion delay in
tp->tx_completion_delay_ms. But this would require yet another expensive
call to ktime_get() if HZ < 1000, since jiffies are too coarse there to
measure millisecond delays.
Then tcp_write_xmit() could use it to adjust:

limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 9);

to

amount = (2 + tp->tx_completion_delay_ms) * sk->sk_pacing_rate;
limit = max(2 * skb->truesize, amount / 1000);
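To make the scaling concrete, here is a standalone arithmetic sketch of
the two formulas. The values are illustrative, not from this thread: a
100 Mbit/s pacing rate, a typical small skb truesize, and a hypothetical
3 ms completion-delay EWMA (tp->tx_completion_delay_ms does not exist
yet):

#include <stdio.h>

int main(void)
{
	unsigned long pacing_rate = 100 * 1000 * 1000 / 8; /* 100 Mbit/s in bytes/sec */
	unsigned long truesize = 2048;                     /* typical small skb */
	unsigned long delay_ms = 3;                        /* hypothetical EWMA value */

	/* Current cap: ~2 ms worth of data (rate >> 9 == rate / 512). */
	unsigned long cur = pacing_rate >> 9;
	if (cur < 2 * truesize)
		cur = 2 * truesize;

	/* Proposed cap: (2 + completion delay) ms worth of data. */
	unsigned long limit = (2 + delay_ms) * pacing_rate / 1000;
	if (limit < 2 * truesize)
		limit = 2 * truesize;

	printf("current limit : %lu bytes\n", cur);   /* 24414 */
	printf("proposed limit: %lu bytes\n", limit); /* 62500 */
	return 0;
}

So with a 3 ms completion delay the sender may keep roughly 2.5x more
data in flight toward the driver, instead of stalling while it waits for
TX completions.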
I'll cook a patch.
Thanks.