Date:	Fri, 30 Jan 2015 14:39:45 +0100
From:	Michal Kazior <michal.kazior@...to.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	linux-wireless <linux-wireless@...r.kernel.org>,
	Network Development <netdev@...r.kernel.org>,
	eyalpe@....mellanox.co.il
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`

On 29 January 2015 at 14:14, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Thu, 2015-01-29 at 12:48 +0100, Michal Kazior wrote:
>> Hi,
>>
>> I'm not subscribed to the netdev list and I can't find the message-id, so I
>> can't reply directly to the original thread `BW regression after "tcp:
>> refine TSO autosizing"`.
>>
>> I've noticed a big TCP performance drop with ath10k
>> (drivers/net/wireless/ath/ath10k) on 3.19-rc5. Instead of 500 Mbps I
>> get 250 Mbps in my testbed.
>>
>> After bisecting I ended up at `tcp: refine TSO autosizing`. Reverting
>> `tcp: refine TSO autosizing` and `tcp: Do not apply TSO segment limit
>> to non-TSO packets` (for conflict-free reverts) fixes the problem.
[...]
>> Any hints/ideas?
>>
>
> Hi Michal
>
> This patch restored the original TSQ behavior, because the 1 ms worth of
> data per flow had totally defeated TSQ's intent.
>
> vi +630 Documentation/networking/ip-sysctl.txt
[...]
> With this change and the stretch ACK fixes, I got 37 Gbps of throughput
> on a single flow on a 40 Gbit NIC (mlx4).
>
> If a driver needs to buffer more than tcp_limit_output_bytes=131072 to
> get line rate, I suggest that you either:
>
> 1) tweak tcp_limit_output_bytes, but it's not practical from a driver.

I've briefly tried playing with this knob, unfortunately to no avail. I
tried 256K and 1M - neither improved TCP performance. When I made it
smaller (e.g. 16K), throughput dropped even further, so the knob clearly
has an effect. It seems there's some other limiting factor in this case.
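
For anyone trying to reproduce this: the knob lives at
/proc/sys/net/ipv4/tcp_limit_output_bytes, and a quick C equivalent of
`sysctl -w net.ipv4.tcp_limit_output_bytes=262144` looks roughly like
this (262144 is just an example value; needs root):

/* Write a value to net.ipv4.tcp_limit_output_bytes for quick tests. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	const char *path = "/proc/sys/net/ipv4/tcp_limit_output_bytes";
	long bytes = (argc > 1) ? strtol(argv[1], NULL, 0) : 262144;
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return 1;
	}
	fprintf(f, "%ld\n", bytes);
	fclose(f);
	printf("%s = %ld\n", path, bytes);
	return 0;
}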


> 2) change the driver, knowing its exact requirements, by removing a
> fraction of skb->truesize at ndo_start_xmit() time, as in:
>
> if ((skb->destructor == sock_wfree ||
>      skb->destructor == tcp_wfree) &&
>     skb->sk) {
>     u32 fraction = skb->truesize / 2;
>
>     skb->truesize -= fraction;
>     atomic_sub(fraction, &skb->sk->sk_wmem_alloc);
> }

tcp_wfree isn't exported, so as a quick hack I just used `if (skb->sk)`
as the condition. It didn't help with TCP.
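
To spell out the hack (the function name below is made up - this isn't
the actual ath10k diff), it was along these lines:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Halve skb->truesize for any socket-owned skb at xmit time, since
 * tcp_wfree can't be checked from a module. my_drv_start_xmit() is a
 * placeholder for the driver's real ndo_start_xmit handler.
 */
static netdev_tx_t my_drv_start_xmit(struct sk_buff *skb,
				     struct net_device *dev)
{
	if (skb->sk) {
		u32 fraction = skb->truesize / 2;

		skb->truesize -= fraction;
		atomic_sub(fraction, &skb->sk->sk_wmem_alloc);
	}

	/* ... usual TX queueing for the device continues here ... */
	return NETDEV_TX_OK;
}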

I wonder if this TSO patch is the single thing to blame. Maybe it's
just the trigger and something else is broken? I've been seeing some
UDP performance inconsistencies in 3.19 (I don't recall seeing these
in 3.18) - but maybe my perception is failing me.

Anyway, thanks for the tips so far.


Michał
