Message-ID: <69e51472-33d9-b8e2-e02c-7f51c0fd657f@mellanox.com>
Date:   Tue, 6 Feb 2018 18:27:40 +0200
From:   Tal Gilboa <talgi@...lanox.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        David Laight <David.Laight@...LAB.COM>,
        'Eric Dumazet' <edumazet@...gle.com>
Cc:     David Miller <davem@...emloft.net>,
        "ncardwell@...gle.com" <ncardwell@...gle.com>,
        "ycheng@...gle.com" <ycheng@...gle.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Tariq Toukan <tariqt@...lanox.com>,
        Amir Ancel <amira@...lanox.com>
Subject: Re: [PATCH net-next 0/7] tcp: implement rb-tree based retransmit
 queue

On 2/6/2018 5:52 PM, Eric Dumazet wrote:
> On Tue, 2018-02-06 at 15:22 +0000, David Laight wrote:
>> From: Eric Dumazet
>>> Sent: 06 February 2018 14:20
>>
>> ...
>>> Please give exact details.
>>> Sending 64, 128, 256 or 512 bytes at a time on TCP_STREAM makes little sense.
>>> We are not optimizing stack for pathological cases, sorry.
>>
>> There are plenty of workloads which are not bulk data and where multiple
>> small buffers get sent at unknown intervals (which may be back to back).
>> Such connections have to have Nagle disabled because the Nagle delays
>> are 'horrid'.
>> Clearly lost packets can cause delays, but they are rare on local networks.
> 
> Auto corking makes sure aggregation happens, even when Nagle is in
> the picture.
> 
> netperf -- -m 256    will still cook 64KB TSO packets

This is what we would have liked to see, but auto corking doesn't force 
64KB TSO packets. Under certain conditions, specifically when the TX 
queue is empty, it sends the skb for transmission even if it isn't full:
static bool tcp_should_autocork(struct sock *sk, struct sk_buff *skb,
				int size_goal)
{
	return skb->len < size_goal &&
	       sock_net(sk)->ipv4.sysctl_tcp_autocorking &&
	       skb != tcp_write_queue_head(sk) &&
	       refcount_read(&sk->sk_wmem_alloc) > skb->truesize;
}
When skb == tcp_write_queue_head(sk) the function returns false, so 
corking stops and the skb is pushed out right away. This is part of the 
mlx5 driver optimization I've mentioned. If we can better utilize auto 
corking we shouldn't have an issue.
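For reference, here is a minimal userspace sketch (not taken from this 
thread; the address, port, 256-byte message size and iteration count are 
illustrative assumptions) of the kind of back-to-back small-send workload 
being discussed, where Nagle is disabled and aggregation into large TSO 
frames is left to auto corking:

/* Sketch: many small sends with Nagle off; coalescing is left
 * to tcp_autocorking.  Address, port and sizes are illustrative. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	char buf[256];
	struct sockaddr_in addr;
	int one = 1, fd, i;

	memset(buf, 'x', sizeof(buf));
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(12865);                    /* illustrative port */
	inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr); /* illustrative IP */

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("connect");
		return 1;
	}

	/* Nagle disabled, as in the workloads described above. */
	setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

	/* Back-to-back small sends; whether they are pushed immediately
	 * or corked into bigger skbs depends on tcp_should_autocork(). */
	for (i = 0; i < 1000000; i++)
		if (send(fd, buf, sizeof(buf), 0) < 0)
			break;

	close(fd);
	return 0;
}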

> 
> netperf is not adding delays between each send(), unless it has been
> modified.
> 
> 

I ran this command:
./super_netperf 2000 -H <IP> -l 30 -f g -- -m $size
I didn't change netperf in any way.
