lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1423056591.907.130.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Wed, 04 Feb 2015 05:29:51 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Michal Kazior <michal.kazior@...to.com>
Cc:	Neal Cardwell <ncardwell@...gle.com>,
	linux-wireless <linux-wireless@...r.kernel.org>,
	Network Development <netdev@...r.kernel.org>,
	eyalpe@....mellanox.co.il
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`

On Wed, 2015-02-04 at 05:16 -0800, Eric Dumazet wrote:
> OK guys
> 
> Using a mlx4 testbed I can reproduce the problem by pushing coalescing
> settings and disabling SG (thus disabling GSO)
> 
> ethtool -K eth0 sg off
> Actual changes:
> scatter-gather: off
> 	tx-scatter-gather: off
> generic-segmentation-offload: off [requested on]
> 
> ethtool -C eth0 tx-usecs 1024 tx-frames 64
> 
> Meaning that NIC waits one ms before sending the TX IRQ,
> and can accumulate 64 frames before forcing the interrupt.
> 
> We probably have a bug in cwnd expansion logic :
> 
> lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
> rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=230 rttvar=30 snd_ssthresh=41 cwnd=59 reordering=3 total_retrans=1 ca_state=0 pacing_rate=5943.1 Mbits
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
>  87380  16384  16384    10.00       530.39   0.40     0.32     2.965   2.398  
> 
> 
> -> final cwnd=59 which is not enough to avoid the 1ms delay between each
> burst. 
> 
> So sender sends ~60 packets, then has to wait 1ms (to get NIC TX IRQ)
> before sending the following burst.
> 
> I am CCing Neal, he probably can help to root cause the problem.

Arg, this was with net-next, ie not including our recent stretch ack
fixes.

Using David Miller 'net' tree, cwnd seems OK.

Speed is low because of 64 queued frames are exceeding
tcp_limit_output_bytes

lpaa23:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
131072
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=166 rttvar=16 snd_ssthresh=26 cwnd=59 reordering=3 total_retrans=0 ca_state=0 pacing_rate=8203.52 Mbits
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00       569.96   0.52     0.38     3.588   2.625  


lpaa23:~# echo 262144 >/proc/sys/net/ipv4/tcp_limit_output_bytes
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=98 rttvar=18 snd_ssthresh=312 cwnd=313 reordering=3 total_retrans=23 ca_state=0
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      8518.40   2.60     1.57     1.200   0.727  




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ