[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1423056591.907.130.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Wed, 04 Feb 2015 05:29:51 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Michal Kazior <michal.kazior@...to.com>
Cc: Neal Cardwell <ncardwell@...gle.com>,
linux-wireless <linux-wireless@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
eyalpe@....mellanox.co.il
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
On Wed, 2015-02-04 at 05:16 -0800, Eric Dumazet wrote:
> OK guys
>
> Using a mlx4 testbed I can reproduce the problem by pushing coalescing
> settings and disabling SG (thus disabling GSO)
>
> ethtool -K eth0 sg off
> Actual changes:
> scatter-gather: off
> tx-scatter-gather: off
> generic-segmentation-offload: off [requested on]
>
> ethtool -C eth0 tx-usecs 1024 tx-frames 64
>
> Meaning that NIC waits one ms before sending the TX IRQ,
> and can accumulate 64 frames before forcing the interrupt.
>
> We probably have a bug in cwnd expansion logic :
>
> lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
> rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=230 rttvar=30 snd_ssthresh=41 cwnd=59 reordering=3 total_retrans=1 ca_state=0 pacing_rate=5943.1 Mbits
> Recv Send Send Utilization Service Demand
> Socket Socket Message Elapsed Send Recv Send Recv
> Size Size Size Time Throughput local remote local remote
> bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
>
> 87380 16384 16384 10.00 530.39 0.40 0.32 2.965 2.398
>
>
> -> final cwnd=59 which is not enough to avoid the 1ms delay between each
> burst.
>
> So sender sends ~60 packets, then has to wait 1ms (to get NIC TX IRQ)
> before sending the following burst.
>
> I am CCing Neal, he probably can help to root cause the problem.
Arg, this was with net-next, ie not including our recent stretch ack
fixes.
Using David Miller 'net' tree, cwnd seems OK.
Speed is low because of 64 queued frames are exceeding
tcp_limit_output_bytes
lpaa23:~# cat /proc/sys/net/ipv4/tcp_limit_output_bytes
131072
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=166 rttvar=16 snd_ssthresh=26 cwnd=59 reordering=3 total_retrans=0 ca_state=0 pacing_rate=8203.52 Mbits
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 10.00 569.96 0.52 0.38 3.588 2.625
lpaa23:~# echo 262144 >/proc/sys/net/ipv4/tcp_limit_output_bytes
lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=98 rttvar=18 snd_ssthresh=312 cwnd=313 reordering=3 total_retrans=23 ca_state=0
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB
87380 16384 16384 10.00 8518.40 2.60 1.57 1.200 0.727
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists