Message-ID: <1423156205.31870.86.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Thu, 05 Feb 2015 09:10:05 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Michal Kazior <michal.kazior@...to.com>
Cc: Neal Cardwell <ncardwell@...gle.com>,
linux-wireless <linux-wireless@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
eyalpe@....mellanox.co.il
Subject: Re: Throughput regression with `tcp: refine TSO autosizing`
On Thu, 2015-02-05 at 06:41 -0800, Eric Dumazet wrote:
> Not at all. This basically removes backpressure.
>
> A single UDP socket can now blast packets regardless of SO_SNDBUF
> limits.
>
> This basically removes years of work trying to fix bufferbloat.
>
> I still do not understand why increasing tcp_limit_output_bytes is not
> working for you.
Oh well, tcp_limit_output_bytes might be ok.
In fact, the problem comes from the GSO assumption. Maybe Herbert was right
when he suggested TCP would be simpler if we enforced GSO...
When GSO is used, the limit works because 2*skb->truesize is roughly 2
ms worth of traffic.
Because you do not use GSO, and tx completions are slow, we need this:
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 65caf8b95e17..ac01b4cd0035 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2044,7 +2044,8 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 			break;
 
 		/* TCP Small Queues :
-		 * Control number of packets in qdisc/devices to two packets / or ~1 ms.
+		 * Control number of packets in qdisc/devices to two packets /
+		 * or ~2 ms (sk->sk_pacing_rate >> 9) in case GSO is off.
 		 * This allows for :
 		 *  - better RTT estimation and ACK scheduling
 		 *  - faster recovery
@@ -2053,7 +2054,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 		 * of queued bytes to ensure line rate.
 		 * One example is wifi aggregation (802.11 AMPDU)
 		 */
-		limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 10);
+		limit = max(2 * skb->truesize, sk->sk_pacing_rate >> 9);
 		limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
 
 		if (atomic_read(&sk->sk_wmem_alloc) > limit) {