[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1335173934.3293.84.camel@edumazet-glaptop>
Date: Mon, 23 Apr 2012 11:38:54 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: netdev <netdev@...r.kernel.org>, Tom Herbert <therbert@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Maciej Żenczykowski <maze@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>,
Rick Jones <rick.jones2@...com>
Subject: [PATCH 2/2 net-next] tcp: sk_add_backlog() is too agressive for TCP
From: Eric Dumazet <edumazet@...gle.com>
While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.
A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.
A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.
Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Cc: Neal Cardwell <ncardwell@...gle.com>
Cc: Tom Herbert <therbert@...gle.com>
Cc: Maciej Żenczykowski <maze@...gle.com>
Cc: Yuchung Cheng <ycheng@...gle.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Cc: Rick Jones <rick.jones2@...com>
---
net/ipv4/tcp_ipv4.c | 3 ++-
net/ipv6/tcp_ipv6.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 917607e..cf97e98 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1752,7 +1752,8 @@ process:
if (!tcp_prequeue(sk, skb))
ret = tcp_v4_do_rcv(sk, skb);
}
- } else if (unlikely(sk_add_backlog(sk, skb, sk->sk_rcvbuf))) {
+ } else if (unlikely(sk_add_backlog(sk, skb,
+ sk->sk_rcvbuf + sk->sk_sndbuf))) {
bh_unlock_sock(sk);
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index b04e6d8..5fb19d3 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1654,7 +1654,8 @@ process:
if (!tcp_prequeue(sk, skb))
ret = tcp_v6_do_rcv(sk, skb);
}
- } else if (unlikely(sk_add_backlog(sk, skb, sk->sk_rcvbuf))) {
+ } else if (unlikely(sk_add_backlog(sk, skb,
+ sk->sk_rcvbuf + sk->sk_sndbuf))) {
bh_unlock_sock(sk);
NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
goto discard_and_relse;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists