Message-ID: <1318576791.2533.99.camel@edumazet-laptop>
Date: Fri, 14 Oct 2011 09:19:51 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: netdev <netdev@...r.kernel.org>
Subject: [PATCH net-next] tcp: reduce memory needs of out of order queue

Many drivers allocate a big skb to store a single TCP frame
(WIFI drivers, or NICs using PAGE_SIZE fragments).

It's now common to get skb->truesize bigger than 4096 to store a ~1500
byte TCP frame.

TCP sessions with a large RTT and packet losses can fill their Out Of
Order (OFO) queue with such oversized skbs and hit their sk_rcvbuf limit.
This triggers pruning of the complete OFO queue, leaving no chance to
receive the missing packet(s) and move skbs from the OFO queue to the
receive queue.

This patch adds a skb_reduce_truesize() helper and uses it for all skbs
queued into the OFO queue.

Spending some time to perform a copy is worth the pain, since it gives
SACK processing a chance to complete over the RTT barrier.

This greatly improves user experience, without added cost on the fast path.

Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
net/ipv4/tcp_input.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c1653fe..1d10edb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4426,6 +4426,25 @@ static inline int tcp_try_rmem_schedule(struct sock *sk, unsigned int size)
 	return 0;
 }
 
+/*
+ * Caller wants to reduce memory needs before queueing skb.
+ * The (expensive) copy should not be done in fast path.
+ */
+static struct sk_buff *skb_reduce_truesize(struct sk_buff *skb)
+{
+	if (skb->truesize > 2 * SKB_TRUESIZE(skb->len)) {
+		struct sk_buff *nskb;
+
+		nskb = skb_copy_expand(skb, skb_headroom(skb), 0,
+				       GFP_ATOMIC | __GFP_NOWARN);
+		if (nskb) {
+			__kfree_skb(skb);
+			skb = nskb;
+		}
+	}
+	return skb;
+}
+
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcphdr *th = tcp_hdr(skb);
@@ -4553,6 +4572,11 @@ drop:
 	SOCK_DEBUG(sk, "out of order segment: rcv_next %X seq %X - %X\n",
 		   tp->rcv_nxt, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
 
+	/* Since this skb might stay on ofo a long time, try to reduce
+	 * its truesize (if it's too big) to avoid future pruning.
+	 * Many drivers allocate large buffers even to hold tiny frames.
+	 */
+	skb = skb_reduce_truesize(skb);
 	skb_set_owner_r(skb, sk);
 
 	if (!skb_peek(&tp->out_of_order_queue)) {