Date:	Fri, 14 Oct 2011 09:19:51 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	netdev <netdev@...r.kernel.org>
Subject: [PATCH net-next] tcp: reduce memory needs of out of order queue

Many drivers allocate a big skb to store a single TCP frame
(WIFI drivers, or NICs using PAGE_SIZE fragments).

It is now common to get skb->truesize bigger than 4096 to store a ~1500
byte TCP frame.

TCP sessions with large RTT and packet losses can fill their Out Of
Order (OFO) queue with such oversized skbs and hit their sk_rcvbuf limit.
This triggers pruning of the complete OFO queue, without giving the
missing packet(s) a chance to arrive and move skbs from the OFO queue
to the receive queue.
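
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code) of how many skbs can be charged against a receive buffer limit
before pruning starts. The helper name skbs_that_fit() and the numbers used
below are illustrative assumptions, not values taken from this patch; the
kernel tracks this through its socket memory accounting, not a division.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many skbs of a given truesize can be charged
 * to a socket before an rcvbuf limit is reached. Userspace illustration
 * only; integer division rounds down, matching "whole skbs that fit".
 */
static size_t skbs_that_fit(size_t rcvbuf, size_t truesize)
{
	assert(truesize > 0);
	return rcvbuf / truesize;
}
```

With an assumed 64 KiB limit, skbs_that_fit(65536, 4352) gives 15 oversized
skbs, while skbs_that_fit(65536, 1756) gives 37 compact ones, so the same
buffer can hold more than twice as many out of order segments while waiting
for the missing packet.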

This patch adds a skb_reduce_truesize() helper, and uses it for all skbs
queued into the OFO queue.

Spending some time to perform a copy is worth the pain, since it gives
SACK processing a chance to complete over the RTT barrier.

This greatly improves user experience on lossy, long RTT paths, without
added cost on the fast path, since the copy is only attempted for
out of order segments.
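
The copy decision in skb_reduce_truesize() is a simple threshold: copy only
when truesize is more than twice what a right-sized skb would need. A
userspace sketch of that predicate follows. ASSUMED_SKB_OVERHEAD is a
made-up stand-in for the fixed part of SKB_TRUESIZE() (the aligned sizes of
struct sk_buff and struct skb_shared_info); the real value depends on kernel
version and architecture. The function names are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: illustrative per-skb overhead, not the real kernel value. */
#define ASSUMED_SKB_OVERHEAD ((size_t)512)

/* Userspace stand-in for SKB_TRUESIZE(len): payload plus fixed overhead. */
static size_t assumed_skb_truesize(size_t len)
{
	assert(len <= SIZE_MAX - ASSUMED_SKB_OVERHEAD);
	return len + ASSUMED_SKB_OVERHEAD;
}

/*
 * Same shape as the test in the patch:
 *   skb->truesize > 2 * SKB_TRUESIZE(skb->len)
 * Returns true when a copy into a tighter buffer looks worthwhile.
 */
static bool worth_copying(size_t truesize, size_t len)
{
	return truesize > 2 * assumed_skb_truesize(len);
}
```

Under these assumptions, worth_copying(4608, 1500) is true (4608 > 4024),
while worth_copying(2304, 1500) is false, so an skb that is already
reasonably sized is queued as is and pays no copy.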

Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
 net/ipv4/tcp_input.c |   24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c1653fe..1d10edb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4426,6 +4426,25 @@ static inline int tcp_try_rmem_schedule(struct sock *sk, unsigned int size)
 	return 0;
 }
 
+/*
+ * Caller wants to reduce memory needs before queueing skb.
+ * The (expensive) copy should not be done in the fast path.
+ */
+static struct sk_buff *skb_reduce_truesize(struct sk_buff *skb)
+{
+	if (skb->truesize > 2 * SKB_TRUESIZE(skb->len)) {
+		struct sk_buff *nskb;
+
+		nskb = skb_copy_expand(skb, skb_headroom(skb), 0,
+				       GFP_ATOMIC | __GFP_NOWARN);
+		if (nskb) {
+			__kfree_skb(skb);
+			skb = nskb;
+		}
+	}
+	return skb;
+}
+
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcphdr *th = tcp_hdr(skb);
@@ -4553,6 +4572,11 @@ drop:
 	SOCK_DEBUG(sk, "out of order segment: rcv_next %X seq %X - %X\n",
 		   tp->rcv_nxt, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
 
+	/* Since this skb might stay on ofo a long time, try to reduce
+	 * its truesize (if it is too big) to avoid future pruning.
+	 * Many drivers allocate large buffers even to hold tiny frames.
+	 */
+	skb = skb_reduce_truesize(skb);
 	skb_set_owner_r(skb, sk);
 
 	if (!skb_peek(&tp->out_of_order_queue)) {

