Date:   Sun, 04 Dec 2016 18:43:04 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Paolo Abeni <pabeni@...hat.com>
Cc:     netdev <netdev@...r.kernel.org>
Subject: [RFC] udp: some improvements on RX path.

We currently access 3 cache lines from an skb in the receive queue while
holding the receive queue lock:

1st cache line (skb->next / skb->prev pointers)
2nd cache line (skb->peeked)
3rd cache line (skb->truesize)
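
For reference, a rough picture of where these fields sit in a 4.9-era
struct sk_buff (illustrative only, offsets approximate). Note that
skb->dev shares the first cache line with the queue pointers, which is
what the patch below exploits:

struct sk_buff {
	struct sk_buff		*next;		/* 1st cache line */
	struct sk_buff		*prev;
	/* ... */
	struct net_device	*dev;		/* also 1st cache line */
	/* ... */
	__u8			peeked:1;	/* 2nd cache line */
	/* ... */
	unsigned int		truesize;	/* 3rd cache line */
	atomic_t		users;
};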

I believe we could get rid of skb->peeked completely.

I will cook a patch, but the basic idea is that the last owner of an
skb (the one dropping skb->users to 0) takes 'ownership' and can thus
bump the stats.
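
A minimal sketch of that idea (hypothetical, not the promised patch;
the helper name and the exact stats call are my assumptions): the
consumer dropping the last reference owns the skb and accounts it
exactly once, so the per-skb 'peeked' flag becomes unnecessary.

/* Hypothetical sketch: the last owner (the one dropping skb->users
 * to zero) bumps the rx stats once, then frees the skb.
 */
static void udp_consume_skb(struct sock *sk, struct sk_buff *skb)
{
	if (atomic_dec_and_test(&skb->users)) {
		UDP_INC_STATS(sock_net(sk), UDP_MIB_INDATAGRAMS,
			      IS_UDPLITE(sk));
		__kfree_skb(skb);
	}
}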

The 3rd cache line miss is easily avoided by the following patch.

But I also want to work on the idea I gave a few days back: having a
separate queue and using splice to transfer the 'softirq queue' into
a calm queue sitting in a different cache line.
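
A hedged sketch of that two-queue scheme (names are mine; this is a
sketch, not an implementation): the softirq producer keeps appending
to sk_receive_queue, while the reader splices everything into a
private 'calm' queue under the lock, then dequeues from it without
contending with the producer.

static struct sk_buff *udp_dequeue_calm(struct sock *sk,
					struct sk_buff_head *calm)
{
	if (skb_queue_empty(calm)) {
		/* Grab the whole softirq queue in one shot. */
		spin_lock_bh(&sk->sk_receive_queue.lock);
		skb_queue_splice_tail_init(&sk->sk_receive_queue, calm);
		spin_unlock_bh(&sk->sk_receive_queue.lock);
	}
	return __skb_dequeue(calm);	/* NULL if both queues are empty */
}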

I expect a 50% performance increase under load, maybe 1.5 Mpps.

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 16d88ba9ff1c..37d4e8da6482 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1191,7 +1191,13 @@ static void udp_rmem_release(struct sock *sk, int size, int partial)
 /* Note: called with sk_receive_queue.lock held */
 void udp_skb_destructor(struct sock *sk, struct sk_buff *skb)
 {
-	udp_rmem_release(sk, skb->truesize, 1);
+	/* HACK HACK HACK :
+	 * Instead of using skb->truesize here, find a copy of it in skb->dev.
+	 * This avoids a cache line miss in this path,
+	 * while sk_receive_queue lock is held.
+	 * Look at __udp_enqueue_schedule_skb() to find where this copy is done.
+	 */
+	udp_rmem_release(sk, (int)(unsigned long)skb->dev, 1);
 }
 EXPORT_SYMBOL(udp_skb_destructor);
 
@@ -1201,6 +1207,11 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 	int rmem, delta, amt, err = -ENOMEM;
 	int size = skb->truesize;
 
+	/* help udp_skb_destructor() to get skb->truesize from skb->dev
+	 * without a cache line miss.
+	 */
+	skb->dev = (struct net_device *)(unsigned long)size;
+
 	/* try to avoid the costly atomic add/sub pair when the receive
 	 * queue is full; always allow at least a packet
 	 */
@@ -1233,7 +1244,6 @@ int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
 	/* no need to setup a destructor, we will explicitly release the
 	 * forward allocated memory on dequeue
 	 */
-	skb->dev = NULL;
 	sock_skb_set_dropcount(sk, skb);
 
 	__skb_queue_tail(list, skb);
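
To make the trick explicit (my illustration, not part of the patch):
the stash relies on an int -> pointer -> int cast round-trip, which is
lossless since skb->truesize fits in an unsigned long, and the old
'skb->dev = NULL' line can go because skb->dev is now always
overwritten on this enqueue path. A standalone demo of the round-trip:

#include <stdio.h>

int main(void)
{
	int truesize = 2048;				/* skb->truesize */
	void *stash = (void *)(unsigned long)truesize;	/* enqueue side */
	int restored = (int)(unsigned long)stash;	/* destructor side */
	printf("%d\n", restored);			/* prints 2048 */
	return 0;
}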
