netdev - [PATCH net-next] virtio: Fix affinity for >32 VCPUs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170203061905.100283-1-serebrin@google.com>
Date:   Thu,  2 Feb 2017 22:19:05 -0800
From:   Ben Serebrin <serebrin@...gle.com>
To:     netdev@...r.kernel.org, mst@...hat.com, jasowang@...hat.com,
        davem@...emloft.net, willemb@...gle.com, venkateshs@...gle.com,
        jmattson@...gle.com, serebrin@...gle.com
Subject: [PATCH net-next] virtio: Fix affinity for >32 VCPUs

From: Benjamin Serebrin <serebrin@...gle.com>

If the number of virtio queue pairs is not equal to the
number of VCPUs, the virtio guest driver doesn't assign
any CPU affinity for the queue interrupts or the xps
aggregation interrupt.

Google Compute Engine currently provides 1 queue pair for
every VCPU, but limits that at a maximum of 32 queue pairs.

This code assigns interrupt affinity even when there are more than
32 VCPUs.

Tested:

(on a 64-VCPU VM with debian 8, jessie-backports 4.9.2)

Without the fix we see all queues affinitized to all CPUs:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
0-63
[...]
0-63

and we see all TX queues' xps_cpus affinitzed to no cores:

for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
00000000,00000000
[...]
00000000,00000000

With the fix, we see each queue assigned to the a single core,
and xps affinity set to 1 unique cpu per TX queue.

64 VCPU:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done

0-63
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
12
13
13
14
14
15
15
16
16
17
17
18
18
19
19
20
20
21
21
22
22
23
23
24
24
25
25
26
26
27
27
28
28
29
29
30
30
31
31
0-63
0-63
0-63
0-63

cd /sys/class/net/eth0/queues
for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done

00000001,00000001
00000002,00000002
00000004,00000004
00000008,00000008
00000010,00000010
00000020,00000020
00000040,00000040
00000080,00000080
00000100,00000100
00000200,00000200
00000400,00000400
00000800,00000800
00001000,00001000
00002000,00002000
00004000,00004000
00008000,00008000
00010000,00010000
00020000,00020000
00040000,00040000
00080000,00080000
00100000,00100000
00200000,00200000
00400000,00400000
00800000,00800000
01000000,01000000
02000000,02000000
04000000,04000000
08000000,08000000
10000000,10000000
20000000,20000000
40000000,40000000
80000000,80000000

48 VCPU:

cd /proc/irq
for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
0-47
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
12
13
13
14
14
15
15
16
16
17
17
18
18
19
19
20
20
21
21
22
22
23
23
24
24
25
25
26
26
27
27
28
28
29
29
30
30
31
31
0-47
0-47
0-47
0-47

cd /sys/class/net/eth0/queues
for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done

0001,00000001
0002,00000002
0004,00000004
0008,00000008
0010,00000010
0020,00000020
0040,00000040
0080,00000080
0100,00000100
0200,00000200
0400,00000400
0800,00000800
1000,00001000
2000,00002000
4000,00004000
8000,00008000
0000,00010000
0000,00020000
0000,00040000
0000,00080000
0000,00100000
0000,00200000
0000,00400000
0000,00800000
0000,01000000
0000,02000000
0000,04000000
0000,08000000
0000,10000000
0000,20000000
0000,40000000
0000,80000000

Signed-off-by: Ben Serebrin <serebrin@...gle.com>
Acked-by: Willem de Bruijn <willemb@...gle.com>
Acked-by: Jim Mattson <jmattson@...gle.com>
Acked-by: Venkatesh Srinivas <venkateshs@...gle.com>

Effort: kvm
---
 drivers/net/virtio_net.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 765c2d6358da..0dc3a102bfc4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1502,20 +1502,44 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
 	 * queue pairs, we let the queue pairs to be private to one cpu by
 	 * setting the affinity hint to eliminate the contention.
 	 */
-	if (vi->curr_queue_pairs == 1 ||
-	    vi->max_queue_pairs != num_online_cpus()) {
+	if (vi->curr_queue_pairs == 1) {
 		virtnet_clean_affinity(vi, -1);
 		return;
 	}
 
+	/* If there are more cpus than queues, then assign the queues'
+	 * interrupts to the first cpus until we run out.
+	 */
 	i = 0;
 	for_each_online_cpu(cpu) {
+		if (i == vi->max_queue_pairs)
+			break;
 		virtqueue_set_affinity(vi->rq[i].vq, cpu);
 		virtqueue_set_affinity(vi->sq[i].vq, cpu);
-		netif_set_xps_queue(vi->dev, cpumask_of(cpu), i);
 		i++;
 	}
 
+	/* Stripe the XPS affinities across the online CPUs.
+	 * Hyperthread pairs are typically assigned such that Linux's
+	 * CPU X and X + (numcpus / 2) are hyperthread twins, so we cause
+	 * hyperthread twins to share TX queues, in the case where there are
+	 * more cpus than queues.
+	 */
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		struct cpumask mask;
+		int skip = i;
+
+		cpumask_clear(&mask);
+		for_each_online_cpu(cpu) {
+			while (skip--)
+				cpu = cpumask_next(cpu, cpu_online_mask);
+			if (cpu < num_possible_cpus())
+				cpumask_set_cpu(cpu, &mask);
+			skip = vi->max_queue_pairs - 1;
+		}
+		netif_set_xps_queue(vi->dev, &mask, i);
+	}
+
 	vi->affinity_hint_set = true;
 }