linux-kernel - Re: [BUG 4.15-rc7] IRQ matrix management errors

Message-ID: <alpine.DEB.2.20.1801171557330.1777@nanos>
Date:   Wed, 17 Jan 2018 16:01:47 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Keith Busch <keith.busch@...el.com>
cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG 4.15-rc7] IRQ matrix management errors

On Wed, 17 Jan 2018, Keith Busch wrote:

> On Wed, Jan 17, 2018 at 10:32:12AM +0100, Thomas Gleixner wrote:
> > On Wed, 17 Jan 2018, Thomas Gleixner wrote:
> > > That doesn't sound right. The vectors should be spread evenly across the
> > > CPUs. So ENOSPC should never happen.
> > > 
> > > Can you please take snapshots of /sys/kernel/debug/irq/ between the
> > > modprobe and modprobe -r steps?
> > 
> > The allocation fails because CPU1 has exhausted its vector space here:
> > 
> > [002] d...   333.028216: irq_matrix_alloc_managed: bit=34 cpu=1 online=1 avl=0 alloc=202 managed=2 online_maps=112 global_avl=22085, global_rsvd=158, total_alloc=460
> > 
> > Now the interesting question is how that happens.
> 
> The trace with "trace_events=irq_matrix" kernel parameter is attached,
> ended shortly after an allocation failure.

Which device is allocating gazillions of non-managed interrupts?

  NetworkManager-2208  [044] d...     8.648608: irq_matrix_alloc: bit=68 cpu=0 online=1 avl=168 alloc=35 managed=3 online_maps=112 global_avl=22359, global_rsvd=532, total_alloc=215

....

  NetworkManager-2208  [044] d...     8.665114: irq_matrix_alloc: bit=237 cpu=0 online=1 avl=0 alloc=203 managed=3 online_maps=112 global_avl=22191, global_rsvd=364, total_alloc=383

That's 168 interrupts total. Enterprise grade insanity.

The patch below should cure that by spreading them out on allocation.
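
To illustrate the idea outside the kernel: instead of taking the first online
CPU in the mask that still has a free vector, the allocator first scans for the
online CPU with the most available vectors and allocates from that one. The
following is only a minimal userspace sketch of that selection step. The names
struct cpu_slot, pick_best_cpu() and alloc_spread() are made up for this
example and are not kernel APIs; the real code operates on struct cpumap and
the allocation bitmaps as shown in the diff.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU map; not the kernel's struct cpumap. */
struct cpu_slot {
	unsigned int available;	/* free vectors left on this CPU */
	bool online;
};

/*
 * Return the index of the online CPU with the most free vectors,
 * or UINT_MAX if no online CPU has a free vector left.
 * On a tie the lowest index wins, because only a strictly larger
 * 'available' value replaces the current candidate.
 */
static unsigned int pick_best_cpu(const struct cpu_slot *slots, size_t n)
{
	unsigned int best = UINT_MAX, maxavl = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!slots[i].online || slots[i].available <= maxavl)
			continue;
		best = (unsigned int)i;
		maxavl = slots[i].available;
	}
	return best;
}

/*
 * Take one vector from the best CPU.
 * Returns the CPU index used, or UINT_MAX when nothing is available.
 */
static unsigned int alloc_spread(struct cpu_slot *slots, size_t n)
{
	unsigned int cpu = pick_best_cpu(slots, n);

	if (cpu != UINT_MAX)
		slots[cpu].available--;
	return cpu;
}
```

With this selection, repeated allocations alternate between the CPUs that have
the most room, so a burst of non-managed interrupts no longer drains a single
CPU to avl=0 while the others still have plenty of space.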

Thanks,

	tglx

8<------------------

diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 0ba0dd8863a7..5831cc7db27d 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -321,29 +321,38 @@ void irq_matrix_remove_reserved(struct irq_matrix *m)
 int irq_matrix_alloc(struct irq_matrix *m, const struct cpumask *msk,
 		     bool reserved, unsigned int *mapped_cpu)
 {
-	unsigned int cpu;
+	unsigned int cpu, best_cpu, maxavl = 0;
+	struct cpumap *cm;
+	unsigned int bit;
 
+	best_cpu = UINT_MAX;
 	for_each_cpu(cpu, msk) {
-		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
-		unsigned int bit;
+		cm = per_cpu_ptr(m->maps, cpu);
 
-		if (!cm->online)
+		if (!cm->online || cm->available <= maxavl)
 			continue;
 
-		bit = matrix_alloc_area(m, cm, 1, false);
-		if (bit < m->alloc_end) {
-			cm->allocated++;
-			cm->available--;
-			m->total_allocated++;
-			m->global_available--;
-			if (reserved)
-				m->global_reserved--;
-			*mapped_cpu = cpu;
-			trace_irq_matrix_alloc(bit, cpu, m, cm);
-			return bit;
-		}
+		best_cpu = cpu;
+		maxavl = cm->available;
 	}
-	return -ENOSPC;
+
+	if (!maxavl)
+		return -ENOSPC;
+
+	cm = per_cpu_ptr(m->maps, best_cpu);
+	bit = matrix_alloc_area(m, cm, 1, false);
+	if (bit >= m->alloc_end)
+		return -ENOSPC;
+
+	cm->allocated++;
+	cm->available--;
+	m->total_allocated++;
+	m->global_available--;
+	if (reserved)
+		m->global_reserved--;
+	*mapped_cpu = best_cpu;
+	trace_irq_matrix_alloc(bit, best_cpu, m, cm);
+	return bit;
 }
 
 /**