Message-ID: <alpine.DEB.2.21.1811211419260.1665@nanos.tec.linutronix.de>
Date: Wed, 21 Nov 2018 14:26:20 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Josh Hunt <johunt@...mai.com>
cc: saeedm@...lanox.com, linux-kernel@...r.kernel.org,
"Ozen, Gurhan" <guozen@...mai.com>
Subject: Re: vector space exhaustion on 4.14 LTS kernels
Josh,
On Mon, 19 Nov 2018, Josh Hunt wrote:
> We have a class of machines that appear to be exhausting the vector space on
> cpus 0 and 1 which causes some breakage later on when trying to set the
> affinity. The boxes are running the 4.14 LTS kernel.
>
> [ 39.531385] __assign_irq_vector: irq:512 cpu:128 searched:00,00000001
> vector:00,00000000 continue
> [ 39.531386] apic_set_affinity: irq:512 mask:00,00000001 err:-28
>
> The affinity values:
>
> root@....25.48.208:/proc/irq/512# grep . *
> affinity_hint:00,00000001
> effective_affinity:00,00000004
> effective_affinity_list:2
> grep: mlx5_comp0@pci:0000:65:00.1: Is a directory
> node:0
> smp_affinity:ff,ffffffff
> smp_affinity_list:0-39
> spurious:count 3
> spurious:unhandled 0
> spurious:last_unhandled 0 ms
>
> I noticed your change, a0c9259dc4e1 "irq/matrix: Spread interrupts on
> allocation", and this sounds like what we're hitting. Booting 4.19 does not
> have this problem. I haven't booted 4.15 yet, but can do it to confirm the
> above commit is what resolves this.
Might be, but in 4.15 the whole vector allocation got rewritten. One of the
reasons was the exhaustion issue. Some of that is caused by massive
over-allocation by certain device drivers. The new allocator mechanism
handles that way better.
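
To illustrate why spreading on allocation helps, here is a toy model, not the
kernel's code. It assumes a simple first-fit policy as a stand-in for the
behaviour described in the report and compares it with picking the CPU that
has the fewest vectors in use. The function names, the 4 CPUs, the per-CPU
capacity of 200 and the 300 interrupts are all made-up values for
illustration only.

```python
import errno


def first_fit(cpus, capacity, n_irqs):
    """Toy policy: put each IRQ on the lowest-numbered CPU with a free vector."""
    used = [0] * cpus
    for _ in range(n_irqs):
        target = next((c for c in range(cpus) if used[c] < capacity), None)
        if target is None:
            raise OSError(errno.ENOSPC, "no free vector on any CPU")
        used[target] += 1
    return used


def spread(cpus, capacity, n_irqs):
    """Toy policy: put each IRQ on the CPU with the fewest vectors in use."""
    used = [0] * cpus
    for _ in range(n_irqs):
        target = min(range(cpus), key=lambda c: used[c])
        if used[target] >= capacity:
            raise OSError(errno.ENOSPC, "no free vector on any CPU")
        used[target] += 1
    return used


def can_pin(used, capacity, cpu):
    """True if a later affinity change to `cpu` could still get a vector there."""
    return used[cpu] < capacity


if __name__ == "__main__":
    CPUS, CAPACITY, IRQS = 4, 200, 300  # hypothetical numbers
    for policy in (first_fit, spread):
        used = policy(CPUS, CAPACITY, IRQS)
        print(policy.__name__, used, "pin to CPU 0 ok:", can_pin(used, CAPACITY, 0))
```

In this model first-fit fills CPU 0 completely, so a later request to move an
interrupt to CPU 0 has no vector left, which is the kind of ENOSPC (-28)
failure shown in the log above. With spreading, every CPU keeps headroom.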
> Since 4.14 doesn't have the matrix allocator it's not a trivial backport. I
> was wondering a) if you agree with my assessment and b) if there's any plans
> on resolving this on the 4.14 allocator? If not I can attempt to backport the
> idea to 4.14 to spread the interrupts around on allocation.
No plans. Good luck with trying to fix that on the 4.14 code. I'd recommend
to switch to 4.19 LTS :)
Thanks,
tglx