Message-ID: <alpine.DEB.2.20.1801171020440.1777@nanos>
Date:   Wed, 17 Jan 2018 10:24:06 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Keith Busch <keith.busch@...el.com>
cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: [BUG 4.15-rc7] IRQ matrix management errors

On Wed, 17 Jan 2018, Keith Busch wrote:
> On Wed, Jan 17, 2018 at 08:34:22AM +0100, Thomas Gleixner wrote:
> > Can you trace the matrix allocations from the very beginning or tell me how
> > to reproduce. I'd like to figure out why this is happening.
> 
> Sure, I'll get the irq_matrix events.
> 
> I reproduce this on a machine with 112 CPUs and 3 NVMe controllers. The
> first two NVMe want 112 MSI-x vectors, and the last only 31 vectors. The
> test runs 'modprobe nvme' and 'modprobe -r nvme' in a loop with 10
> second delay between each step. Repro occurs within a few iterations,
> sometimes already broken after the initial boot.

That doesn't sound right. The vectors should be spread evenly across the
CPUs. So ENOSPC should never happen.
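
A rough back-of-the-envelope check of that expectation, as a sketch only. The
CPU count and the per-controller vector requests come from the report quoted
above. The per-CPU capacity figure is an assumption for illustration (a
placeholder of roughly 200 allocatable vectors per CPU), not a number taken
from this thread, and the helper name `even_spread_load` is hypothetical.

```python
import math

def even_spread_load(cpus, requests):
    """Return (total vectors, worst-case vectors per CPU) for an even spread."""
    total = sum(requests)
    # With an even spread no CPU holds more than ceil(total / cpus) vectors.
    return total, math.ceil(total / cpus)

# Figures from the report: 112 CPUs, three NVMe controllers.
CPUS = 112
REQUESTS = [112, 112, 31]

# ASSUMPTION: illustrative per-CPU capacity, not taken from the thread.
ASSUMED_PER_CPU_CAPACITY = 200

total, per_cpu = even_spread_load(CPUS, REQUESTS)
print(f"total vectors requested: {total}")
print(f"worst-case vectors per CPU if spread evenly: {per_cpu}")
print(f"fits under assumed capacity: {per_cpu <= ASSUMED_PER_CPU_CAPACITY}")
```

Under these assumptions an even spread leaves each CPU far below any plausible
per-CPU limit, which is why an ENOSPC here points at the allocation
bookkeeping rather than at genuine exhaustion.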

Can you please take snapshots of /sys/kernel/debug/irq/ between the
modprobe and modprobe -r steps?
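
One possible way to take those snapshots, as a minimal sketch. It assumes
debugfs is mounted at /sys/kernel/debug, that the kernel exposes the irq
directory there, and that it runs as root. The function name, the destination
directory and the labels are hypothetical choices made for illustration.

```python
import os

def snapshot_debugfs(src="/sys/kernel/debug/irq",
                     dest_root="irq-snapshots",
                     label="snapshot"):
    """Copy the text of every file under src into dest_root/label."""
    dest = os.path.join(dest_root, label)
    os.makedirs(dest, exist_ok=True)
    for dirpath, _dirnames, filenames in os.walk(src):
        # Mirror the source directory layout below the destination.
        rel = os.path.relpath(dirpath, src)
        out_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(out_dir, exist_ok=True)
        for name in filenames:
            try:
                # debugfs files report size 0, so read them instead of
                # relying on a size-based copy.
                with open(os.path.join(dirpath, name), "r",
                          errors="replace") as f:
                    data = f.read()
            except OSError as exc:
                # Record the failure rather than aborting the snapshot.
                data = f"<unreadable: {exc}>\n"
            with open(os.path.join(out_dir, name), "w") as f:
                f.write(data)
    return dest
```

Calling it once after each step with a distinct label, for example
`snapshot_debugfs(label="iter3-after-modprobe")` and
`snapshot_debugfs(label="iter3-after-modprobe-r")`, gives directories that can
be compared with `diff -r` afterwards.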

Thanks,

	tglx




