Message-ID: <87o8h3lj0n.wl-maz@kernel.org>
Date:   Mon, 01 Feb 2021 18:50:32 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     John Garry <john.garry@...wei.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Zhou Wang <wangzhou1@...ilicon.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: PCI MSI issue with reinserting a driver

Hi John,

On Mon, 01 Feb 2021 18:34:59 +0000,
John Garry <john.garry@...wei.com> wrote:
> 
> Just a heads-up, by chance I noticed that I can't re-insert a specific
> driver on v5.11-rc6:
> 
> [   64.356023] hisi_dma 0000:7b:00.0: Adding to iommu group 31
> [   64.368627] hisi_dma 0000:7b:00.0: enabling device (0000 -> 0002)
> [   64.384156] hisi_dma 0000:7b:00.0: Failed to allocate MSI vectors!
> [   64.397180] hisi_dma: probe of 0000:7b:00.0 failed with error -28
> 
> That's with CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
> 
> Bisect tells me that this is the first bad commit:
> 4615fbc3788d genirq/irqdomain: Don't try to free an interrupt that has
> no mapping
> 
> The relevant driver code is
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma/hisi_dma.c#n547
> 
> That driver only allocates 30 MSI, so maybe there's a problem with not
> allocating (and freeing) all 32 MSI.

Are they Multi-MSI (and not MSI-X)?

> I'll have a bit more of a look tomorrow.

Here's my suspicion: two of the interrupts are mapped in the low-level
domain (the ITS, I'd expect in your case), but they have never been
mapped at the higher level.

On teardown, we only get rid of the 30 that were actually mapped, and
leave the last two dangling in the ITS domain, and thus the ITS device
resources are never freed. On reload, we request another 32
interrupts, which can't be satisfied for this device.
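
The accounting behind that suspicion can be sketched with a toy model. This is
not kernel code: ToyDevice, probe(), remove() and simulate() are hypothetical
names, and the 32-slot pool and power-of-two rounding are assumptions chosen to
match the numbers in this thread (30 requested, 32 held at the low level). It
only shows how two interrupts left behind at teardown would be enough to make
a second 32-interrupt request fail with -28 (ENOSPC).

```python
ENOSPC = 28  # errno behind the "failed with error -28" line in the log


def round_up_pow2(n):
    """Round n up to the next power of two (assumed allocation granularity)."""
    p = 1
    while p < n:
        p *= 2
    return p


class ToyDevice:
    """Hypothetical stand-in for the per-device low-level interrupt resources."""

    def __init__(self, slots=32):
        self.free_slots = slots  # what the low-level domain can still hand out
        self.low_level = 0       # interrupts currently held at the low level
        self.mapped = 0          # interrupts also mapped at the higher level

    def probe(self, nvec):
        want = round_up_pow2(nvec)
        if want > self.free_slots:
            return -ENOSPC
        self.free_slots -= want
        self.low_level += want
        self.mapped += nvec
        return 0

    def remove(self, free_unmapped):
        # free_unmapped=False models the suspected behaviour: only interrupts
        # that have a higher-level mapping are released on teardown.
        released = self.low_level if free_unmapped else self.mapped
        self.free_slots += released
        self.low_level -= released
        self.mapped = 0


def simulate(free_unmapped):
    """Probe, remove, probe again; return (first rc, leaked count, second rc)."""
    dev = ToyDevice()
    first = dev.probe(30)
    dev.remove(free_unmapped)
    leaked = dev.low_level
    second = dev.probe(30)
    return first, leaked, second
```

In this model, simulate(False) leaves two interrupts held after the remove and
the second probe fails, while simulate(True) releases all 32 and the second
probe succeeds.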

Assuming I got it right, the question is: why weren't these interrupts
mapped in the PCI domain in the first place? And if I got it wrong, I'm
even more curious!

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
