Date:   Tue, 2 Feb 2021 08:37:39 +0000
From:   John Garry <john.garry@...wei.com>
To:     Marc Zyngier <maz@...nel.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Zhou Wang <wangzhou1@...ilicon.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: PCI MSI issue with reinserting a driver

On 01/02/2021 18:50, Marc Zyngier wrote:

Hi Marc,

>> Just a heads-up, by chance I noticed that I can't re-insert a specific
>> driver on v5.11-rc6:
>>
>> [   64.356023] hisi_dma 0000:7b:00.0: Adding to iommu group 31
>> [   64.368627] hisi_dma 0000:7b:00.0: enabling device (0000 -> 0002)
>> [   64.384156] hisi_dma 0000:7b:00.0: Failed to allocate MSI vectors!
>> [   64.397180] hisi_dma: probe of 0000:7b:00.0 failed with error -28
>>
>> That's with CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
>>
>> Bisect tells me that this is the first bad commit:
>> 4615fbc3788d genirq/irqdomain: Don't try to free an interrupt that has
>> no mapping
>>
>> The relevant driver code is
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma/hisi_dma.c#n547
>>
>> That driver only allocates 30 MSI, so maybe there's a problem with not
>> allocating (and freeing) all 32 MSI.
> Are they Multi-MSI (and not MSI-X)?

They are Multi-MSI, not MSI-X.

> 
>> I'll have a bit more of a look tomorrow.
> Here's my suspicion: two of the interrupts are mapped in the low-level
> domain (the ITS, I'd expect in your case), but they have never been
> mapped at the higher level.
> 
> On teardown, we only get rid of the 30 that were actually mapped, and
> leave the last two dangling in the ITS domain, and thus the ITS device
> resources are never freed. On reload, we request another 32
> interrupts, which can't be satisfied for this device.
> 
> Assuming I got it right, the question is: why weren't these interrupts
> mapped in the PCI domain in the first place? And if I got it wrong, I'm
> even more curious!
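
To make the suspected sequence concrete, here is a toy model of it. It is not kernel code: the class, its methods, and the 32-event capacity are all hypothetical, and it simply assumes that a Multi-MSI request is rounded up to a power of two at the low level while only the requested vectors get mapped at the higher level.

```python
ENOSPC = 28  # errno behind the "-28" in the probe failure log


def round_up_pow2(n):
    """Smallest power of two >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


class ToyLowLevelDomain:
    """Hypothetical stand-in for per-device state in the low-level domain."""

    def __init__(self, capacity=32):
        self.capacity = capacity
        self.reserved = set()  # events held in the low-level domain
        self.mapped = set()    # subset also mapped at the higher level

    def probe(self, nvec):
        """Reserve a power-of-two block, but map only the nvec requested."""
        wanted = round_up_pow2(nvec)
        free = [e for e in range(self.capacity) if e not in self.reserved]
        if len(free) < wanted:
            return -ENOSPC
        block = free[:wanted]
        self.reserved.update(block)
        self.mapped.update(block[:nvec])
        return nvec

    def remove(self, free_unmapped=False):
        """Tear down; by default only events that were mapped are released."""
        victims = set(self.reserved) if free_unmapped else set(self.mapped)
        self.reserved -= victims
        self.mapped -= victims


dev = ToyLowLevelDomain()
first = dev.probe(30)        # succeeds: 32 reserved, 30 mapped
dev.remove()                 # releases the 30 mapped events only
leaked = len(dev.reserved)   # the 2 unmapped events are still held
second = dev.probe(30)       # needs 32 free events, only 30 are available
print(first, leaked, second)
```

In this model the second probe fails with -28 purely because two events were never released. Passing free_unmapped=True to remove() lets the reload succeed, which is the behaviour one would expect from a correct teardown.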

Not sure. I have also now noticed an error for the SAS PCI driver on D06
when nr_cpus < 16, which means that fewer than 32 MSI vectors are
allocated, so it looks like the same problem. There we try to allocate
16 + min(nr_cpus, 16) MSI vectors.
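
As a rough illustration of that arithmetic, the sketch below tabulates the vector count for a few CPU counts. It assumes the count is capped at 32, i.e. 16 + min(nr_cpus, 16), which is what "fewer than 32 vectors when nr_cpus < 16" implies. The function names are hypothetical, and treating a non-power-of-two count as the trigger is only a guess based on the hisi_dma case (30 of 32).

```python
def sas_msi_vectors(nr_cpus, base=16, max_queue_vectors=16):
    """Assumed vector count: 16 base vectors plus one per CPU, capped at 16."""
    return base + min(nr_cpus, max_queue_vectors)


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


for nr_cpus in (4, 8, 15, 16, 64):
    nvec = sas_msi_vectors(nr_cpus)
    note = "power of two" if is_power_of_two(nvec) else "not a power of two"
    print(f"nr_cpus={nr_cpus:3d} -> {nvec} vectors ({note})")
```

Under these assumptions, only nr_cpus >= 16 gives exactly 32 vectors; every smaller CPU count gives a count below 32 that is not a power of two.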

Anyway, let me have a look today to see what is going wrong.

cheers,
John
