Date:   Tue, 2 Feb 2021 15:46:25 +0000
From:   John Garry <john.garry@...wei.com>
To:     Marc Zyngier <maz@...nel.org>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Zhou Wang <wangzhou1@...ilicon.com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: PCI MSI issue with reinserting a driver

On 02/02/2021 14:48, Marc Zyngier wrote:
>>>
>>> Not sure. I also now notice an error for the SAS PCI driver on D06
>>> when nr_cpus < 16, which means the number of MSI vectors allocated is
>>> < 32, so it looks like the same problem. There we try to allocate
>>> 16 + min(nr_cpus, 16) MSIs.
>>>
>>> Anyway, let me have a look today to see what is going wrong.
>>>
>> Could this be the problem:
>>
>> nr_cpus=11
>>
>> In alloc path, we have:
>>     its_alloc_device_irq(nvecs=27 = 16+11)
>>       bitmap_find_free_region(order = 5);
>> In free path, we have:
>>     its_irq_domain_free(nvecs = 1), called once for each of the 27 vecs
>>       bitmap_release_region(order = 0)
>>
>> So we allocate 32 bits, but only free 27. And the 2nd alloc of 32 fails.

[ ... ]

>>
>>
>> But I'm not sure that we have any requirement for those map bits to be
>> consecutive.
> 
> We can't really do that. All the events must be contiguous,
> and there are also a lot of assumptions in the ITS driver that
> LPI allocations are contiguous too.
> 
> But there is also the fact that for Multi-MSI, we *must*
> allocate 32 vectors. Any driver could assume that if we have
> allocated 17 vectors, then there are another 15 available.
> 
> My question still stands: how was this working with the previous
> behaviour?

Because previously in this scenario we would allocate 32 bits and free
32 bits in the map; but now we allocate 32 bits, yet only free 27, so we
leak 5 bits. This comes from how irq_domain_free_irqs_hierarchy() now
frees per interrupt, instead of freeing all irqs in the domain at once.

Before:
  In free path, we have:
      its_irq_domain_free(nvecs = 27)
        bitmap_release_region(order = 5, i.e. 32 bits)

Current:
  In free path, we have:
      its_irq_domain_free(nvecs = 1), called once for each of the 27 vecs
        bitmap_release_region(order = 0, i.e. 1 bit)
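
To make the accounting concrete, here is a minimal userspace sketch of the two free paths. It is not the ITS driver code: the 32-entry map, the helper names (count_order, find_free_region, release_region, alloc_then_free, leaked_bits, second_alloc) and the uint64_t representation are all made up for illustration. It only assumes that the order is rounded up to the next power of two on both the alloc and the free side, as get_count_order() would do.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_BITS 32u	/* hypothetical size of the per-device event map */

/* Smallest order with (1 << order) >= n; stands in for get_count_order(). */
static unsigned int count_order(unsigned int n)
{
	unsigned int order = 0;

	while ((1u << order) < n)
		order++;
	return order;
}

/* Bit mask covering (1 << order) bits starting at pos. */
static uint64_t region_mask(unsigned int pos, unsigned int order)
{
	unsigned int len = 1u << order;

	assert(pos + len <= MAP_BITS);
	return ((1ULL << len) - 1) << pos;
}

/* Claim a free, naturally aligned region; return its start, or -1. */
static int find_free_region(uint64_t *map, unsigned int order)
{
	unsigned int len = 1u << order;

	for (unsigned int pos = 0; pos + len <= MAP_BITS; pos += len) {
		uint64_t mask = region_mask(pos, order);

		if ((*map & mask) == 0) {
			*map |= mask;
			return (int)pos;
		}
	}
	return -1;
}

static void release_region(uint64_t *map, unsigned int pos, unsigned int order)
{
	*map &= ~region_mask(pos, order);
}

/* Allocate nvecs, then free them either in one call or one vec at a time. */
static uint64_t alloc_then_free(unsigned int nvecs, int per_irq_free)
{
	uint64_t map = 0;
	int base = find_free_region(&map, count_order(nvecs));

	if (base < 0)
		return map;
	if (per_irq_free) {
		/* "Current": nvecs calls, each releasing order 0 (1 bit) */
		for (unsigned int i = 0; i < nvecs; i++)
			release_region(&map, (unsigned int)base + i, 0);
	} else {
		/* "Before": one call releasing the whole rounded-up region */
		release_region(&map, (unsigned int)base, count_order(nvecs));
	}
	return map;
}

/* Number of map bits still set after the alloc/free cycle. */
static unsigned int leaked_bits(unsigned int nvecs, int per_irq_free)
{
	uint64_t map = alloc_then_free(nvecs, per_irq_free);
	unsigned int n = 0;

	for (; map; map >>= 1)
		n += (unsigned int)(map & 1);
	return n;
}

/* Start of a second allocation of the same size after the free, or -1. */
static int second_alloc(unsigned int nvecs, int per_irq_free)
{
	uint64_t map = alloc_then_free(nvecs, per_irq_free);

	return find_free_region(&map, count_order(nvecs));
}
```

With nvecs = 27, the single-call free leaves the map empty and the second allocation succeeds, while the per-interrupt free leaves bits 27..31 set, so the second order-5 allocation finds no free region.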

Cheers,
John
