Message-ID: <KU1P153MB0120B72473E50B1B723E25E6BF0A0@KU1P153MB0120.APCP153.PROD.OUTLOOK.COM>
Date: Wed, 7 Oct 2020 03:08:09 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Ming Lei <ming.lei@...hat.com>, Christoph Hellwig <hch@....de>,
Christian Borntraeger <borntraeger@...ibm.com>,
Stefan Haberland <sth@...ux.vnet.ibm.com>,
Jens Axboe <axboe@...nel.dk>,
Marc Zyngier <marc.zyngier@....com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC: Long Li <longli@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Michael Kelley <mikelley@...rosoft.com>
Subject: RE: irq_build_affinity_masks() allocates improper affinity if
num_possible_cpus() > num_present_cpus()?
> From: Thomas Gleixner <tglx@...utronix.de>
> Sent: Tuesday, October 6, 2020 11:58 AM
> > ...
> > I pass through an MSI-X-capable PCI device to the Linux VM (which has
> > only 1 virtual CPU), and the below code does *not* report any error
> > (i.e. pci_alloc_irq_vectors_affinity() returns 2, and request_irq()
> > returns 0), but the code does not work: the second MSI-X interrupt is not
> > happening while the first interrupt does work fine.
> >
> > int nr_irqs = 2;
> > int i, nvec, irq;
> >
> > nvec = pci_alloc_irq_vectors_affinity(pdev, nr_irqs, nr_irqs,
> > PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, NULL);
>
> Why should it return an error?
The above code returns -ENOSPC if num_possible_cpus() is also 1, and
succeeds (returning 2) if num_possible_cpus() is 128. So it looks like the
above code is not using the API correctly, and hence gets unexpected results.
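For reference, a minimal sketch (reusing the hypothetical pdev/nr_irqs names
from the snippet above; this only reflects my reading of the API, not how a
driver must do it) of letting the allocation degrade gracefully by passing a
smaller min_vecs and sizing the queue count from the return value:

	/*
	 * Ask for up to nr_irqs vectors but accept as few as 1, so the
	 * call can still succeed when the affinity spreading code only
	 * has one possible CPU to work with. The number of per-CPU
	 * queues is then taken from the return value instead of being
	 * assumed to be nr_irqs.
	 */
	nvec = pci_alloc_irq_vectors_affinity(pdev, 1, nr_irqs,
			PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, NULL);
	if (nvec < 0)
		return nvec;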
> > for (i = 0; i < nvec; i++) {
> > irq = pci_irq_vector(pdev, i);
> > err = request_irq(irq, test_intr, 0, "test_intr", &intr_cxt[i]);
> > }
>
> And why do you expect that the second interrupt works?
>
> This is about managed interrupts and the spreading code has two vectors
> to which it can spread the interrupts. One is assigned to one half of
> the possible CPUs and the other one to the other half. Now you have only
> one CPU online so only the interrupt which has the online CPU in the
> assigned affinity mask is started up.
>
> That's how managed interrupts work. If you don't want managed interrupts
> then don't use them.
>
> Thanks,
>
> tglx
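If one doesn't want managed interrupts, as suggested above, I guess the
alternative is simply to drop PCI_IRQ_AFFINITY, e.g. (a hypothetical sketch,
same names as in the snippet above):

	/*
	 * Non-managed variant: without PCI_IRQ_AFFINITY the vector
	 * count is not tied to the CPU topology, each vector starts
	 * with the default affinity, and userspace can still change it
	 * via /proc/irq/<n>/smp_affinity.
	 */
	nvec = pci_alloc_irq_vectors(pdev, 1, nr_irqs, PCI_IRQ_MSIX);
	if (nvec < 0)
		return nvec;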
Thanks for the clarification! It looks like with PCI_IRQ_AFFINITY the kernel
guarantees that the allocated interrupts are 1:1 bound to CPUs, and
userspace is unable to change the affinities. This is very useful for
supporting per-CPU I/O queues.
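For example (a hypothetical sketch; pdev and nvec as above, and cpu_to_queue
is a made-up per-driver table), the managed masks can be read back with
pci_irq_get_affinity() to build the CPU-to-queue mapping:

	/*
	 * Map each CPU to the queue whose managed vector covers it.
	 * Vectors whose mask contains only offline CPUs stay shut down
	 * until one of those CPUs comes online, so no interrupts are
	 * expected from them in the meantime.
	 */
	for (i = 0; i < nvec; i++) {
		const struct cpumask *mask = pci_irq_get_affinity(pdev, i);
		unsigned int cpu;

		if (!mask)
			continue;

		for_each_cpu(cpu, mask)
			cpu_to_queue[cpu] = i;
	}

I believe this is essentially what blk_mq_pci_map_queues() does for block
drivers.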
Thanks,
-- Dexuan