Message-ID: <87o7t7rec7.ffs@tglx>
Date:   Wed, 16 Nov 2022 00:24:24 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     Angus Chen <angus.chen@...uarmicro.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ming Lei <ming.lei@...hat.com>,
        Jason Wang <jasowang@...hat.com>
Subject: Re: IRQ affinity problem from virtio_blk

On Wed, Nov 16 2022 at 00:04, Thomas Gleixner wrote:

> On Tue, Nov 15 2022 at 17:44, Michael S. Tsirkin wrote:
>> On Tue, Nov 15, 2022 at 11:19:47PM +0100, Thomas Gleixner wrote:
>>> > We can see global_available drop from 15354 to 15273, which is 81.
>>> > And the total_allocated increase from 411 to 413. One config irq, and
>>> > one vq irq.
>>> 
>>> Right. That's perfectly fine. At the point where you're looking at it,
>>> the matrix allocator has given out 2 vectors, as can be seen via
>>> total_allocated.
>>> 
>>> But then it also has another 79 vectors put aside for the other queues,
>>
>> What makes it put these vectors aside? pci_alloc_irq_vectors_affinity()?
>
> init_vq() -> virtio_find_vqs() -> vp_find_vqs() ->
> vp_request_msix_vectors() -> pci_alloc_irq_vectors_affinity()
>
> init_vq() hands in a struct irq_affinity, which means that
> pci_alloc_irq_vectors_affinity() will spread out the interrupts: one for
> config and one per queue, provided vp_request_msix_vectors() is invoked
> with per_vq_vectors == true, which is what the first invocation in
> vp_find_vqs() does.
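
For reference, a condensed sketch of that path, paraphrased from virtio_blk.c
and virtio_pci_common.c (not verbatim, details may differ between kernel
versions):

        /* virtio_blk.c: init_vq() hands an irq_affinity descriptor down */
        struct irq_affinity desc = { 0 };

        err = virtio_find_vqs(vdev, num_vqs, vqs, callbacks, names, &desc);

        /* virtio_pci_common.c: vp_request_msix_vectors(), per_vq_vectors == true */
        if (desc) {
                flags |= PCI_IRQ_AFFINITY;
                desc->pre_vectors++;    /* the config vector is not spread */
        }

        err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
                                             nvectors, flags, desc);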

I just checked on a random VM. The PCI devices as advertised to the guest do
not expose that many vectors: one has 2 and the other 4.

But as the interrupts are requested 'managed', the core ends up setting the
vectors aside. That's a fundamental property of managed interrupts.

Assume you have fewer queues than CPUs, which is the case with 2 vectors and
tons of CPUs, i.e. one ends up being used for config and the other for the
actual queue. The affinity spreading code will then end up with the full
cpumask for the queue vector, which is marked managed. And managed means that
it is guaranteed, e.g. in the CPU hotplug case, that the interrupt can be
migrated to a still-online CPU.
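
Roughly what that means for the 2-vector case, sketched with the core
functions involved (simplified; the real code lives in kernel/irq/affinity.c
and arch/x86/kernel/apic/vector.c):

        struct irq_affinity affd = { .pre_vectors = 1 };   /* config vector */
        struct irq_affinity_desc *masks = irq_create_affinity_masks(2, &affd);

        /* masks[0]: config vector, not managed, default affinity       */
        /* masks[1]: queue vector, is_managed == 1, spread over all CPUs */

        /*
         * For the managed entry the x86 vector domain effectively does
         * irq_matrix_reserve_managed(vector_matrix, &masks[1].mask), i.e.
         * it reserves one vector on every CPU in that mask up front, so a
         * hotplug migration can never fail later for lack of a vector.
         */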

So we end up setting 79 vectors aside (one per CPU) in the case that the
virtio device only provides two vectors.
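
To tie that back to the numbers quoted above (assuming the guest has 80
possible CPUs, which is what those numbers suggest): the managed queue vector
reserves one vector on each of the 80 CPUs and the config vector takes one
more, so global_available drops by 81. Only 2 of those show up in
total_allocated (config plus the queue vector on the CPU it currently
targets), leaving 79 merely set aside.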

But that's not the end of the world as you really would need ~200 such
devices to exhaust the vector space...
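
Back of the envelope: with ~15354 vectors reported as global_available and
~81 consumed per such device, you run out after roughly 15354 / 81 ~= 190
devices, hence the ~200 figure.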

Thanks,

        tglx

