Message-ID: <TY2PR06MB34248DB15CBBDA7624DAFAE185079@TY2PR06MB3424.apcprd06.prod.outlook.com>
Date:   Wed, 16 Nov 2022 00:46:26 +0000
From:   Angus Chen <angus.chen@...uarmicro.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        "Michael S. Tsirkin" <mst@...hat.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ming Lei <ming.lei@...hat.com>,
        Jason Wang <jasowang@...hat.com>
Subject: RE: IRQ affinity problem from virtio_blk



> -----Original Message-----
> From: Thomas Gleixner <tglx@...utronix.de>
> Sent: Wednesday, November 16, 2022 7:24 AM
> To: Michael S. Tsirkin <mst@...hat.com>
> Cc: Angus Chen <angus.chen@...uarmicro.com>; linux-kernel@...r.kernel.org;
> Ming Lei <ming.lei@...hat.com>; Jason Wang <jasowang@...hat.com>
> Subject: Re: IRQ affinity problem from virtio_blk
> 
> On Wed, Nov 16 2022 at 00:04, Thomas Gleixner wrote:
> 
> > On Tue, Nov 15 2022 at 17:44, Michael S. Tsirkin wrote:
> >> On Tue, Nov 15, 2022 at 11:19:47PM +0100, Thomas Gleixner wrote:
> >>> > We can see global_available drop from 15354 to 15273, is 81.
> >>> > And the total_allocated increase from 411 to 413. One config irq,and
> >>> > one vq irq.
> >>>
> >>> Right. That's perfectly fine. At the point where you looking at it, the
> >>> matrix allocator has given out 2 vectors as can be seen via
> >>> total_allocated.
> >>>
> >>> But then it also has another 79 vectors put aside for the other queues,
Well, that is not quite the case; in fact, I have just one queue for this virtio_blk:

crash_cts> struct virtio_blk.num_vqs 0xffff888147b79c00
  num_vqs = 1,
I think this is the key point we are talking about.
> >>
> >> What makes it put these vectors aside? pci_alloc_irq_vectors_affinity ?
> >
> > init_vq() -> virtio_find_vqs() -> vp_find_vqs() ->
> > vp_request_msix_vectors() -> pci_alloc_irq_vectors_affinity()
> >
> > init_vq() hands in a struct irq_affinity which means that
> > pci_alloc_irq_vectors_affinity() will spread out interrupts and have one
> > for config and one per queue if vp_request_msix_vectors() is invoked
> > with per_vq_vectors == true, which is what the first invocation in
> > vp_find_vqs() does.
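
For reference, this is roughly what that path looks like in the source I am
reading; a simplified sketch only, the exact signatures may differ between
kernel versions:

    /* drivers/block/virtio_blk.c, simplified */
    static int init_vq(struct virtio_blk *vblk)
    {
            struct irq_affinity desc = { 0, };
            ...
            /* desc != NULL tells the PCI layer to spread the vectors
             * and mark them managed */
            err = virtio_find_vqs(vblk->vdev, num_vqs, vqs, callbacks,
                                  names, &desc);
            ...
    }

    /* drivers/virtio/virtio_pci_common.c: vp_find_vqs() first tries
     * vp_find_vqs_msix() with per_vq_vectors == true, i.e. one vector
     * for config plus one per queue, which ends up in
     * vp_request_msix_vectors() -> pci_alloc_irq_vectors_affinity(). */
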
> 
> I just checked on a random VM. The PCI device as advertised to the guest
> does not expose that many vectors. One has 2 and the other 4.
> 
> But as the interrupts are requested 'managed' the core ends up setting
> the vectors aside. That's a fundamental property of managed interrupts.
> 
> Assume you have less queues than CPUs, which is the case with 2 vectors
> and tons of CPUs, i.e. one ends up for config and the other for the
> actual queue. So the affinity spreading code will end up having the full
> cpumask for the queue vector, which is marked managed. And managed means
> that it's guaranteed e.g. in the CPU hotplug case that the interrupt can
> be migrated to a still online CPU.
> 
> So we end up setting 79 vectors aside (one per CPU) in the case that the
> virtio device only provides two vectors.
> 
> But that's not the end of the world as you really would need ~200 such
> devices to exhaust the vector space...
> 
Thank you for your reply.
Let's look at the dmesg for more information.
...
Nov 14 11:48:45 localhost kernel: virtio_blk virtio181: 1/0/0 default/read/poll queues
Nov 14 11:48:45 localhost kernel: virtio_blk virtio181: [vdpr] 20480 512-byte logical blocks (10.5 MB/10.0 MiB)
Nov 14 11:48:46 localhost kernel: virtio-pci 0000:37:16.4: enabling device (0000 -> 0002)
Nov 14 11:48:46 localhost kernel: virtio-pci 0000:37:16.4: virtio_pci: leaving for legacy driver
Nov 14 11:48:46 localhost kernel: virtio_blk virtio182: 1/0/0 default/read/poll queues---------the name virtio182 means this is virtio device index 182.
Nov 14 11:48:46 localhost kernel: vp_find_vqs_msix return err=-28-----------------------------the first time we get the 'no space' (-28, ENOSPC) error from the irq subsystem.
...
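
Doing the rough arithmetic with the numbers from earlier in this thread
(approximate, since other devices and the system vectors also consume
entries):

    online CPUs                 : 80
    cost per 1-queue device     : 80 (managed queue vector, one put aside
                                      per CPU) + 1 (config vector) = 81
    global_available at start   : ~15354
    devices until exhaustion    : 15354 / 81 ~ 189

So somewhere around the 190th such device the managed reservation has to
fail, which matches the -28 (ENOSPC) showing up here around virtio182.
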
At this point it is easy to get the following output from crash:
crash_cts> p *vector_matrix
$97 = {
  matrix_bits = 256,
  alloc_start = 32,
  alloc_end = 236,
  alloc_size = 204,
  global_available = 0,------------the vector space is exhausted.
  global_reserved = 154,
  systembits_inalloc = 3,
  total_allocated = 1861,
  online_maps = 80,
  maps = 0x46100,
  scratch_map = {18446744069952503807, 18446744073709551615, 18446744073709551615, 18435229987943481343},
  system_map = {1125904739729407, 0, 1, 18435221191850459136}
}

After that, every further irq request is rejected with "no space".

The more asymmetric the per-CPU vector usage already is, the sooner we hit the 'no space' error when probing an interrupt with IRQD_AFFINITY_MANAGED.
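
To make that concrete, below is a small userspace toy model (my own sketch,
not kernel code) of how I understand the managed reservation in the matrix
allocator to behave: a managed reservation has to put one vector aside on
every CPU in the affinity mask, so a single CPU whose per-CPU map is already
full makes the whole reservation fail with 'no space' even while
global_available is still large.

    #include <stdio.h>
    #include <stdbool.h>

    #define NR_CPUS      80     /* toy values, roughly matching this box */
    #define PER_CPU_VECS 200

    static int available[NR_CPUS];

    /* Toy model of a managed reservation across the full cpumask: it must
     * put one vector aside on EVERY CPU; if any single CPU map is full,
     * the whole reservation fails ('no space'), no matter how much is
     * still available elsewhere. */
    static bool reserve_managed_all_cpus(void)
    {
            for (int cpu = 0; cpu < NR_CPUS; cpu++)
                    if (available[cpu] == 0)
                            return false;
            for (int cpu = 0; cpu < NR_CPUS; cpu++)
                    available[cpu]--;
            return true;
    }

    int main(void)
    {
            for (int cpu = 0; cpu < NR_CPUS; cpu++)
                    available[cpu] = PER_CPU_VECS;

            /* Asymmetric starting point: pretend CPU0 already lost most of
             * its vectors to interrupts pinned there. */
            available[0] = 5;

            int devices = 0;
            while (reserve_managed_all_cpus())
                    devices++;

            long global_available = 0;
            for (int cpu = 0; cpu < NR_CPUS; cpu++)
                    global_available += available[cpu];

            printf("managed reservations before 'no space': %d\n", devices);
            printf("global_available left at that point   : %ld\n",
                   global_available);
            return 0;
    }

With this toy input only 5 reservations succeed, while about 15400 vectors
are still available globally on the other CPUs.
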

> Thanks,
> 
>         tglx
> 
