Message-ID: <TY2PR06MB342474BBB331FF36D5DF89C785559@TY2PR06MB3424.apcprd06.prod.outlook.com>
Date: Tue, 27 Sep 2022 01:01:18 +0000
From: Angus Chen <angus.chen@...uarmicro.com>
To: Christoph Hellwig <hch@...radead.org>
CC: "jasowang@...hat.com" <jasowang@...hat.com>,
"mst@...hat.com" <mst@...hat.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"axboe@...nel.dk" <axboe@...nel.dk>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Liming Wu <liming.wu@...uarmicro.com>,
"tglx@...utronix.de" <tglx@...utronix.de>
Subject: RE: [PATCH v1] virtio_blk: should not use IRQD_AFFINITY_MANAGED in
init_rq
>
> On Sat, Sep 24, 2022 at 11:48:54AM +0800, Angus Chen wrote:
> > The background is that we use a DPU in cloud computing; the arch is x86 with
> > 80 cores. We will have a lot of virtio devices, 512 or more.
> > When we probe about 200 virtio_blk devices, probing fails and
> > the stack is printed as follows:
>
> I don't think managed irqs are your problem here, but only a symptom.
> Why is the timer irq allowed to be shared with PCI irqs to start with?
>
Thanks. The virtio_net devices we implemented support only MSI-X, not an INTx irq,
so the INTx irq is zero by default.
Even if we had initialized INTx, probing would still fail once the irq vector capacity is exhausted.
The function vp_find_vqs tries MSI-X first:
/* the config->find_vqs() implementation */
int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
		struct virtqueue *vqs[], vq_callback_t *callbacks[],
		const char * const names[], const bool *ctx,
		struct irq_affinity *desc)
{
	int err;

	/* Try MSI-X with one vector per queue. */
	err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true, ctx, desc);
	if (!err)
		return 0;
	/* Fallback: MSI-X with one vector for config, one shared for queues. */
	err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false, ctx, desc);
	if (!err)
		return 0;
	/* Finally fall back to regular interrupts. */
	return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names, ctx);
}
So I think the key reason is that one virtio_blk device consumes 80 irq vectors of capacity when it uses automatic IRQ affinity.
> >
> > [25338.485128] virtio-pci 0000:b3:00.0: virtio_pci: leaving for legacy driver
> > [25338.496174] genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)
> > [25338.503822] CPU: 20 PID: 5431 Comm: kworker/20:0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-305.30.1.el8.x86_64
> > [25338.516403] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS 4.1.21 08/25/2021
> > [25338.523881] Workqueue: events work_for_cpu_fn
> > [25338.528235] Call Trace:
> > [25338.530687] dump_stack+0x5c/0x80
> > [25338.534000] __setup_irq.cold.53+0x7c/0xd3
> > [25338.538098] request_threaded_irq+0xf5/0x160
> > [25338.542371] vp_find_vqs+0xc7/0x190
> > [25338.545866] init_vq+0x17c/0x2e0 [virtio_blk]
> > [25338.550223] ? ncpus_cmp_func+0x10/0x10
> > [25338.554061] virtblk_probe+0xe6/0x8a0 [virtio_blk]
> > [25338.558846] virtio_dev_probe+0x158/0x1f0
> > [25338.562861] really_probe+0x255/0x4a0
> > [25338.566524] ? __driver_attach_async_helper+0x90/0x90
> > [25338.571567] driver_probe_device+0x49/0xc0
> > [25338.575660] bus_for_each_drv+0x79/0xc0
> > [25338.579499] __device_attach+0xdc/0x160
> > [25338.583337] bus_probe_device+0x9d/0xb0
> > [25338.587167] device_add+0x418/0x780
> > [25338.590654] register_virtio_device+0x9e/0xe0
> > [25338.595011] virtio_pci_probe+0xb3/0x140
> > [25338.598941] local_pci_probe+0x41/0x90
> > [25338.602689] work_for_cpu_fn+0x16/0x20
> > [25338.606443] process_one_work+0x1a7/0x360
> > [25338.610456] ? create_worker+0x1a0/0x1a0
> > [25338.614381] worker_thread+0x1cf/0x390
> > [25338.618132] ? create_worker+0x1a0/0x1a0
> > [25338.622051] kthread+0x116/0x130
> > [25338.625283] ? kthread_flush_work_fn+0x10/0x10
> > [25338.629731] ret_from_fork+0x1f/0x40
> > [25338.633395] virtio_blk: probe of virtio418 failed with error -16
> >
> > After I did some work on this stack, using stap and crash to get more
> > information, I found that automatic irq affinity is the cause.
> > When vp_find_vqs calls vp_find_vqs_msix and it fails, it goes back
> > and calls vp_find_vqs_msix again with per_vq_vectors set to false; when
> > that fails as well, it calls vp_find_vqs_intx, and if
> > vp_dev->pci_dev->irq is zero we get a backtrace like the one above.
> >
> > The log line
> > "genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)"
> > was printed because irq 0 is used exclusively by the timer: after
> > vp_find_vqs has called vp_find_vqs_msix and failed twice, it calls
> > vp_find_vqs_intx for the last try.
> > Because vp_dev->pci_dev->irq is zero, it requests irq 0 with
> > flag IRQF_SHARED.
> >
> > Without CONFIG_GENERIC_IRQ_DEBUGFS enabled,
> > I found that the two vp_find_vqs_msix calls failed because
> > the irq vector resource was exhausted.
> >
> > crash> irq_domain.name,parent 0xffff9bff87d4dec0
> > name = 0xffff9bff87c1fd60 "INTEL-IR-MSI-1-2"
> > parent = 0xffff9bff87400000
> > crash> irq_domain.name,parent 0xffff9bff87400000
> > name = 0xffff9bff87c24300 "INTEL-IR-1"
> > parent = 0xffff9bff87c6c900
> > crash> irq_domain.name,parent 0xffff9bff87c6c900
> > name = 0xffff9bff87c3ecd0 "VECTOR"
> > parent = 0x0          /* the highest-level domain */
> >
> > and stap showed irq_matrix_alloc_managed returning -ENOSPC.
> >
> > Before any virtio_blk device is probed, the vector_matrix is:
> > crash> p *vector_matrix
> > $1 = {
> > matrix_bits = 256,
> > alloc_start = 32,
> > alloc_end = 236,
> > alloc_size = 204,
> > global_available = 15593,
> > global_reserved = 149,
> > systembits_inalloc = 3,
> > total_allocated = 409,
> > online_maps = 80,
> > maps = 0x2ff20,
> > scratch_map = {1161063342014463, 0, 1, 18446726481523507200,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
> > system_map = {1125904739729407, 0, 1, 18446726481523507200,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
> > }
> >
> > When the dump stack occurs, the system's vector_matrix is exhausted:
> > crash> p *vector_matrix
> > $82 = {
> > matrix_bits = 256,
> > alloc_start = 32,
> > alloc_end = 236,
> > alloc_size = 204,
> > global_available = 0,       // no irq vectors left
> > global_reserved = 151,
> > systembits_inalloc = 3,
> > total_allocated = 1922,     // irq vectors allocated
> > online_maps = 80,
> > maps = 0x2ff20,
> > scratch_map = {18446744069952503807, 18446744073709551615,
> > 18446744073709551615, 18446735277616529407, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
> > system_map = {1125904739729407, 0, 1, 18446726481523507200,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
> > }
> >
> > We also examined a virtio_blk device whose irq requests succeeded,
> > and found that on a system with 80 cores and two NUMA nodes, one
> > virtio_blk device with just two data queues consumes 81 vectors of irq
> > capacity. Although only three irqs appear in /proc/interrupts, 80
> > vectors of capacity are reserved via irq_build_affinity_masks (2 * 40).
> >
> > before one virtio_blk device hotplug out:
> > crash> p *vector_matrix
> > $2 = {
> > matrix_bits = 256,
> > alloc_start = 32,
> > alloc_end = 236,
> > alloc_size = 204,
> > global_available = 15215,
> > global_reserved = 150,
> > systembits_inalloc = 3,
> > total_allocated = 553,
> > online_maps = 80,
> > maps = 0x2ff20,
> > scratch_map = {1179746449752063, 0, 1, 18446726481523507200, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0},
> > system_map = {1125904739729407, 0, 1, 18446726481523507200, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0}
> > }
> >
> > after one virtio_blk device hotplug out:
> > crash> p *vector_matrix
> > $3 = {
> > matrix_bits = 256,
> > alloc_start = 32,
> > alloc_end = 236,
> > alloc_size = 204,
> > global_available = 15296,   // increased by 81, including 1 config irq
> > global_reserved = 150,
> > systembits_inalloc = 3,
> > total_allocated = 550,      // decreased by just 3
> > online_maps = 80,
> > maps = 0x2ff20,
> > scratch_map = {481036337152, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0},
> > system_map = {1125904739729407, 0, 1, 18446726481523507200, 0, 0, 0,
> > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > 0, 0, 0, 0, 0}
> > }
> >
> > We tested a newer kernel as well, with the same result:
> > [Fri Sep 23 04:51:42 2022] genirq: Flags mismatch irq 0. 00000080 (virtio608) vs. 00015a00 (timer)
> > [Fri Sep 23 04:51:42 2022] CPU: 0 PID: 5749 Comm: kworker/0:0 Kdump: loaded Tainted: G W OE 6.0.0-rc6+ #5
> > [Fri Sep 23 04:51:42 2022] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS 4.1.19 06/16/2021
> > [Fri Sep 23 04:51:42 2022] Workqueue: events work_for_cpu_fn
> > [Fri Sep 23 04:51:42 2022] Call Trace:
> > [Fri Sep 23 04:51:42 2022] <TASK>
> > [Fri Sep 23 04:51:42 2022] dump_stack_lvl+0x33/0x46
> > [Fri Sep 23 04:51:42 2022] __setup_irq+0x705/0x770
> > [Fri Sep 23 04:51:42 2022] request_threaded_irq+0x109/0x170
> > [Fri Sep 23 04:51:42 2022] vp_find_vqs+0xc4/0x190
> > [Fri Sep 23 04:51:42 2022] init_vqs+0x348/0x580 [virtio_net]
> > [Fri Sep 23 04:51:42 2022] virtnet_probe+0x54d/0xa80 [virtio_net]
> > [Fri Sep 23 04:51:42 2022] virtio_dev_probe+0x19c/0x240
> > [Fri Sep 23 04:51:42 2022] really_probe+0x106/0x3e0
> > [Fri Sep 23 04:51:42 2022] ? pm_runtime_barrier+0x4f/0xa0
> > [Fri Sep 23 04:51:42 2022] __driver_probe_device+0x79/0x170
> > [Fri Sep 23 04:51:42 2022] driver_probe_device+0x1f/0xa0
> > [Fri Sep 23 04:51:42 2022] __device_attach_driver+0x85/0x110
> > [Fri Sep 23 04:51:42 2022] ? driver_allows_async_probing+0x60/0x60
> > [Fri Sep 23 04:51:42 2022] ? driver_allows_async_probing+0x60/0x60
> > [Fri Sep 23 04:51:42 2022] bus_for_each_drv+0x67/0xb0
> > [Fri Sep 23 04:51:42 2022] __device_attach+0xe9/0x1b0
> > [Fri Sep 23 04:51:42 2022] bus_probe_device+0x87/0xa0
> > [Fri Sep 23 04:51:42 2022] device_add+0x59f/0x950
> > [Fri Sep 23 04:51:42 2022] ? dev_set_name+0x4e/0x70
> > [Fri Sep 23 04:51:42 2022] register_virtio_device+0xac/0xf0
> > [Fri Sep 23 04:51:42 2022] virtio_pci_probe+0x101/0x170
> > [Fri Sep 23 04:51:42 2022] local_pci_probe+0x42/0xa0
> > [Fri Sep 23 04:51:42 2022] work_for_cpu_fn+0x13/0x20
> > [Fri Sep 23 04:51:42 2022] process_one_work+0x1c2/0x3d0
> > [Fri Sep 23 04:51:42 2022] ? process_one_work+0x3d0/0x3d0
> > [Fri Sep 23 04:51:42 2022] worker_thread+0x1b9/0x360
> > [Fri Sep 23 04:51:42 2022] ? process_one_work+0x3d0/0x3d0
> > [Fri Sep 23 04:51:42 2022] kthread+0xe6/0x110
> > [Fri Sep 23 04:51:42 2022] ? kthread_complete_and_exit+0x20/0x20
> > [Fri Sep 23 04:51:42 2022] ret_from_fork+0x1f/0x30
> > [Fri Sep 23 04:51:42 2022] </TASK>
> > [Fri Sep 23 04:51:43 2022] virtio_net: probe of virtio608 failed with error -16
> >
> > Fixes: ad71473d9c43 ("virtio_blk: use virtio IRQ affinity")
> > Signed-off-by: Angus Chen <angus.chen@...uarmicro.com>
> > Tested-by: Liming Wu <liming.wu@...uarmicro.com>
> > ---
> > drivers/block/virtio_blk.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> > index a8bcf3f664af..075de30a9bb4 100644
> > --- a/drivers/block/virtio_blk.c
> > +++ b/drivers/block/virtio_blk.c
> > @@ -513,7 +513,6 @@ static int init_vq(struct virtio_blk *vblk)
> > struct virtqueue **vqs;
> > unsigned short num_vqs;
> > struct virtio_device *vdev = vblk->vdev;
> > - struct irq_affinity desc = { 0, };
> >
> > err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
> > struct virtio_blk_config, num_queues,
> > @@ -548,7 +547,7 @@ static int init_vq(struct virtio_blk *vblk)
> > }
> >
> > /* Discover virtqueues and write information to configuration. */
> > - err = virtio_find_vqs(vdev, num_vqs, vqs, callbacks, names, &desc);
> > + err = virtio_find_vqs(vdev, num_vqs, vqs, callbacks, names, NULL);
> > if (err)
> > goto out;
> >
> > --
> > 2.17.1
> >
> ---end quoted text---