Message-ID: <20221108044819.GA861843@zander>
Date: Tue, 8 Nov 2022 04:52:16 +0000
From: Wei Gong <gongwei833x@...il.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
linux-pci@...r.kernel.org
Subject: Re: [PATCH v2] pci: fix device presence detection for VFs
On Wed, Oct 26, 2022 at 02:11:21AM -0400, Michael S. Tsirkin wrote:
> virtio uses the same driver for VFs and PFs. Accordingly,
> pci_device_is_present is used to detect device presence. This function
> isn't currently working properly for VFs since it attempts reading
> device and vendor ID. As VFs are present if and only if PF is present,
> just return the value for that device.
>
> Reported-by: Wei Gong <gongwei833x@...il.com>
> Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> ---
>
> Wei Gong, thanks for your testing of the RFC!
> As I made a small change, would appreciate re-testing.
>
> Thanks!
>
> changes from RFC:
> use pci_physfn() wrapper to make the code build without PCI_IOV
>
>
> drivers/pci/pci.c | 5 +++++
> 1 file changed, 5 insertions(+)
Tested-by: Wei Gong <gongwei833x@...il.com>
Retested; the patch works well.
I would rephrase the bug description.
According to the SR-IOV specification, the Vendor ID and Device ID
fields of every VF return FFFFh when read. So when
pci_device_is_present() is called on a VF, the device is
misjudged as surprise-removed.
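
To illustrate the misjudgment, here is a minimal userspace sketch. It
is not the kernel code: struct model_pci_dev and all model_* functions
are hypothetical stand-ins, and the only assumption taken from the
discussion is that a VF's Vendor ID reads back as FFFFh and that the
fix checks the PF instead of the VF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct pci_dev. */
struct model_pci_dev {
	bool is_virtfn;                /* true for an SR-IOV VF */
	struct model_pci_dev *physfn;  /* parent PF, set only for VFs */
	bool plugged;                  /* is the hardware really there? */
	uint16_t vendor;               /* Vendor ID a PF would report */
};

/* Model of reading the Vendor ID from config space. */
static uint16_t model_read_vendor(const struct model_pci_dev *dev)
{
	if (!dev->plugged)
		return 0xffff;         /* nothing answers the read */
	if (dev->is_virtfn)
		return 0xffff;         /* VF: field reads FFFFh */
	return dev->vendor;
}

/* Presence check that relies only on the Vendor ID read. */
static bool model_present_by_id(const struct model_pci_dev *dev)
{
	return model_read_vendor(dev) != 0xffff;
}

/* Presence check that asks the PF when the device is a VF. */
static bool model_present_via_pf(const struct model_pci_dev *dev)
{
	if (dev->is_virtfn && dev->physfn)
		dev = dev->physfn;
	return model_present_by_id(dev);
}
```

In this model a plugged-in VF fails model_present_by_id() although its
PF is present, while model_present_via_pf() reports it as present and
still reports it as gone once the PF itself is unplugged.
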
If the virtio_blk VFs are then disabled in the normal way while I/O
is in flight on a VF, the disable operation hangs, and so does the
I/O on the virtio_blk device:
task:bash state:D stack: 0 pid: 1773 ppid: 1241 flags:0x00004002
Call Trace:
<TASK>
__schedule+0x2ee/0x900
schedule+0x4f/0xc0
blk_mq_freeze_queue_wait+0x69/0xa0
? wait_woken+0x80/0x80
blk_mq_freeze_queue+0x1b/0x20
blk_cleanup_queue+0x3d/0xd0
virtblk_remove+0x3c/0xb0 [virtio_blk]
virtio_dev_remove+0x4b/0x80
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
bus_remove_device+0xe1/0x150
device_del+0x192/0x3f0
device_unregister+0x1b/0x60
unregister_virtio_device+0x18/0x30
virtio_pci_remove+0x41/0x80
pci_device_remove+0x3e/0xb0
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x79/0xa0
pci_stop_and_remove_bus_device+0x13/0x20
pci_iov_remove_virtfn+0xc5/0x130
? pci_get_device+0x4a/0x60
sriov_disable+0x33/0xf0
pci_disable_sriov+0x26/0x30
virtio_pci_sriov_configure+0x6f/0xa0
sriov_numvfs_store+0x104/0x140
dev_attr_store+0x17/0x30
sysfs_kf_write+0x3e/0x50
kernfs_fop_write_iter+0x138/0x1c0
new_sync_write+0x117/0x1b0
vfs_write+0x185/0x250
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x61/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f21bd1f3ba4
RSP: 002b:00007ffd34a24188 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f21bd1f3ba4
RDX: 0000000000000002 RSI: 0000560305040800 RDI: 0000000000000001
RBP: 0000560305040800 R08: 000056030503fd50 R09: 0000000000000073
R10: 00000000ffffffff R11: 0000000000000202 R12: 0000000000000002
R13: 00007f21bd2de760 R14: 00007f21bd2da5e0 R15: 00007f21bd2d99e0
While virtio_blk is performing I/O, each request goes through two stages:
1. The I/O is dispatched: queue_usage_counter++;
2. The I/O is completed after the interrupt is received: queue_usage_counter--;
When the virtio_blk VFs are disabled, the removal path runs:
    if (!pci_device_is_present(pci_dev))
        virtio_break_device(&vp_dev->vdev);
so the virtqueues of the VF devices are marked broken.
When the interrupt then signals that the I/O has finished, the
virtqueue is already marked broken, so the completion of that I/O is
not processed.
As a result, queue_usage_counter never drops back to zero.
Removing the disk at the same time freezes the queue, and the freeze
has to wait for queue_usage_counter to reach zero.
Therefore, the removal hangs.
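
The sequence above can be sketched as a small userspace model. This
is only an illustration under stated assumptions: struct model_queue
and the model_* helpers are hypothetical names, not block layer or
virtio APIs, and the counter merely mirrors the queue_usage_counter
described above.

```c
#include <stdbool.h>

/* Hypothetical model of a request queue and its virtqueue state. */
struct model_queue {
	int usage_counter;  /* in-flight I/O, as described above */
	bool vq_broken;     /* set when the device is judged absent */
};

/* Stage 1: dispatching an I/O takes a reference on the queue. */
static void model_dispatch_io(struct model_queue *q)
{
	q->usage_counter++;
}

/* Stage 2: the interrupt path skips completion on a broken vq. */
static void model_complete_io(struct model_queue *q)
{
	if (q->vq_broken)
		return;             /* completion is never processed */
	q->usage_counter--;
}

/* Removal path: break the device if it is judged not present. */
static void model_remove(struct model_queue *q, bool device_present)
{
	if (!device_present)
		q->vq_broken = true;
}

/* The freeze can finish only once no I/O is in flight. */
static bool model_freeze_can_finish(const struct model_queue *q)
{
	return q->usage_counter == 0;
}
```

If one I/O is dispatched and model_remove() is called with
device_present == false before the completion arrives, the counter
stays at 1 and model_freeze_can_finish() never becomes true. With
device_present == true the completion runs and the freeze can finish.
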
Thanks,
Wei Gong