Message-ID: <Zcav5ppteV/6PEr6@ubuntu>
Date: Fri, 9 Feb 2024 23:06:24 +0000
From: Jim Harris <jim.harris@...sung.com>
To: Bjorn Helgaas <helgaas@...nel.org>
CC: Bjorn Helgaas <bhelgaas@...gle.com>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, Leon Romanovsky <leonro@...dia.com>, "Jason
Gunthorpe" <jgg@...dia.com>, Alex Williamson <alex.williamson@...hat.com>,
Pierre Crégut <pierre.cregut@...nge.com>
Subject: Re: [PATCH 0/2] pci/iov: avoid device_lock() when reading sriov_numvfs
On Thu, Feb 08, 2024 at 06:30:02PM -0600, Bjorn Helgaas wrote:
> [+cc Pierre, author of 35ff867b7657 ("PCI/IOV: Serialize sysfs
> sriov_numvfs reads vs writes")]
>
> On Wed, Dec 20, 2023 at 10:58:12PM +0000, Jim Harris wrote:
> > If an SR-IOV-enabled device is held by vfio and the device is removed,
> > vfio will hold the device lock and notify userspace of the removal. If
> > userspace then reads the sriov_numvfs sysfs entry, that thread will be
> > blocked, since sriov_numvfs_show() also tries to acquire the device
> > lock. If that same thread is responsible for releasing the device to
> > vfio, the result is a deadlock.
> >
> > One patch was proposed to add a separate mutex, specifically for
> > struct pci_sriov, to synchronize access to sriov_numvfs in the sysfs
> > paths (replacing use of the device_lock()). Leon instead suggested
> > just reverting the commit 35ff867b765 which introduced device_lock()
> > in the store path. This also led to a small fix around ordering on
> > the kobject_uevent() when sriov_numvfs is updated.
> >
> > Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/
>
> 1) Cc author of the commit being reverted (Pierre) so he has a chance
> to chime in and make sure the proposed fix works for him as well.
Ack. I'll also Cc Pierre on the v2.
> 2) The revert commit log needs to justify the revert, not merely say
> what the proper way is. The Ref: above suggests that the current code
> (pre-revert) leads to a deadlock in some cases, so the revert commit
> log should detail that.
>
> It's ideal if we never regress, not even between the revert and the
> second patch, so it's possible that they should be squashed into a
> single patch. But if you keep it as two patches, it's trivial for me
> to squash them if we decide that's best.
The deadlock I hit is fixed by patch 1 alone. Patch 2 fixes a separate
bug: it's better to update the num_VFs value before sending the
notification that the num_VFs value changed.
I'll add some more color to that commit message too, to differentiate it
from the revert. I have no issues if you eventually decide to squash them.
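To illustrate the ordering problem that patch 2 addresses, here is a minimal
userspace sketch. It is not the kernel code: struct fake_sriov, fake_uevent()
and the set_numvfs_*() helpers are hypothetical stand-ins, and the listener is
assumed to read the value as soon as it is notified.

```c
#include <assert.h>

/* Hypothetical stand-in for the SR-IOV state; only the field discussed here. */
struct fake_sriov {
	int num_VFs;
};

/* Value the listener saw at the moment it was notified. */
static int observed_num_vfs;

/*
 * Hypothetical stand-in for the change notification. The listener is
 * assumed to read num_VFs immediately on being notified.
 */
static void fake_uevent(const struct fake_sriov *iov)
{
	observed_num_vfs = iov->num_VFs;
}

/* Notify first, then update: the listener reads the stale value. */
static int set_numvfs_notify_first(struct fake_sriov *iov, int n)
{
	assert(n >= 0);
	fake_uevent(iov);
	iov->num_VFs = n;
	return observed_num_vfs;
}

/* Update first, then notify: the listener reads the new value. */
static int set_numvfs_update_first(struct fake_sriov *iov, int n)
{
	assert(n >= 0);
	iov->num_VFs = n;
	fake_uevent(iov);
	return observed_num_vfs;
}

/* Each demo starts from num_VFs == 0 and requests 4 VFs. */
int demo_notify_first(void)
{
	struct fake_sriov iov = { .num_VFs = 0 };

	return set_numvfs_notify_first(&iov, 4);
}

int demo_update_first(void)
{
	struct fake_sriov iov = { .num_VFs = 0 };

	return set_numvfs_update_first(&iov, 4);
}
```

With the notify-first ordering the listener sees the old value (0 here); with
the update-first ordering it sees the requested value (4).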
>
> 3) Follow subject line convention for drivers/pci (use "git log
> --oneline drivers/pci" to learn it).
Will fix in v2.
Thanks,
Jim