Message-ID: <Zcav5ppteV/6PEr6@ubuntu>
Date: Fri, 9 Feb 2024 23:06:24 +0000
From: Jim Harris <jim.harris@...sung.com>
To: Bjorn Helgaas <helgaas@...nel.org>
CC: Bjorn Helgaas <bhelgaas@...gle.com>, "linux-pci@...r.kernel.org"
	<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, Leon Romanovsky <leonro@...dia.com>, "Jason
 Gunthorpe" <jgg@...dia.com>, Alex Williamson <alex.williamson@...hat.com>,
	Pierre Crégut <pierre.cregut@...nge.com>
Subject: Re: [PATCH 0/2] pci/iov: avoid device_lock() when reading
 sriov_numvfs

On Thu, Feb 08, 2024 at 06:30:02PM -0600, Bjorn Helgaas wrote:
> [+cc Pierre, author of 35ff867b7657 ("PCI/IOV: Serialize sysfs
> sriov_numvfs reads vs writes")]
> 
> On Wed, Dec 20, 2023 at 10:58:12PM +0000, Jim Harris wrote:
> > If SR-IOV enabled device is held by vfio, and device is removed,
> > vfio will hold device lock and notify userspace of the removal. If
> > userspace reads sriov_numvfs sysfs entry, that thread will be
> > blocked since sriov_numvfs_show() also tries to acquire the device
> > lock. If that same thread is responsible for releasing the device to
> > vfio, it results in a deadlock.
> >  
> > One patch was proposed to add a separate mutex, specifically for
> > struct pci_sriov, to synchronize access to sriov_numvfs in the sysfs
> > paths (replacing use of the device_lock()). Leon instead suggested
> > just reverting the commit 35ff867b765 which introduced device_lock()
> > in the store path. This also led to a small fix around ordering on
> > the kobject_uevent() when sriov_numvfs is updated.
> > 
> > Ref: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/ 
> 
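
To make the blocking described above concrete, here is a minimal userspace
sketch. It is not kernel code and not taken from either patch. struct
fake_dev, fake_trylock() and the numvfs_show_*() helpers are hypothetical
stand-ins, and a C11 atomic_flag plays the role of the device lock. The
locked reader uses a trylock so the sketch can report "would block" instead
of actually hanging the way a sleeping lock would.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device with SR-IOV state. */
struct fake_dev {
	atomic_flag lock;	/* plays the role of the device lock */
	int num_vfs;		/* value a sriov_numvfs read would report */
};

/* Try to take the stand-in lock; returns true on success. */
static bool fake_trylock(struct fake_dev *d)
{
	return !atomic_flag_test_and_set(&d->lock);
}

static void fake_unlock(struct fake_dev *d)
{
	atomic_flag_clear(&d->lock);
}

/*
 * Read path that takes the lock first. A real sleeping lock would block
 * here while the removal path holds it; this sketch returns false in
 * that case so the caller can see that the read would not complete.
 */
static bool numvfs_show_locked(struct fake_dev *d, int *out)
{
	assert(d != NULL && out != NULL);
	if (!fake_trylock(d))
		return false;
	*out = d->num_vfs;
	fake_unlock(d);
	return true;
}

/* Read path without the lock: always completes. */
static int numvfs_show_lockless(const struct fake_dev *d)
{
	assert(d != NULL);
	return d->num_vfs;
}
```

If the "removal path" takes fake_trylock() and then waits for the reader to
release the device, numvfs_show_locked() can never succeed, which is the
shape of the deadlock. numvfs_show_lockless() still returns the value.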
> 1) Cc author of the commit being reverted (Pierre) so he has a chance
> to chime in and make sure the proposed fix works for him as well.

Ack. I'll also Cc Pierre on the v2.

> 2) The revert commit log needs to justify the revert, not merely say
> what the proper way is.  The Ref: above suggests that the current code
> (pre-revert) leads to a deadlock in some cases, so the revert commit
> log should detail that.
> 
> It's ideal if we never regress, not even between the revert and the
> second patch, so it's possible that they should be squashed into a
> single patch.  But if you keep it as two patches, it's trivial for me
> to squash them if we decide that's best.

The deadlock I hit is fixed by patch 1 alone. Patch 2 fixes a separate
bug: the num_VFs value should be updated before sending the notification
that it changed.
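
A rough illustration of why the ordering matters, as a userspace sketch
rather than the actual patch: struct fake_sriov, notify_change() and the
two set_numvfs_*() helpers are made-up names, and the "listener" runs
synchronously here, whereas a real uevent consumer runs asynchronously, so
in practice this is a race window rather than a guaranteed stale read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SR-IOV state behind sriov_numvfs. */
struct fake_sriov {
	int num_vfs;		/* value a sysfs read would return */
	int seen_by_listener;	/* value the listener read when notified */
};

/*
 * Stand-in for a listener that reacts to the change notification by
 * immediately re-reading the value.
 */
static void notify_change(struct fake_sriov *s)
{
	assert(s != NULL);
	s->seen_by_listener = s->num_vfs;
}

/* Notify first, then store: the listener can observe the old value. */
static void set_numvfs_notify_first(struct fake_sriov *s, int n)
{
	notify_change(s);
	s->num_vfs = n;
}

/* Store first, then notify: the listener observes the new value. */
static void set_numvfs_update_first(struct fake_sriov *s, int n)
{
	s->num_vfs = n;
	notify_change(s);
}
```

Starting from num_vfs == 0 and setting 4, the notify-first variant leaves
seen_by_listener at 0, while the update-first variant leaves it at 4.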

I'll add some more color to that commit message too, to differentiate it
from the revert. I have no issues if you eventually decide to squash them.

>
> 3) Follow subject line convention for drivers/pci (use "git log
> --oneline drivers/pci" to learn it).

Will fix in v2.

Thanks,

Jim
