lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251107145156.GG20064@p1gen4-pw042f0m.boeblingen.de.ibm.com>
Date: Fri, 7 Nov 2025 15:51:56 +0100
From: Benjamin Block <bblock@...ux.ibm.com>
To: Niklas Schnelle <schnelle@...ux.ibm.com>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Lukas Wunner <lukas@...ner.de>,
        Keith Busch <kbusch@...nel.org>, Gerd Bayer <gbayer@...ux.ibm.com>,
        Matthew Rosato <mjrosato@...ux.ibm.com>,
        Halil Pasic <pasic@...ux.ibm.com>, Farhan Ali <alifm@...ux.ibm.com>,
        Julian Ruess <julianr@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and
 hotplug

Hey Niklas,

On Thu, Oct 30, 2025 at 11:26:02AM +0100, Niklas Schnelle wrote:
> Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when
> enabling/disabling SR-IOV") tried to fix a race between the VF removal
> inside sriov_del_vfs() and concurrent hot unplug by taking the PCI
> rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove
> lock was also taken in sriov_add_vfs() to protect addition of VFs.
> 
> This approach however causes deadlock on trying to remove PFs with
> SR-IOV enabled because PFs disable SR-IOV during removal and this
> removal happens under the PCI rescan/remove lock. So the original fix
> had to be reverted.
> 
> Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and
> sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable
> vs hotplug higher up in the callchain by taking the lock in
> sriov_numvfs_store() before calling into the driver's sriov_configure()
> callback.
> 
> Cc: stable@...r.kernel.org
> Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
> Reported-by: Benjamin Block <bblock@...ux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
> ---
>  drivers/pci/iov.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 

It fixes the original bug (kernel panic) AFAICT, insofar:

Reviewed-by: Benjamin Block <bblock@...ux.ibm.com>


But tbh., I find it a bit "scary" to take one global lock at such a high
level, when it is also used at much lower levels throughout the stack.

IIRC my original (internal) report boiled down to that pci_destroy_dev() is
called essentially unlocked, and without taking any of the attached reference
counts into account.. which seems fishy IMHO. The actual freeing of the memory
is hidden behind a reference, but stuff like device_del() is just.. done..
and that seems strange to me.

Having a bit more granularity here might not hurt; but I can't say I have a
good idea on top of my mind in how to do that, so... rather fix the panic.

-- 
Best Regards, Benjamin Block        /        Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH    /   https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt         /        Geschäftsführung: David Faller
Sitz der Ges.: Böblingen     /    Registergericht: AmtsG Stuttgart, HRB 243294

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ