Message-ID: <1869613445.153778.1765467944808.JavaMail.zimbra@raptorengineeringinc.com>
Date: Thu, 11 Dec 2025 09:45:44 -0600 (CST)
From: Timothy Pearson <tpearson@...torengineering.com>
To: Narayana Murty N <nnmlinux@...ux.ibm.com>
Cc: mahesh <mahesh@...ux.ibm.com>, Oliver <oohall@...il.com>, 
	Madhavan Srinivasan <maddy@...ux.ibm.com>, 
	Michael Ellerman <mpe@...erman.id.au>, npiggin <npiggin@...il.com>, 
	christophe leroy <christophe.leroy@...roup.eu>, 
	Bjorn Helgaas <bhelgaas@...gle.com>, 
	Timothy Pearson <tpearson@...torengineering.com>, 
	linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>, 
	linux-kernel <linux-kernel@...r.kernel.org>, 
	vaibhav <vaibhav@...ux.ibm.com>, 
	Shivaprasad G Bhat <sbhat@...ux.ibm.com>, ganeshgr@...ux.ibm.com
Subject: Re: [PATCH v2 1/1] powerpc/eeh: fix recursive
 pci_lock_rescan_remove locking in EEH event handling



----- Original Message -----
> From: "Narayana Murty N" <nnmlinux@...ux.ibm.com>
> To: "mahesh" <mahesh@...ux.ibm.com>, "Oliver" <oohall@...il.com>, "Madhavan Srinivasan" <maddy@...ux.ibm.com>, "Michael
> Ellerman" <mpe@...erman.id.au>, "npiggin" <npiggin@...il.com>, "christophe leroy" <christophe.leroy@...roup.eu>
> Cc: "Bjorn Helgaas" <bhelgaas@...gle.com>, "Timothy Pearson" <tpearson@...torengineering.com>, "linuxppc-dev"
> <linuxppc-dev@...ts.ozlabs.org>, "linux-kernel" <linux-kernel@...r.kernel.org>, "vaibhav" <vaibhav@...ux.ibm.com>,
> "Shivaprasad G Bhat" <sbhat@...ux.ibm.com>, ganeshgr@...ux.ibm.com
> Sent: Wednesday, December 10, 2025 8:25:59 AM
> Subject: [PATCH v2 1/1] powerpc/eeh: fix recursive pci_lock_rescan_remove locking in EEH event handling

> The recent commit 1010b4c012b0 ("powerpc/eeh: Make EEH driver device
> hotplug safe") restructured the EEH driver to improve synchronization
> with the PCI hotplug layer.
> 
> However, it inadvertently moved pci_lock_rescan_remove() outside its
> intended scope in eeh_handle_normal_event(), leading to broken PCI
> error reporting and improper EEH event triggering. Specifically,
> eeh_handle_normal_event() acquired pci_lock_rescan_remove() before
> calling eeh_pe_bus_get(), but eeh_pe_bus_get() itself attempts to
> acquire the same lock internally, causing nested locking and disrupting
> normal EEH event handling paths.
> 
> This patch adds a boolean parameter do_lock to _eeh_pe_bus_get(),
> with two public wrappers:
>    eeh_pe_bus_get() with locking enabled.
>    eeh_pe_bus_get_nolock() that skips locking.
> 
> Callers that already hold pci_lock_rescan_remove() now use
> eeh_pe_bus_get_nolock() to avoid recursive lock acquisition.
> 
> Additionally, pci_lock_rescan_remove() calls are restored to the correct
> position—after eeh_pe_bus_get() and immediately before iterating affected
> PEs and devices. This ensures EEH-triggered PCI removes occur under proper
> bus rescan locking without recursive lock contention.
> 
> The eeh_pe_loc_get() function has been split into two functions:
>    eeh_pe_loc_get(struct eeh_pe *pe) which retrieves the loc for given PE.
>    eeh_pe_loc_get_bus(struct pci_bus *bus) which retrieves the location
>    code for given bus.
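
For readers following the thread, the locked/nolock wrapper split described in the quoted commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the patch itself: every `model_*` name, the two struct layouts, and the depth counter standing in for the rescan/remove mutex are hypothetical, and `model_pe_bus_get_common()` plays the role the message gives to `_eeh_pe_bus_get()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct pci_bus { int number; };
struct eeh_pe { struct pci_bus *bus; };

/* Models a non-recursive lock: taking it while already held is a bug. */
static int rescan_lock_depth;

static void model_lock_rescan_remove(void)
{
	assert(rescan_lock_depth == 0);	/* would be a recursive acquisition */
	rescan_lock_depth++;
}

static void model_unlock_rescan_remove(void)
{
	assert(rescan_lock_depth == 1);
	rescan_lock_depth--;
}

/* Common helper: do_lock selects whether the lock is taken here. */
static struct pci_bus *model_pe_bus_get_common(struct eeh_pe *pe, bool do_lock)
{
	struct pci_bus *bus;

	if (do_lock)
		model_lock_rescan_remove();
	bus = pe->bus;
	if (do_lock)
		model_unlock_rescan_remove();
	return bus;
}

/* Wrapper for callers that do not hold the lock. */
static struct pci_bus *model_pe_bus_get(struct eeh_pe *pe)
{
	return model_pe_bus_get_common(pe, true);
}

/* Wrapper for callers that already hold the lock. */
static struct pci_bus *model_pe_bus_get_nolock(struct eeh_pe *pe)
{
	return model_pe_bus_get_common(pe, false);
}

/*
 * Hypothetical event handler: look the bus up first with the locking
 * wrapper, then take the lock for the section that walks devices and
 * use the nolock wrapper inside it.
 */
static int model_handle_event(struct eeh_pe *pe)
{
	struct pci_bus *bus = model_pe_bus_get(pe);
	int number = -1;

	if (bus == NULL)
		return number;

	model_lock_rescan_remove();
	number = model_pe_bus_get_nolock(pe)->number;
	model_unlock_rescan_remove();
	return number;
}
```

In this model, calling `model_pe_bus_get()` while the lock is held trips the assertion in `model_lock_rescan_remove()`, which is the nested acquisition the commit message describes; the nolock wrapper exists so that callers already inside the locked section have a safe lookup path.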

Conceptually the patch sounds OK, but given the complexity of these subsystems it's difficult to foresee all interactions.  Has the patch been verified on actual hardware not to break NVMe hotplug on PowerNV systems?  If not, I will need to test that myself before sending an ack.  Thanks!
