lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z9Wjk2GzrSURZoTG@wunner.de>
Date: Sat, 15 Mar 2025 16:58:11 +0100
From: Lukas Wunner <lukas@...ner.de>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
	Guenter Roeck <groeck@...iper.net>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
	Rajat Jain <rajatxjain@...il.com>,
	Joel Mathew Thomas <proxy0@...amail.com>,
	linux-kernel@...r.kernel.org,
	Jonathan Cameron <Jonathan.Cameron@...wei.com>,
	stable@...r.kernel.org
Subject: Re: [PATCH 1/4] PCI/hotplug: Disable HPIE over reset

On Thu, Mar 13, 2025 at 04:23:30PM +0200, Ilpo Järvinen wrote:
> pciehp_reset_slot() disables PDCE (Presence Detect Changed Enable) and
> DLLSCE (Data Link Layer State Changed Enable) for the duration of reset
> and clears the related status bits PDC and DLLSC from the Slot Status
> register after the reset to avoid hotplug incorrectly assuming the card
> was removed.
> 
> However, hotplug shares interrupt with PME and BW notifications both of
> which can make pciehp_isr() to run despite PDCE and DLLSCE bits being
> off. pciehp_isr() then picks PDC or DLLSC bits from the Slot Status
> register due to the events that occur during reset and caches them into
> ->pending_events. Later, the IRQ thread in pciehp_ist() will process
> the ->pending_events and will assume the Link went Down due to a card
> change (in pciehp_handle_presence_or_link_change()).
> 
> Change pciehp_reset_slot() to also clear HPIE (Hot-Plug Interrupt
> Enable) as pciehp_isr() will first check HPIE to see if the interrupt
> is not for it. Then synchronize with the IRQ handling to ensure no
> events are pending, before invoking the reset.

After dwelling on this for a while, I'm thinking that it may re-introduce
the issue fixed by commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock
between reset_lock and device_lock"):

Looking at the second and third stack trace in its commit message,
down_write(reset_lock) in pciehp_reset_slot() is basically equivalent
to synchronize_irq() and we're holding device_lock() at that point,
hindering progress of pciehp_ist().

So I think I have guided you in the wrong direction and I apologize
for that.

However it seems to me that this should be solvable with the small
patch below.  Am I missing something?

@Joel Mathew Thomas, could you give the below patch a spin and see
if it helps?

Thanks!

-- >8 --

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index bb5a8d9f03ad..99a2ac13a3d1 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -688,6 +688,11 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
 		return IRQ_HANDLED;
 	}
 
+	/* Ignore events masked by pciehp_reset_slot(). */
+	events &= ctrl->slot_ctrl;
+	if (!events)
+		return IRQ_HANDLED;
+
 	/* Save pending events for consumption by IRQ thread. */
 	atomic_or(events, &ctrl->pending_events);
 	return IRQ_WAKE_THREAD;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ