lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aG8BZcQZlbNsnrzt@kbusch-mbp>
Date: Wed, 9 Jul 2025 17:55:17 -0600
From: Keith Busch <kbusch@...nel.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, linux-kernel@...r.kernel.org,
	Lukas Wunner <lukas@...ner.de>, Bjorn Helgaas <bhelgaas@...gle.com>,
	Parav Pandit <parav@...dia.com>, virtualization@...ts.linux.dev,
	stefanha@...hat.com, alok.a.tiwari@...cle.com,
	linux-pci@...r.kernel.org
Subject: Re: [PATCH RFC v5 1/5] pci: report surprise removal event

On Wed, Jul 09, 2025 at 06:38:20PM -0500, Bjorn Helgaas wrote:
> This relies on somebody (typically pciehp, I guess) calling
> pci_dev_set_disconnected() when a surprise remove happens.
> 
> Do you think it would be practical for the driver's .remove() method
> to recognize that the device may stop responding at any point, even if
> no hotplug driver is present to call pci_dev_set_disconnected()?
> 
> Waiting forever for an interrupt seems kind of vulnerable in general.
> Maybe "artificially adding timeouts" is alluding to *not* waiting
> forever for interrupts?  That doesn't seem artificial to me because
> it's just a fact of life that devices can disappear at arbitrary
> times.

I totally agree here. Every driver's .remove() should be able to
guarantee forward progress some way. I put some work in blk-mq and nvme
to ensure that happens for those devices at least.

That "forward progress" can come slow though, maybe minutes, so we do
have opprotunisitic short cuts sprinkled about the driver. There are
still gaps when waiting for interrupt driven IO that need the longer
timeouts to trigger. It'd be cool if there was a mechansim to kick in
quicker, but this is still an uncommon exceptional condition, right?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ