Message-ID: <Z6ycYOKUeOECrcgb@U-2FWC9VHC-2323.local>
Date: Wed, 12 Feb 2025 21:04:32 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: Lukas Wunner <lukas@...ner.de>, "Rafael J. Wysocki" <rafael@...nel.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Jonathan Cameron <Jonthan.Cameron@...wei.com>,
	ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] PCI: Disable PCIE hotplug interrupts early when msi
 is disabled

On Thu, Feb 06, 2025 at 07:21:47AM +0100, Lukas Wunner wrote:
> [to += Rafael, start of thread is here:
> https://lore.kernel.org/all/Z6HcoUB3i51bzQDs@wunner.de/
> ]
> 
> Hi Rafael,
> 
> On Wed, Feb 05, 2025 at 11:58:04AM +0800, Feng Tang wrote:
> > On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote:
> > > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote:
> > > > There was an IRQ storm bug when testing the "pci=nomsi" case. The root
> > > > cause is: 'nomsi' disables MSI and makes devices and root ports use
> > > > legacy INTx interrupts, which likely makes several devices/ports share
> > > > one interrupt. In the failure case, BIOS doesn't disable the PCIe hotplug
> > > > interrupts and actually asserts the command-complete interrupt.
> > > > As MSI is disabled, ACPI initialization code will not enumerate the root
> > > > port's PCIe hotplug capability, and the pciehp service driver won't be
> > > > enabled for the root port to handle that interrupt. Later on, when it is
> > > > shared and enabled by another device driver such as NVMe or a NIC, the
> > > > "nobody cared" IRQ storm happens.
> > >
> > > Is there a section in the PCI Firmware Spec which says ACPI doesn't
> > > enumerate the hotplug capability if MSI is disabled?
> > 
> > No, I didn't get it from spec, but found the logic by code reading
> > during debugging the irq storm issue. The related code is about:
> > 
> > #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
> > 				| OSC_PCI_ASPM_SUPPORT \
> > 				| OSC_PCI_CLOCK_PM_SUPPORT \
> > 				| OSC_PCI_MSI_SUPPORT)
> 
> Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root
> bridge (v3)") contains a change which doesn't seem to be explained in
> the commit message:
> 
> If the user passes "pci=nomsi" on the command line, Linux doesn't
> request hotplug control (or any other control) from the platform.
> So ACPI always remains responsible for hotplug in the "pci=nomsi"
> case.
> 
> The commit sought to fix a cpu hog issue:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=29722
> 
> It's unclear to me if that bug was fixed by requesting _OSC only once,
> as the commit message suggests, or if the addition of OSC_PCI_MSI_SUPPORT
> to ACPI_PCIE_REQ_SUPPORT fixed the issue.
> 
> Since the latter is not mentioned in the commit message,
> it seems plausible to assume that the OSC_PCI_MSI_SUPPORT change
> was unintentional.
> 
> In any case it doesn't seem to make sense to not request any
> control in the "pci=nomsi" case.
> 
> It's also worth noting that the behavior is different on
> Apple machines as they use a fixed _OSC set even for "pci=nomsi".
> 
> I'm wondering if OSC_PCI_MSI_SUPPORT should simply be removed
> from ACPI_PCIE_REQ_SUPPORT, but I'm worried that it may cause
> reappearance of the cpu hog issue.
 
Hi Lukas,

I tried removing OSC_PCI_MSI_SUPPORT from ACPI_PCIE_REQ_SUPPORT, but
after negotiate_os_control() the 'PCIeHotplug' control is still
disabled in the control capability after the ACPI query_osc and run_osc
routines (I haven't figured out why yet), so the pciehp service
driver can't be loaded.
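
For reference, the support-mask gating discussed above can be modeled in
user space roughly as follows. This is only a minimal sketch, not the code
in drivers/acpi/pci_root.c: the helpers sketch_support_mask() and
sketch_requests_control() are hypothetical names, and the bit values are
assumed for illustration and should be checked against include/linux/acpi.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed for illustration only. */
#define OSC_PCI_EXT_CONFIG_SUPPORT	0x01u
#define OSC_PCI_ASPM_SUPPORT		0x02u
#define OSC_PCI_CLOCK_PM_SUPPORT	0x04u
#define OSC_PCI_MSI_SUPPORT		0x10u

#define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
				| OSC_PCI_ASPM_SUPPORT \
				| OSC_PCI_CLOCK_PM_SUPPORT \
				| OSC_PCI_MSI_SUPPORT)

/*
 * Hypothetical helper: build the support mask the OS would advertise.
 * With "pci=nomsi" the MSI bit is left out.
 */
uint32_t sketch_support_mask(bool msi_enabled)
{
	uint32_t support = OSC_PCI_EXT_CONFIG_SUPPORT
			 | OSC_PCI_ASPM_SUPPORT
			 | OSC_PCI_CLOCK_PM_SUPPORT;

	if (msi_enabled)
		support |= OSC_PCI_MSI_SUPPORT;
	return support;
}

/*
 * Hypothetical helper: control (hotplug included) is only requested
 * when every bit of the required mask is present in the support mask.
 */
bool sketch_requests_control(uint32_t support, uint32_t required)
{
	return (support & required) == required;
}
```

In this model, sketch_requests_control(sketch_support_mask(false),
ACPI_PCIE_REQ_SUPPORT) is false, which matches the "pci=nomsi" behavior
described above, and it becomes true once OSC_PCI_MSI_SUPPORT is dropped
from the required mask. The sketch covers only the OS-side support check;
whatever keeps 'PCIeHotplug' disabled afterwards is outside it.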

Thanks,
Feng

> Thoughts?
> 
> Thanks,
> 
> Lukas
