Message-ID: <Z8GC9xiGAtUnWj-I@U-2FWC9VHC-2323.local>
Date: Fri, 28 Feb 2025 17:33:43 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: rafael@...nel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@...ux.intel.com>,
Liguang Zhang <zhangliguang@...ux.alibaba.com>,
Guanghui Feng <guanghuifeng@...ux.alibaba.com>,
Markus Elfring <Markus.Elfring@....de>, lkp@...el.com,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
ilpo.jarvinen@...ux.intel.com, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 2/4] PCI/portdrv: Add necessary wait for disabling
hotplug events
Hi Lukas,
On Fri, Feb 28, 2025 at 08:14:04AM +0100, Lukas Wunner wrote:
> On Fri, Feb 28, 2025 at 02:29:29PM +0800, Feng Tang wrote:
> > On Tue, Feb 25, 2025 at 12:42:04PM +0800, Feng Tang wrote:
> > > > > There might be some misunderstanding here :), I responded in
> > > > > https://lore.kernel.org/lkml/Z6LRAozZm1UfgjqT@U-2FWC9VHC-2323.local/
> > > > > that your suggestion could solve our issue.
> > > >
> > > > Well, could you test it please?
> > >
> > We just tried the patch on the hardware and initial 5.10 kernel, and
> > the problem cannot be reproduced, as the first PCIe hotplug command
> > of disabling CCIE and HPIE was not issued.
>
> Good!
>
> > Should I post a new version patch with your suggestion?
>
> Yes, please.
Will do, thanks
>
> > Also I would like to separate this patch from the patch dealing with
> > the nomsi irq storm issue. What do you think?
>
> Makes sense to me.
>
> The problem with the nomsi irq storm is really that if the platform
> (i.e. BIOS) doesn't grant OSPM control of hotplug, OSPM (i.e. the kernel)
> cannot modify hotplug registers because the assumption is that the
> platform controls them.
Yes, very reasonable. I also talked with a firmware engineer, who said
there is a working example: on some old x86 platforms, the firmware
itself really is capable of handling hotplug when MSI is disabled.
> If the platform doesn't actually handle
> hotplug, but keeps the interrupts enabled, that's basically a bug
> of the specific platform.
That's what happened in our case :)
> I think the kernel community's stance in such situations is that the
> BIOS vendor should provide an update with a fix. In some cases
> that's not possible because the product is no longer supported,
> or the vendor doesn't care about Linux issues because it only
> supports Windows or macOS. In those cases, we deal with these
> problems with a quirk. E.g. on x86 we often use a DMI quirk to
> recognize affected hardware and the quirk would then disable the
> hotplug interrupts.
I see.
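To make the quirk idea concrete, here is a rough userspace sketch of a
table lookup in the spirit of a DMI quirk. It is not kernel code: the
struct, the table, the function name and the vendor/product strings are
all hypothetical placeholders, and it uses exact string comparison as a
simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry: identifies one affected platform. */
struct quirk_entry {
	const char *sys_vendor;
	const char *product_name;
};

/* Placeholder table; no real hardware is listed here. */
static const struct quirk_entry hotplug_irq_quirks[] = {
	{ "ExampleVendor", "ExampleBoard" },
};

/*
 * Return true if the given identification strings match a table entry,
 * i.e. the platform would get its hotplug interrupts disabled by the quirk.
 */
static bool needs_hotplug_irq_quirk(const char *vendor, const char *product)
{
	size_t n = sizeof(hotplug_irq_quirks) / sizeof(hotplug_irq_quirks[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(vendor, hotplug_irq_quirks[i].sys_vendor) == 0 &&
		    strcmp(product, hotplug_irq_quirks[i].product_name) == 0)
			return true;
	}
	return false;
}
```

In the kernel the matching would go through the DMI infrastructure
rather than a hand-rolled loop; the sketch only shows the shape of the
decision.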
As you dug out of the history in
https://lore.kernel.org/lkml/Z6RU-681eXl7hcp6@wunner.de/
in our earlier debugging, the nomsi case could only get past the OSC
check after applying the patch below:
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 84030804a763..e7d9328cba45 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -38,8 +38,7 @@ static int acpi_pci_root_scan_dependent(struct acpi_device *adev)
 #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
 				| OSC_PCI_ASPM_SUPPORT \
-				| OSC_PCI_CLOCK_PM_SUPPORT \
-				| OSC_PCI_MSI_SUPPORT)
+				| OSC_PCI_CLOCK_PM_SUPPORT)
Otherwise, the OSC function won't be executed and the kernel will simply
disable PCIe hotplug, which breaks the working firmware-handled case I
mentioned above. Should we also take this change?
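A minimal userspace sketch of why the mask matters, assuming the check
is of the form "all required support bits must be set before OS control
is requested". The bit values are illustrative only (see
include/linux/acpi.h for the real definitions), and the helper name and
the two REQ_SUPPORT_* macros are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; not authoritative. */
#define OSC_PCI_EXT_CONFIG_SUPPORT	0x01u
#define OSC_PCI_ASPM_SUPPORT		0x02u
#define OSC_PCI_CLOCK_PM_SUPPORT	0x04u
#define OSC_PCI_MSI_SUPPORT		0x10u

/* Required mask before the change above (MSI support is mandatory). */
#define REQ_SUPPORT_WITH_MSI	(OSC_PCI_EXT_CONFIG_SUPPORT \
				| OSC_PCI_ASPM_SUPPORT \
				| OSC_PCI_CLOCK_PM_SUPPORT \
				| OSC_PCI_MSI_SUPPORT)

/* Required mask after the change above (MSI support no longer required). */
#define REQ_SUPPORT_NO_MSI	(OSC_PCI_EXT_CONFIG_SUPPORT \
				| OSC_PCI_ASPM_SUPPORT \
				| OSC_PCI_CLOCK_PM_SUPPORT)

/*
 * Hypothetical helper: true when every bit in 'required' is also set in
 * 'support', i.e. the OSC control request would go ahead.
 */
static bool osc_control_would_be_requested(uint32_t support, uint32_t required)
{
	return (support & required) == required;
}
```

With a nomsi boot the support word lacks the MSI bit, so the helper
returns false for REQ_SUPPORT_WITH_MSI and true for REQ_SUPPORT_NO_MSI,
which matches the behaviour described above.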
Thanks,
Feng
>
> Thanks,
>
> Lukas