Message-ID: <20240502214300.GA1547650@bhelgaas>
Date: Thu, 2 May 2024 16:43:00 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: linux-pci@...r.kernel.org
Cc: Mateusz Kaduk <mateusz.kaduk@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, Tj <linux@....tj>,
Andy Shevchenko <andy.shevchenko@...il.com>,
Hans de Goede <hdegoede@...hat.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
stable@...r.kernel.org
Subject: Re: [PATCH 1/1] x86/pci: Skip early E820 check for ECAM region

On Wed, Apr 17, 2024 at 03:40:12PM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@...gle.com>
>
> Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by
> 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the
> Lenovo Legion 9i laptop, that commit removes the area containing ECAM from
> E820, so the early E820 validation started failing and we no longer
> enabled ECAM in the "early MCFG" path.
>
> The lack of ECAM caused many ACPI methods to fail, resulting in the
> embedded controller, PS/2, audio, trackpad, and battery devices not being
> detected. The _OSC method also failed, so Linux could not take control of
> the PCIe hotplug, PME, and AER features:
>
> # pci_mmcfg_early_init()
>
> PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
> PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved)
>
> ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300)
> ACPI: Interpreter enabled
> ACPI: Ignoring error and continuing table load
> ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
> ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
> ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010)
> ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
> ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
> ...
> ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
> acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND)
>
> # pci_mmcfg_late_init()
>
> PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
> PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources
> PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid
> PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS
>
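For anyone checking the addresses in the log: ECAM gives each function 4 KiB
of config space, so each bus takes 1 MiB (32 devices x 8 functions x 4 KiB),
and [bus 00-e0] with base 0xc0000000 ends at 0xce0fffff. A minimal standalone
sketch of that arithmetic follows. The helper name ecam_end() is made up for
illustration and is not a kernel function.

```c
#include <stdint.h>

/* 32 devices * 8 functions * 4 KiB of config space = 1 MiB per bus. */
#define ECAM_BYTES_PER_BUS (1ULL << 20)

/*
 * Hypothetical helper (not kernel code): address of the last byte of the
 * ECAM window, given the MCFG base address for bus 0 and the last bus
 * number the window decodes.
 */
static uint64_t ecam_end(uint64_t base, unsigned int bus_end)
{
	return base + ((uint64_t)bus_end + 1) * ECAM_BYTES_PER_BUS - 1;
}
```

Calling ecam_end(0xc0000000, 0xe0) reproduces the end address printed in the
"PCI: ECAM" lines above.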
> Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02
> resource, but it need not be mentioned in E820, so we shouldn't look at
> E820 to validate the ECAM space described by MCFG.
>
> 946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820
> reserved area") added a sanity check of E820 to work around buggy MCFG
> tables, but that over-aggressive validation causes failures like this one.
>
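To make the failure mode concrete: the early check asks whether the ECAM
window is covered by reserved E820 entries, so once the EfiMemoryMappedIO
area is dropped from E820 there is nothing left for it to match. Below is a
simplified userspace model of that kind of coverage test. The struct and
function names are invented for illustration, the map is assumed to be sorted
by address, and the kernel's own e820__mapped_all() and is_mmconf_reserved()
differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for one firmware memory map entry: [start, end). */
struct map_entry {
	uint64_t start;
	uint64_t end;
	bool reserved;
};

/*
 * Return true if [start, end) is completely covered by reserved entries.
 * Assumes 'map' is sorted by start address and entries do not overlap.
 */
static bool range_fully_reserved(const struct map_entry *map, size_t n,
				 uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++) {
		const struct map_entry *e = &map[i];

		/* Skip entries that are not reserved or do not overlap. */
		if (!e->reserved || e->start >= end || e->end <= start)
			continue;

		/* A gap before this entry means part of the range is uncovered. */
		if (e->start > start)
			return false;

		/* Covered up to the end of this entry; keep going from there. */
		start = e->end;
		if (start >= end)
			return true;
	}

	return false;
}
```

With a reserved entry spanning the whole window the function returns true;
with that entry removed, or with only part of the window reserved, it returns
false, which corresponds to the "not reserved" case in the early log.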
> Keep the E820 validation check only for older BIOSes (pre-2016) so the
> buggy 2006-era machines don't break. Skip the early E820 check for 2016
> and newer BIOSes.
>
> Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map")
> Reported-by: Mateusz Kaduk <mateusz.kaduk@...il.com>
> Reported-by: Arul <...>
> Reported-by: Imcarneiro91 <...>
> Reported-by: Aman <...>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444
> Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
> Tested-by: Mateusz Kaduk <mateusz.kaduk@...il.com>
> Cc: stable@...r.kernel.org
I applied this to pci/enumeration for v6.10. Thanks, everybody, for
your testing and review.
> ---
> arch/x86/pci/mmconfig-shared.c | 35 +++++++++++++++++++++++++++-------
> 1 file changed, 28 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
> index 0cc9520666ef..53c7afa606c3 100644
> --- a/arch/x86/pci/mmconfig-shared.c
> +++ b/arch/x86/pci/mmconfig-shared.c
> @@ -518,7 +518,34 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
>  {
>  	struct resource *conflict;
>
> -	if (!early && !acpi_disabled) {
> +	if (early) {
> +
> +		/*
> +		 * Don't try to do this check unless configuration type 1
> +		 * is available. How about type 2?
> +		 */
> +
> +		/*
> +		 * 946f2ee5c731 ("Check that MCFG points to an e820
> +		 * reserved area") added this E820 check in 2006 to work
> +		 * around BIOS defects.
> +		 *
> +		 * Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be
> +		 * reserved by a PNP0C02 resource, but it need not be
> +		 * mentioned in E820. Before the ACPI interpreter is
> +		 * available, we can't check for PNP0C02 resources, so
> +		 * there's no reliable way to verify the region in this
> +		 * early check. Keep it only for the old machines that
> +		 * motivated 946f2ee5c731.
> +		 */
> +		if (dmi_get_bios_year() < 2016 && raw_pci_ops)
> +			return is_mmconf_reserved(e820__mapped_all, cfg, dev,
> +						  "E820 entry");
> +
> +		return true;
> +	}
> +
> +	if (!acpi_disabled) {
>  		if (is_mmconf_reserved(is_acpi_reserved, cfg, dev,
>  				       "ACPI motherboard resource"))
>  			return true;
> @@ -554,12 +581,6 @@ static bool __ref pci_mmcfg_reserved(struct device *dev,
>  	if (pci_mmcfg_running_state)
>  		return true;
>
> -	/* Don't try to do this check unless configuration
> -	   type 1 is available. how about type 2 ?*/
> -	if (raw_pci_ops)
> -		return is_mmconf_reserved(e820__mapped_all, cfg, dev,
> -					  "E820 entry");
> -
>  	return false;
>  }
>
>
> --
> 2.34.1
>
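For reference, the early-path decision in the patch above can be modeled in
userspace as a small truth function. This is only a sketch: the function name
and parameters are hypothetical stand-ins for dmi_get_bios_year(), a non-NULL
raw_pci_ops, and the result of the E820 check. It is not the kernel code
itself.

```c
#include <stdbool.h>

/*
 * Userspace model of the "early" branch of pci_mmcfg_reserved() after the
 * patch. All three inputs are illustrative stand-ins:
 *   bios_year      - what dmi_get_bios_year() would return
 *   have_type1     - whether raw_pci_ops is set (config type 1 usable)
 *   e820_reserved  - whether the E820 check would pass for the ECAM window
 */
static bool early_ecam_accepted(int bios_year, bool have_type1,
				bool e820_reserved)
{
	/* Old firmware: keep the 2006-era E820 sanity check. */
	if (bios_year < 2016 && have_type1)
		return e820_reserved;

	/* 2016 and newer (or no type 1 access): trust MCFG at this stage. */
	return true;
}
```

On a 2024 BIOS the E820 result no longer matters in the early path, while a
2006 BIOS with type 1 access still has its MCFG entry rejected when E820 does
not cover it.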