Message-ID: <94238be8-023e-a70a-45c8-a7096149e752@redhat.com>
Date:   Sat, 7 May 2022 12:09:03 +0200
From:   Hans de Goede <hdegoede@...hat.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     "Rafael J . Wysocki" <rafael@...nel.org>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Myron Stowe <myron.stowe@...hat.com>,
        Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin" <hpa@...or.com>,
        Benoit Grégoire <benoitg@...us.ca>,
        Hui Wang <hui.wang@...onical.com>, linux-acpi@...r.kernel.org,
        linux-pci@...r.kernel.org, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 1/1] x86/PCI: Ignore E820 reservations for bridge
 windows on newer systems

Hi Bjorn,

On 5/6/22 18:51, Bjorn Helgaas wrote:
> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>> Some BIOS-es contain bugs where they add addresses which are already
>> used in some other manner to the PCI host bridge window returned by
>> the ACPI _CRS method. To avoid this Linux by default excludes
>> E820 reservations when allocating addresses since 2010, see:
>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>> space").
>>
>> Recently (2019) some systems have shown-up with E820 reservations which
>> cover the entire _CRS returned PCI bridge memory window, causing all
>> attempts to assign memory to PCI BARs which have not been setup by the
>> BIOS to fail. For example here are the relevant dmesg bits from a
>> Lenovo IdeaPad 3 15IIL 81WE:
>>
>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>
>> The ACPI specifications appear to allow this new behavior:
>>
>> The relationship between E820 and ACPI _CRS is not really very clear.
>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
>>
>>   This range of addresses is in use or reserved by the system and is
>>   not to be included in the allocatable memory pool of the operating
>>   system's memory manager.
>>
>> and it may be used when:
>>
>>   The address range is in use by a memory-mapped system device.
>>
>> Furthermore, sec 15.2 says:
>>
>>   Address ranges defined for baseboard memory-mapped I/O devices, such
>>   as APICs, are returned as reserved.
>>
>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
>> and its apertures are in use and certainly should not be included in
>> the general allocatable pool, so the fact that some BIOS-es reports
>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
>>
>> So it seems that the excluding of E820 reserved addresses is a mistake.
>>
>> Ideally Linux would fully stop excluding E820 reserved addresses,
>> but then various old systems will regress.
>> Instead keep the old behavior for old systems, while ignoring
>> the E820 reservations for any systems from now on.
>>
>> Old systems are defined here as BIOS year < 2018, this was chosen to
>> make sure that pci_use_e820 will not be set on the currently affected
>> systems, the oldest known one is from 2019.
>>
>> Testing has shown that some newer systems also have a bad _CRS return.
>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
>> from the bridge window on these systems.
>>
>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
>> the BIOS year + DMI matching logic.
>>
>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>> BugLink: https://bugs.launchpad.net/bugs/1878279
>> BugLink: https://bugs.launchpad.net/bugs/1931715
>> BugLink: https://bugs.launchpad.net/bugs/1932069
>> BugLink: https://bugs.launchpad.net/bugs/1921649
>> Cc: Benoit Grégoire <benoitg@...us.ca>
>> Cc: Hui Wang <hui.wang@...onical.com>
>> Signed-off-by: Hans de Goede <hdegoede@...hat.com>
> 
>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>> +	 * various old systems will regress. Instead keep the old behavior for
>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>> +	 */
>> +	if (year >= 0 && year < 2018)
>> +		pci_use_e820 = true;
> 
> How did you pick 2018?  Prior to this patch, we used E820 reservations
> for all machines.  This patch would change that for 2019-2022
> machines, so there's a risk of breaking some of them.

Correct. I picked 2018 because the first devices on which using E820
reservations causes issues (I2C controller not getting resources,
leading to a non-working touchpad; Thunderbolt hotplug issues) have
BIOS dates starting in 2019. I added a year of margin, so we could make
this 2019.

> I'm hesitant about changing the behavior for machines already in the
> field because if they were tested at all with Linux, it was without
> this patch.  So I would lean toward preserving the current behavior
> for BIOS year < 2023.

I see. I presume the idea is then to use DMI quirks to disable E820
clipping on current devices where it is known to cause problems?

So for v8 I would:

1. Change the cut-off check to < 2023
2. Drop the DMI quirks I added for models which are known to need E820
   clipping but were not covered by the < 2018 check (the < 2023 check
   covers them)
3. Add DMI quirks for models for which it is known that we must _not_
   do E820 clipping

Is this the direction you want to go / does that sound right?

Note that the DMI list for 3. will very likely be incomplete initially,
but once we have settled on this approach I can ask around for testing
and do one or more follow-up patches to extend the list.
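
To make the proposed v8 logic concrete, here is a minimal userspace
sketch of the decision order: BIOS year cut-off first, then a DMI-style
"must not clip" quirk, then the pci=use_e820 / pci=no_e820 overrides.
All names here (pci_should_use_e820, struct no_e820_quirk,
enum e820_override) are hypothetical and only illustrate the idea; this
is not the actual v8 code, and the vendor/product strings a real quirk
would match are not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; an illustration, not kernel code. */
enum e820_override { E820_AUTO, E820_FORCE_USE, E820_FORCE_IGNORE };

/* Proposed cut-off: BIOS years before this keep the old behavior. */
#define E820_CUTOFF_YEAR 2023

/* Stand-in for a DMI quirk entry: a model that must NOT get E820 clipping. */
struct no_e820_quirk {
	const char *sys_vendor;
	const char *product_name;
};

/* Return true if vendor/product matches an entry in the quirk table. */
static bool model_needs_no_e820(const struct no_e820_quirk *quirks, size_t n,
				const char *vendor, const char *product)
{
	assert(quirks && vendor && product);

	for (size_t i = 0; i < n; i++) {
		if (!strcmp(quirks[i].sys_vendor, vendor) &&
		    !strcmp(quirks[i].product_name, product))
			return true;
	}
	return false;
}

/*
 * Decide whether E820 reservations are excluded from the bridge windows.
 * A negative bios_year is treated as "unknown", mirroring the
 * "year >= 0" test in the patch.
 */
static bool pci_should_use_e820(int bios_year, bool quirk_no_e820,
				enum e820_override override)
{
	bool use = false;

	/* 1. Old BIOS: keep the historical behavior. */
	if (bios_year >= 0 && bios_year < E820_CUTOFF_YEAR)
		use = true;

	/* 2. Known-affected model: never clip, regardless of BIOS year. */
	if (quirk_no_e820)
		use = false;

	/* 3. An explicit command-line override always wins. */
	if (override == E820_FORCE_USE)
		use = true;
	else if (override == E820_FORCE_IGNORE)
		use = false;

	return use;
}
```

With this ordering a 2019 machine keeps E820 clipping unless it is
listed in the quirk table or the user passes pci=no_e820.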


>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 9e1e6b8d8876..7e6f79aab6a8 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -595,6 +595,12 @@ char *__init pcibios_setup(char *str)
>>  	} else if (!strcmp(str, "nocrs")) {
>>  		pci_probe |= PCI_ROOT_NO_CRS;
>>  		return NULL;
>> +	} else if (!strcmp(str, "use_e820")) {
>> +		pci_probe |= PCI_USE_E820;
> 
> I think we should add_taint(TAINT_FIRMWARE_WORKAROUND) for both these
> cases.

Ok, I'll add this for v8.
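
Roughly what I have in mind, as a userspace sketch only: both new
options set their flag and also record the firmware-workaround taint.
The flag values and mark_firmware_workaround() are placeholders for
illustration; in the kernel this would be a call to add_taint() with
TAINT_FIRMWARE_WORKAROUND (which also takes a lockdep argument) inside
pcibios_setup().

```c
#include <stdbool.h>
#include <string.h>

/* Placeholder bit values for illustration, not the kernel's definitions. */
#define PCI_USE_E820 0x1u
#define PCI_NO_E820  0x2u

static unsigned int pci_probe_flags;
static bool firmware_workaround_tainted;

/* Userspace stand-in for tainting with TAINT_FIRMWARE_WORKAROUND. */
static void mark_firmware_workaround(void)
{
	firmware_workaround_tainted = true;
}

/* Returns true when str was one of the two E820 options and was consumed. */
static bool parse_pci_e820_option(const char *str)
{
	if (!strcmp(str, "use_e820")) {
		pci_probe_flags |= PCI_USE_E820;
		mark_firmware_workaround();
		return true;
	}
	if (!strcmp(str, "no_e820")) {
		pci_probe_flags |= PCI_NO_E820;
		mark_firmware_workaround();
		return true;
	}
	return false;	/* not ours; leave the taint state untouched */
}
```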

> 
> We probably should do it for *all* the parameters here, but that would
> be a separate discussion.
> 
>> +		return NULL;
>> +	} else if (!strcmp(str, "no_e820")) {
>> +		pci_probe |= PCI_NO_E820;
>> +		return NULL;
>>  #ifdef CONFIG_PHYS_ADDR_T_64BIT
>>  	} else if (!strcmp(str, "big_root_window")) {
>>  		pci_probe |= PCI_BIG_ROOT_WINDOW;
>> -- 
>> 2.36.0
>>
> 


Regards,

Hans

