lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <039f9e8d-6e29-0288-606a-1d298e026c97@redhat.com>
Date:   Mon, 14 Feb 2022 13:42:29 +0100
From:   Hans de Goede <hdegoede@...hat.com>
To:     Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Myron Stowe <myron.stowe@...hat.com>,
        Juha-Pekka Heikkila <juhapekka.heikkila@...il.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        linux-acpi <linux-acpi@...r.kernel.org>,
        Linux PCI <linux-pci@...r.kernel.org>, x86@...nel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Benoit Grégoire <benoitg@...us.ca>,
        Hui Wang <hui.wang@...onical.com>
Subject: Re: [5.17 regression] "x86/PCI: Ignore E820 reservations for bridge
 windows on newer systems" breaks suspend/resume

Hi,

On 2/10/22 07:39, Mika Westerberg wrote:
> Hi Hans,
> 
> On Wed, Feb 09, 2022 at 05:08:13PM +0100, Hans de Goede wrote:
>> As mentioned in my email from 10 seconds ago I think a better simpler
>> fix would be to just do:
>>
>> diff --git a/arch/x86/kernel/resource.c b/arch/x86/kernel/resource.c
>> index 9b9fb7882c20..18656f823764 100644
>> --- a/arch/x86/kernel/resource.c
>> +++ b/arch/x86/kernel/resource.c
>> @@ -28,6 +28,10 @@ static void remove_e820_regions(struct resource *avail)
>>  	int i;
>>  	struct e820_entry *entry;
>>  
>> +	/* Only remove E820 reservations on classic BIOS boot */
>> +	if (efi_enabled(EFI_MEMMAP))
>> +		return;
>> +
>>  	for (i = 0; i < e820_table->nr_entries; i++) {
>>  		entry = &e820_table->entries[i];
>>  
>>
>> I'm curious what you think of that?
> 
> I'm not an expert in this e820 stuff but this one looks really simple
> and makes sense to me. So definitely should go with it assuming there
> are no objections from the x86 maintainers.

Unfortunately with this suspend/resume is still broken on the ThinkPad X1 carbon gen 2 of the reporter reporting the regression. The reporter has been kind enough to also test in EFI mode (at my request) and then the problem is back again with this patch. So just differentiating between EFI / non EFI mode is not an option.

FYI, here is what I believe is the root-cause of the issue on the ThinkPad X1 carbon gen 2:

The E820 reservations table has the following in both BIOS and EFI boot modes:

[    0.000000] BIOS-e820: [mem 0x00000000dceff000-0x00000000dfa0ffff] reserved

Which has a small overlap with:

[    0.884684] pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]

This leads to the following difference in assignments of PCI resources when honoring E820 reservations

[    0.966573] pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfb00000-0xdfcfffff]
[    0.966698] pci_bus 0000:02: resource 1 [mem 0xdfb00000-0xdfcfffff]

vs the following when ignoring E820 reservations:

[    0.966850] pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfa00000-0xdfbfffff]
[    0.966973] pci_bus 0000:02: resource 1 [mem 0xdfa00000-0xdfbfffff]

And the overlap of 0xdfa00000-0xdfa0ffff from the e820 reservations seems to be what is causing the suspend/resume issue.

###

As already somewhat discussed, I'll go and prepare this solution instead:

1. Add E820_TYPE_MMIO to enum e820_type and modify the 2 places which check for
   type == reserved to treat this as reserved too, so as to not have any
   functional changes there

2. Modify the code building e820 tables from the EFI memmap to use
   E820_TYPE_MMIO for MMIO EFI memmap entries.

3. Modify arch/x86/kernel/resource.c: remove_e820_regions() to skip
   e820 table entries with a type of E820_TYPE_MMIO,
   this would actually be a functional change and should fix the
   issues we are trying to fix.

Regards,

Hans

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ