lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 26 Jan 2024 12:32:34 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
 "Rafael J . Wysocki" <rjw@...ysocki.net>, linux-pci@...r.kernel.org,
 linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] x86/pci: Stop requiring ECAM to be declared in E820,
 ACPI or EFI

On 1/25/2024 18:35, Bjorn Helgaas wrote:
> On Wed, Jan 17, 2024 at 11:53:50AM -0600, Mario Limonciello wrote:
>> On 12/15/2023 16:03, Mario Limonciello wrote:
>>> commit 7752d5cfe3d1 ("x86: validate against acpi motherboard resources")
>>> introduced checks for ensuring that MCFG table also has memory region
>>> reservations to ensure no conflicts were introduced from a buggy BIOS.
>>>
>>> This has proceeded over time to add other types of reservation checks
>>> for ACPI PNP resources and EFI MMIO memory type.  The PCI firmware spec
>>> does say that these checks are only required when the operating system
>>> doesn't comprehend the firmware region:
>>>
>>> ```
>>> If the operating system does not natively comprehend reserving the MMCFG
>>> region, the MMCFG region must be reserved by firmware. The address range
>>> reported in the MCFG table or by _CBA method (see Section 4.1.3) must be
>>> reserved by declaring a motherboard resource. For most systems, the
>>> motherboard resource would appear at the root of the ACPI namespace
>>> (under \_SB) in a node with a _HID of EISAID (PNP0C02), and the resources
>>> in this case should not be claimed in the root PCI bus’s _CRS. The
>>> resources can optionally be returned in Int15 E820h or EFIGetMemoryMap
>>> as reserved memory but must always be reported through ACPI as a
>>> motherboard resource.
>>> ```
>>>
>>> Running this check causes problems with accessing extended PCI
>>> configuration space on OEM laptops that don't specify the region in PNP
>>> resources or in the EFI memory map. That later manifests as problems with
>>> dGPU and accessing resizable BAR. Similar problems don't exist in Windows
>>> 11 with exact same laptop/firmware stack.
>>>
>>> Due to the stability of the Windows ecosystem that x86 machines participate
>>> it is unlikely that using the region specified in the MCFG table as
>>> a reservation will cause a problem. The possible worst circumstance could
>>> be that a buggy BIOS causes a larger hole in the memory map that is
>>> unusable for devices than intended.
>>>
>>> Change the default behavior to keep the region specified in MCFG even if
>>> it's not specified in another source. This is expected to improve
>>> machines that otherwise couldn't access PCI extended configuration space.
>>>
>>> In case this change causes problems, add a kernel command line parameter
>>> that can restore the previous behavior.
>>>
>>> Link: https://members.pcisig.com/wg/PCI-SIG/document/15350
>>>         PCI Firmware Specification 3.3
>>>         Section 4.1.2 MCFG Table Description Note 2
>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>> ---
>>
>> Bjorn,
>>
>> Any thoughts on this version since our last conversation on V1?
> 
> Just to let you know that I'm not ignoring this, and I'm trying to
> formulate a useful response.  

Thanks, I had been wondering.

FYI - we've also added another place to make noise about this ECAM issue 
in AMDGPU.  This warning should go into 6.9:

https://lore.kernel.org/amd-gfx/20240110101319.695169-1-Jun.Ma2@amd.com/

It will at least be interesting to see how many people come out of the 
woodwork to report that new warning.

> mmconfig-shared.c has grown into an
> extremely complicated mess and is a continual source of problems, so
> I'd really like to untangle it a tiny bit if we can.
> 
> One thing is that per spec, ACPI motherboard resources are the ONLY
> way to reserve ECAM space.  I'd like to reclaim a little clarity about
> that and separate out the E820 and EFI stuff as secondary hacks.  But
> there's an insane amount of history that got us here.

I guess you know better than anyone here.  But if my idea is actually 
viable then the E820 and EFI stuff turn into "information only".

> 
> The problem we have to avoid is assigning a BAR that overlaps ECAM.
> We assign BARs for lots of reasons.  dGPU and resizable BARs are
> examples but there are others, like hotplug and SR-IOV.  The fact that
> Windows works is a red herring because it allocates BARs differently.

Have we actually observed a case that assigning the BAR overlaps ECAM 
region thus far or it's hypothetical?

I would think that if the reservation is made by ECAM first, then any 
conflict for any reason that tries to assign it will just get a smaller 
BAR, but not necessarily a functional problem.

But that's also part of why I was thinking kernel command line for us to 
have the escape hatch.
> 
> It's also little bit of a burr under my saddle to add things for a
> problem on unspecified machines, where I don't even know whether the
> machines are already in the field or the firmware could still be
> fixed.

Of the two machines I know of:

* One of them has been fixed by a BIOS change before it's final 
production stage.
* The other is still affected.

Here is the info for the still affected one.  It's been shipping already.

Alienware Alienware m18 R1 AMD/0RU01H, BIOS 1.2.2 04/21/2023

> 
> And of course, if there's any way to solve this safely without adding
> yet another kernel parameter, that would be vastly preferable.

The kernel isn't static though; something we could do is keep the 
parameter around a year or two to get the bug feedback loop of people 
testing it and then rip it out if nothing comes up.

> 
> Sorry, nothing actionable here but wanted to let you know that it's
> keeping me awake at night ;)

:)
> 
> Bjorn
> 
>>> v1->v2:
>>>    * Rebase on pci/next
>>>    * Add an escape hatch
>>>    * Reword commit message
>>> ---
>>>    .../admin-guide/kernel-parameters.txt         |  6 ++++++
>>>    arch/x86/pci/mmconfig-shared.c                | 19 +++++++++++++++----
>>>    2 files changed, 21 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index 65731b060e3f..eacd0c0521c2 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -1473,6 +1473,12 @@
>>>    			(in particular on some ATI chipsets).
>>>    			The kernel tries to set a reasonable default.
>>> +	enforce_ecam_resv [X86]
>>> +			Enforce requiring an ECAM reservation specified in
>>> +			BIOS for PCI devices.
>>> +			This parameter is only valid if CONFIG_PCI_MMCONFIG
>>> +			is enabled.
>>> +
>>>    	enforcing=	[SELINUX] Set initial enforcing status.
>>>    			Format: {"0" | "1"}
>>>    			See security/selinux/Kconfig help text.
>>> diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
>>> index 0cc9520666ef..aee117c6bbf9 100644
>>> --- a/arch/x86/pci/mmconfig-shared.c
>>> +++ b/arch/x86/pci/mmconfig-shared.c
>>> @@ -34,6 +34,15 @@ static DEFINE_MUTEX(pci_mmcfg_lock);
>>>    LIST_HEAD(pci_mmcfg_list);
>>> +static bool enforce_ecam_resv __read_mostly;
>>> +static int __init parse_ecam_options(char *str)
>>> +{
>>> +	enforce_ecam_resv = true;
>>> +
>>> +	return 1;
>>> +}
>>> +__setup("enforce_ecam_resv", parse_ecam_options);
>>> +
>>>    static void __init pci_mmconfig_remove(struct pci_mmcfg_region *cfg)
>>>    {
>>>    	if (cfg->res.parent)
>>> @@ -569,10 +578,12 @@ static void __init pci_mmcfg_reject_broken(int early)
>>>    	list_for_each_entry(cfg, &pci_mmcfg_list, list) {
>>>    		if (!pci_mmcfg_reserved(NULL, cfg, early)) {
>>> -			pr_info("not using ECAM (%pR not reserved)\n",
>>> -				&cfg->res);
>>> -			free_all_mmcfg();
>>> -			return;
>>> +			pr_info("ECAM %pR not reserved, %s\n", &cfg->res,
>>> +				enforce_ecam_resv ? "ignoring" : "using anyway");
>>> +			if (enforce_ecam_resv) {
>>> +				free_all_mmcfg();
>>> +				return;
>>> +			}
>>>    		}
>>>    	}
>>>    }
>>>
>>> base-commit: 67e04d921cb6902e8c2abdbf748279d43f25213e
>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ