Message-ID: <20240429153138.GA681245@bhelgaas>
Date: Mon, 29 Apr 2024 10:31:38 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: PJ Waskiewicz <ppwaskie@...nel.org>
Cc: Dan Williams <dan.j.williams@...el.com>, linux-cxl@...r.kernel.org,
    linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] cxl/acpi.c: Add buggy BIOS hint for CXL ACPI lookup failure

On Sun, Apr 28, 2024 at 10:57:13PM -0700, PJ Waskiewicz wrote:
> On Tue, 2024-04-09 at 08:22 -0500, Bjorn Helgaas wrote:
> > On Sun, Apr 07, 2024 at 02:05:26PM -0700, ppwaskie@...nel.org wrote:
> > > From: PJ Waskiewicz <ppwaskie@...nel.org>
> > >
> > > Currently, Type 3 CXL devices (CXL.mem) can train using host CXL
> > > drivers on Emerald Rapids systems. However, on some production
> > > systems from some vendors, a buggy BIOS exists that improperly
> > > populates the ACPI => PCI mappings.
> >
> > Can you be more specific about what this ACPI => PCI mapping is?
> > If you already know what the problem is, I'm sure this is obvious,
> > but otherwise it's not.
>
> Apologies for the delay in response. Things got a bit busy with travel
> and whatnot...
>
> On one of these particular hosts, in /sys/bus/acpi/devices/ACPI0016:00,
> for example, the UID would be something like CX01. It isn't a u64 at
> all, and there's no atoi() or other conversions that would match what
> the UID should be.
>
> In my case, /sys/bus/acpi/devices/ACPI0016:02/ is my CXL device in
> question. The UID that is presented from enumeration was CX02.
> However, if I scour the CEDT manually, the UID of my particular CXL
> device is really UID 49.
>
> So, if I went from the PCI/CXL device side, and called something along
> the lines of to_cxl_host_bridge() and tried to go from the pci_dev to
> the acpi_handle, I'd get CX02 back. Then trying to use that to call
> acpi_table_parse_cedt() would fail.
>
> The BIOS fix from the vendor corrected the UID enumeration on the ACPI
> side. This allowed things to properly line up when traversing through
> the kernel APIs and parsing the ACPI tables.

IIUC, _HID ACPI0016 indicates a CXL host bridge. ACPI r6.5, sec
6.5.11, says "The _UID object is required in order to allow OSPM to
match entries in the CEDT to devices present in the ACPI namespace."

I don't see anything about a requirement to map an ACPI0016 device to
a PCI device. At least in the non-CXL world, there *is* no way to map
a PNP0A08 device to a PCI device because a host bridge is not a PCI
device itself (it has an unspecified non-PCI primary interface and a
PCI secondary interface).

So from the patch and the ACPI/CXL specs, it looks like the problem
doesn't involve PCI at all; it just looks like an ACPI0016 object is
required to contain a _UID, and on this buggy BIOS it doesn't.

My question was just prompted by the "ACPI => PCI mapping" in the
commit log. Since PCI doesn't seem involved, maybe just drop that
reference?

It's just a buggy BIOS that doesn't supply _UID for an ACPI0016
object, so you can't locate the corresponding CEDT entry, right?
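
For anyone who wants to check a machine for this from userspace, here
is a rough sketch of the _UID-to-CEDT comparison. It is not the
kernel's code. It assumes each ACPI0016 device directory under
/sys/bus/acpi/devices has a "uid" file when the firmware supplies _UID,
and that an integer _UID shows up there as decimal text. The helper
names are made up for illustration, and the set of CEDT host bridge
UIDs (chbs_uids) has to be supplied by hand, e.g. from a decoded CEDT.

```python
from pathlib import Path


def parse_integer_uid(raw):
    """Return the _UID as an int, or None if it is not plain decimal text."""
    text = raw.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    return None


def cxl_host_bridge_uids(acpi_devices="/sys/bus/acpi/devices"):
    """Map each ACPI0016 device name to its raw uid text (None if absent)."""
    result = {}
    for dev in sorted(Path(acpi_devices).glob("ACPI0016:*")):
        uid_file = dev / "uid"
        # Assumption: a missing "uid" file means no _UID was supplied.
        result[dev.name] = uid_file.read_text().strip() if uid_file.is_file() else None
    return result


def check_against_cedt(bridge_uids, chbs_uids):
    """Classify each host bridge by whether its _UID can match a CEDT entry.

    chbs_uids is a set of integer UIDs taken from the CEDT host bridge
    entries; obtaining them is outside the scope of this sketch.
    """
    report = {}
    for name, raw in bridge_uids.items():
        uid = None if raw is None else parse_integer_uid(raw)
        if raw is None:
            report[name] = "no _UID"
        elif uid is None:
            report[name] = "non-integer _UID"
        elif uid in chbs_uids:
            report[name] = "matches CEDT entry"
        else:
            report[name] = "no CEDT entry with this UID"
    return report
```

With the values mentioned earlier in the thread,
check_against_cedt({"ACPI0016:02": "CX02"}, {49}) classifies the bridge
as "non-integer _UID", so it cannot be matched to the CEDT entry with
UID 49. On a live system the first argument would come from
cxl_host_bridge_uids().
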
Bjorn