Message-ID: <aLW6e0tOI4HDSl7u@rric.localdomain>
Date: Mon, 1 Sep 2025 17:23:39 +0200
From: Robert Richter <rrichter@....com>
To: "Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>
Cc: linux-cxl@...r.kernel.org, Davidlohr Bueso <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Jonathan Corbet <corbet@....net>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org,
ALOK TIWARI <alok.a.tiwari@...cle.com>,
Randy Dunlap <rdunlap@...radead.org>,
Gregory Price <gourry@...rry.net>
Subject: Re: [PATCH v4] cxl: docs/driver-api/conventions resolve conflicts
between CFMWS, LMH, Decoders
On 01.09.25 14:22:00, Fabio M. De Francesco wrote:
> Hi Robert,
>
> On Tuesday, August 26, 2025 3:49:58 PM Central European Summer Time Robert Richter wrote:
> > Hi Fabio,
> >
> > questions inline.
> >
> [snip]
> >
> > > +
> > > +On these systems, BIOS publishes CFMWS to communicate the active System
> > > +Physical Address (SPA) ranges that map to a subset of the Host Physical
> > > +Address (HPA) ranges. The SPA range trims out the hole, resulting in lost
> > > +capacity in the endpoint with no SPA to map to the CXL HPA range that
> > > +exceeds the matching CFMWS range.
> > > +
> > > +E.g, a real x86 platform with two CFMWS, 384 GB total memory, and LMH
> > > +starting at 2 GB:
> > > +
> > > +Window | CFMWS Base | CFMWS Size | HDM Decoder Base | HDM Decoder Size | Ways | Granularity
> > > + 0 | 0 GB | 2 GB | 0 GB | 3 GB | 12 | 256
> >
> > Could you explain the zero-base limit and how this is special to LMH
> >
> Linux follows the CXL specs and so it allows the construction of CXL Regions
> and the attachment of HDM Decoders to them only if the Specs are not violated.
>
> This document addresses only one of many possible violations. The proposed
> solution is not general to every possible Memory Hole on purpose.[1]
>
> The proposed strategy wants to allow exceptions only if the CFMWS HPA range
> starts at 0 and ends before 4GB. It only deals with Holes in Low Memory. The
> many other combinations of circumstances that lead to failures are out of the
> scope of this document.
> >
> > or multiple of 3-way configs?
> >
> It applies to all possible NIW configs.
> >
> > What if the HPA range is non-cxl already?
> >
> This solution applies to all CFMWS HPA ranges that start at zero and end
> before 4GB, regardless of the motivation behind the memory reservation.
> >
> > E.g. my system shows this:
> >
> > [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
> > [ 0.000000] BIOS-e820: [mem 0x00000000000a0000-0x00000000000fffff] reserved
> > [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000075b5ffff] usable
> > [ 0.000000] BIOS-e820: [mem 0x0000000075b60000-0x0000000075baafff] ACPI NVS
> > ...
See below, it is still unclear how you are handling this.
> >
> > > + 1 | 4 GB | 380 GB | 0 GB | 380 GB | 12 | 256
> >
> > The EP's HDM decoder's HPA ranges overlap now as both start at 0.
> > Isn't that a spec violation: "Decoder m must cover an HPA range that
> > is below decoder m+1."?
> >
> The HDM Decoder's HPA range in the second line starts at the fourth GB.
> I made a copy/paste mistake and I'll fix it with the next version of this
> patch. Thanks for spotting it.
> >
> > For the second decoder, shouldn't the upper limit be cut at 378 GB
> > (multiple of 3, or 372, multiple of 12)? But since the CFMWS Base is
> > non-zero that range is not detected to cut it?
> >
> Another mistake. Anyway, please notice that all ranges above 4GB are
> out of the scope of this document. On purpose.
Let's see your fixes.
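For reference, the arithmetic behind the sizes discussed above can be
written down as follows. This is a minimal sketch, assuming only the
window size rule of NIW * 256 MB quoted in this thread. The helper names
(cfmws_align, size_round_down, size_round_up) are hypothetical and are
not taken from the CXL driver.

```c
#include <stdint.h>

#define SZ_256M (256ULL << 20)
#define SZ_1G   (1ULL << 30)

/* Alignment unit of the window size rule: NIW * 256 MB. */
static uint64_t cfmws_align(unsigned int niw)
{
	return (uint64_t)niw * SZ_256M;
}

/* Largest multiple of NIW * 256 MB that does not exceed size. */
static uint64_t size_round_down(uint64_t size, unsigned int niw)
{
	return size - size % cfmws_align(niw);
}

/* Smallest multiple of NIW * 256 MB that is not below size. */
static uint64_t size_round_up(uint64_t size, unsigned int niw)
{
	uint64_t align = cfmws_align(niw);

	return (size + align - 1) / align * align;
}
```

With 12 ways the alignment unit is 3 GB, so a 380 GB range rounds down
to 378 GB, and a 2 GB window rounds up to 3 GB, which matches the 3 GB
decoder size in the first table row.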
> > >
> [snip]
> >
> > > +Detailed Description of the Change
> > > +----------------------------------
> > > +
> > > +The description of the Window Size field in table 9-22 needs to account
> > > +for platforms with Low Memory Holes, where SPA ranges might be subsets of
> > > +the endpoints' HPA. Therefore, it has to be changed to the following:
> > > +
> > > +"The total number of consecutive bytes of HPA this window represents.
> > > +This value shall be a multiple of NIW * 256 MB. On platforms that reserve
Add a line break to mark the text as a special case.
> > > +physical addresses below 4 GB, such as the Low Memory Hole for PCIe MMIO
> > > +on x86 or a requirement for greater than 8-way interleave CXL Regions
> > > +starting at address 0, an instance of CFMWS whose Base HPA is 0 might have
> > > +a window size that doesn't align with the NIW * 256 MB constraint. Note
> > > +that the matching intermediate Switch and Endpoint Decoders' HPA range
> > > +sizes must still align to the above-mentioned rule, but the memory capacity
> > > +that exceeds the CFMWS window size will not be accessible."
> >
> > Have you considered to just allow smaller CFMWS ranges that just cut
> > the boundaries accordingly? That is, just search for a CFMWS range
> > within the EP's HPA ranges (or even multiple CFMWS ranges) and only
> > enable that HPA range? That would be more general and removes some
> > limitations, such as zero-base and below 4 GB only.
> >
> This solution doesn't want to be a general solution for all kinds of Memory
> Holes. Dan has been very clear about cutting out solutions to general cases.
> This solution is limited on purpose.[1]
I still don't get how you set the actual base address if it must be
zero to trigger the quirk. How does this work with SRAT and what about
other memory ranges that are in the same range?
It should be clearly marked as a quirk that only applies under these
specific conditions. The implementation should be separate from the main
path to better isolate the change. That was unclear to me. Still, there
are the questions above.
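To illustrate what an isolated quirk check could look like, here is a
minimal sketch, not the proposed implementation. It assumes the
conditions as stated in this thread: CFMWS base HPA is 0, the window
ends strictly below 4 GB, and the window size is not a multiple of
NIW * 256 MB. The helper name cfmws_is_lmh_quirk is hypothetical and
does not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_256M (256ULL << 20)
#define SZ_4G   (4ULL << 30)

/*
 * Hypothetical helper: returns true only for a window that matches the
 * Low Memory Hole conditions described in this thread. All other
 * windows are left to the regular, spec-conforming path.
 */
static bool cfmws_is_lmh_quirk(uint64_t base, uint64_t size, unsigned int niw)
{
	/* The quirk is limited to windows starting at HPA 0. */
	if (base != 0)
		return false;

	/* Assumption: "ends before 4GB" means strictly below 4 GB. */
	if (size == 0 || size >= SZ_4G)
		return false;

	/* Only a misaligned window size needs the exception. */
	return size % ((uint64_t)niw * SZ_256M) != 0;
}
```

For the table in the patch, the first window (base 0, 2 GB, 12 ways)
would match, while the second window (base 4 GB) would not.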
Thanks,
-Robert
>
> Thanks,
>
> Fabio
>
> [1] https://lore.kernel.org/linux-cxl/67ec4d61c3fd6_288d2947b@dwillia2-xfh.jf.intel.com.notmuch/
>
>
>