[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2791149.btlEUcBR6m@fdefranc-mobl3>
Date: Mon, 25 Nov 2024 18:22:10 +0100
From: "Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>
To: Li Ming <ming4.li@...el.com>, Li Ming <ming4.li@...look.com>
Cc: Davidlohr Bueso <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>, Huang Ying <ying.huang@...el.com>,
Yao Xingtao <yaoxt.fnst@...itsu.com>, linux-kernel@...r.kernel.org,
linux-cxl@...r.kernel.org
Subject:
Re: [PATCH 2/3] cxl/core: Enable Region creation on x86 with Low Memory Hole
On Monday, November 25, 2024 9:42:56 AM GMT+1 Li Ming wrote:
>
> On 11/22/2024 11:51 PM, Fabio M. De Francesco wrote:
> > The CXL Fixed Memory Window Structure (CFMWS) describes zero or more Host
> > Physical Address (HPA) windows that are associated with each CXL Host
> > Bridge. Each window represents a contiguous HPA that may be interleaved
> > with one or more targets (CXL v3.1 - 9.18.1.3).
> >
> > The Low Memory Hole (LMH) of x86 is a range of addresses of physical low
> > memory to which systems cannot send transactions. In some cases the size
> > of that hole is not compatible with the CXL hardware decoder constraint
> > that the size is always aligned to 256M * Interleave Ways.
> >
> > On those systems, BIOS publishes CFMWS which communicate the active System
> > Physical Address (SPA) ranges that map to a subset of the Host Physical
> > Address (HPA) ranges. The SPA range trims out the hole, and capacity in
> > the endpoint is lost with no SPA to map to CXL HPA in that hole.
> >
> > In the early stages of CXL Regions construction and attach on platforms
> > with Low Memory Holes, cxl_add_to_region() fails and returns an error
> > because it can't find any CXL Window that matches a given CXL Endpoint
> > Decoder.
> >
> > Detect Low Memory Holes by comparing Root Decoders and Endpoint Decoders
> > ranges: both must start at physical address 0 and end below 4 GB, while
> > the Root Decoder ranges end at lower addresses than the matching Endpoint
> > Decoders which instead must always respect the above-mentioned CXL hardware
> > decoders HPA alignment constraint.
>
> Hi Fabio,
>
> Here mentions that the end address must be below 4GB, but I don't find any checking about that in the implementation, is it not needed?
>
Hi Ming,
Good question, thanks!
While a first version of arch_match_spa() had a check of 'r2->end < SZ_4G',
I dropped it for the final one. It ended up out of sync with the commit message.
I think that we don't want that check. I'll rework the commit message for v2.
If the hardware decoders HPA ranges extended beyond the end of Low
Memory, the LMH would still need to be detected and the decoders capacity
would still need to be trimmed to match their corresponding CFMWS range end.
>
> [snip]
>
>
> > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> > index ac2c486c16e9..3cb5a69e9731 100644
> > --- a/drivers/cxl/core/region.c
> > +++ b/drivers/cxl/core/region.c
> > @@ -836,8 +836,12 @@ static int match_auto_decoder(struct device *dev, void *data)
> > cxld = to_cxl_decoder(dev);
> > r = &cxld->hpa_range;
> >
> > - if (p->res && p->res->start == r->start && p->res->end == r->end)
> > - return 1;
> > + if (p->res) {
> > + if (p->res->start == r->start && p->res->end == r->end)
> > + return 1;
> > + if (arch_match_region(p, cxld))
> > + return 1;
> > + }
>
> I think that it is better to check LMH cases before checking (p->res->start == r->start && p->res->end == r->end).
> Per the changelog and the implementation of arch_match_region(), below case should fails but current checking will succeed:
> the value of 'p->res->start' is MISALIGNED_CFMWS_RANGE_BASE and the the value of 'p->res->end' is equals to the value of 'r->end'.
>
I think that the expected "normal" case should always come first. In the expected
scenarios the driver deals either with SPA range == HPA range
(e.g, in match_auto_decoder()) or with SPA range that fully contains the HPA range
(match_decoder_by_range()).
If either one of those two cases holds, arch_match_*() helper doesn't need to be
called and the match must succeed.
Besides, other architectures are free to define holes with many constraints that
the driver doesn't want to check first if the expected case is already met.
>
> >
> > return 0;
> > }
> > @@ -1414,7 +1418,8 @@ static int cxl_port_setup_targets(struct cxl_port *port,
> > if (cxld->interleave_ways != iw ||
> > cxld->interleave_granularity != ig ||
> > cxld->hpa_range.start != p->res->start ||
> > - cxld->hpa_range.end != p->res->end ||
> > + (cxld->hpa_range.end != p->res->end &&
> > + !arch_match_region(p, cxld)) ||
> > ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
> > dev_err(&cxlr->dev,
> > "%s:%s %s expected iw: %d ig: %d %pr\n",
> > @@ -1726,6 +1731,7 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> > {
> > struct cxl_endpoint_decoder *cxled = data;
> > struct cxl_switch_decoder *cxlsd;
> > + struct cxl_root_decoder *cxlrd;
> > struct range *r1, *r2;
> >
> > if (!is_switch_decoder(dev))
> > @@ -1735,8 +1741,13 @@ static int match_switch_decoder_by_range(struct device *dev, void *data)
> > r1 = &cxlsd->cxld.hpa_range;
> > r2 = &cxled->cxld.hpa_range;
> >
> > - if (is_root_decoder(dev))
> > - return range_contains(r1, r2);
> > + if (is_root_decoder(dev)) {
> > + if (range_contains(r1, r2))
> > + return 1;
> > + cxlrd = to_cxl_root_decoder(dev);
> > + if (arch_match_spa(cxlrd, cxled))
> > + return 1;
>
> Same as above, what will happen if the root decoder's address range still contains endpoint decoder's address range in LMH cases? should fails or succeed?
>
If r1 contains r2, there is no LMH and the driver is dealing with the regular,
expected, case. It must succeed.
Think of the arch_match_*() helpers like something that avoid unwanted
failures if arch permits exceptions. Before returning errors when the "normal"
tests fail, check if the arch allows any exceptions and make the driver
ignore that "strange" SPA/HPA misalignment.
>
> [snip]
>
Thanks,
Fabio
Powered by blists - more mailing lists