Message-ID: <Z-tIkR_bEkDPUyp4@gourry-fedora-PF4VCD3F>
Date: Mon, 31 Mar 2025 21:59:45 -0400
From: Gregory Price <gourry@...rry.net>
To: Robert Richter <rrichter@....com>
Cc: Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Davidlohr Bueso <dave@...olabs.net>, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org,
"Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>,
Terry Bowman <terry.bowman@....com>
Subject: Re: [PATCH v2 10/15] cxl/region: Use root decoders interleaving
parameters to create a region
On Tue, Feb 18, 2025 at 02:23:51PM +0100, Robert Richter wrote:
> @@ -1955,12 +1971,23 @@ static int cxl_port_calc_interleave(struct cxl_port *port,
>  	if (is_cxl_root(port))
>  		return 0;
>
> -	rc = find_pos_and_ways(port, ctx->hpa_range, &parent_pos, &parent_ways);
> +	rc = find_pos_and_ways(port, ctx->hpa_range, &parent_pos, &parent_ways,
> +			       &parent_granularity);
>  	if (rc)
>  		return rc;
>
>  	ctx->pos = ctx->pos * parent_ways + parent_pos;
>
> +	if (ctx->interleave_ways)
> +		ctx->interleave_ways *= parent_ways;
> +	else
> +		ctx->interleave_ways = parent_ways;
> +
> +	if (ctx->interleave_granularity)
> +		ctx->interleave_granularity *= ctx->interleave_ways;
> +	else
> +		ctx->interleave_granularity = parent_granularity;
> +
>  	return ctx->pos;
>  }
>
I have discovered on my Zen5 that either this code is incorrect, or my
decoders are programmed incorrectly.
decoderN.M | iw  ig
----------------------
decoder0.0 |  2  256
decoder1.0 |  1  256
decoder3.0 |  1  256
decoder5.0 |  1  256
decoder6.0 |  1  256
region0    |  2  512  <--- Wrong
*Arch quirk aside*, everything except the region is as expected.
I finally dropped a bunch of hacks from my branch, and my Zen5 stopped
bringing devices up correctly, with the error:
[]cxl region0: pci0000:d2:port1 cxl_port_setup_targets expected
iw: 1 ig: 1024 [... snip ...]
[]cxl region0: pci0000:d2:port1 cxl_port_setup_targets got
iw: 1 ig: 256 [... snip ...]
I sat here scratching my head over how I could possibly end up with an ig
of 1024 given the above set of decoders, until I realized the region
inherited interleave_ways/granularity from the ENDPOINT decoder, which
is not exposed to sysfs.
Had to come back around to realize this patch set adds new
ways/granularity fields to the endpoint decoder.
struct cxl_endpoint_decoder {
	struct cxl_decoder cxld;
	...
	int interleave_ways;
	int interleave_granularity;
};

struct cxl_decoder {
	...
	int interleave_ways;
	int interleave_granularity;
};
1) the cxl_endpoint_decoder descriptor needs to be updated to explain
why these ways/granularity differ from the cxl_decoder inside of the
cxl_endpoint_decoder. This is very, very confusing.
The reason appears to be that the endpoint decoder ways/granularity
is the region ways/granularity. So the endpoint decoder is passing
this information along.
Makes me think the region creation code should call this directly,
rather than all this indirection.
2) This calculation appears to just be plain wrong.
static int cxl_endpoint_decoder_initialize(struct cxl_endpoint_decoder *cxled)
{
	ctx = (struct cxl_interleave_context) {
		.hpa_range = &hpa,
	};
	...
	while (iter && parent) {
		/*
		 * endpoint      host bridge   root
		 * decoder6.0 -> decoder3.0 -> decoder0.0
		 */

		/* Convert interleave settings to next port upstream. */
		rc = cxl_port_calc_interleave(iter, &ctx);
		...
	}
	...
	cxled->interleave_ways = ctx.interleave_ways;
	cxled->interleave_granularity = ctx.interleave_granularity;
}
On my setup, I would expect to iterate decoder3.0 and decoder0.0
decoderN.M | iw  ig
----------------------
decoder0.0 |  2  256
decoder3.0 |  1  256

on entry:          [iw,ig] = [0,0]
after decoder3.0:  [parent_ways, parent_gran] -> [1,256]
after decoder0.0:  [iw * piw, ig * new iw]    -> [2,512]
Looking at a normal system, we'd expect this configuration:
decoderN.M | iw  ig
----------------------
decoder0.0 |  2  256
decoder1.0 |  1  512
decoder3.0 |  1  512
decoder5.0 |  2  256
decoder6.0 |  2  256
The above code produces the following:
  after decoder3.0:  [1,512]
  after decoder0.0:  [2,1024] <--- still wrong
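
For anyone who wants to replay the arithmetic outside the kernel, here is a
minimal userspace sketch of the accumulation in the quoted hunk. The struct
and function names (ileave_ctx, ileave_step, accumulate, walk) are
hypothetical stand-ins, not kernel symbols, and it assumes the walk visits
the host bridge decoder first and the root decoder last, as described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the two fields the patch adds to the context. */
struct ileave_ctx {
	int ways;
	int granularity;
};

/* Hypothetical stand-in for one upstream decoder's [iw, ig] settings. */
struct ileave_step {
	int ways;
	int granularity;
};

/* Mirrors the accumulation added to cxl_port_calc_interleave() in the hunk. */
static void accumulate(struct ileave_ctx *ctx, const struct ileave_step *parent)
{
	assert(parent->ways > 0 && parent->granularity > 0);

	if (ctx->ways)
		ctx->ways *= parent->ways;
	else
		ctx->ways = parent->ways;

	/* Note: this multiplies by the already-updated accumulated ways. */
	if (ctx->granularity)
		ctx->granularity *= ctx->ways;
	else
		ctx->granularity = parent->granularity;
}

/* Apply the steps in order, from the endpoint's parent up to the root. */
static struct ileave_ctx walk(const struct ileave_step *steps, int n)
{
	struct ileave_ctx ctx = { 0, 0 };

	for (int i = 0; i < n; i++) {
		accumulate(&ctx, &steps[i]);
		printf("step %d: [%d,%d]\n", i, ctx.ways, ctx.granularity);
	}
	return ctx;
}
```

Calling walk() with the steps {1,256},{2,256} ends at [2,512], and with
{1,512},{2,256} it ends at [2,1024], matching the two traces above.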
In cxl_port_setup_targets we have this comment:

	if (is_cxl_root(parent_port)) {
		/*
		 * Root decoder IG is always set to value in CFMWS which
		 * may be different than this region's IG.  We can use the
		 * region's IG here since interleave_granularity_store()
		 * does not allow interleaved host-bridges with
		 * root IG != region IG.
		 */
		parent_ig = p->interleave_granularity;
		parent_iw = cxlrd->cxlsd.cxld.interleave_ways;
	}
Can we not just always report the parent ways/granularity, and skip all
the math? We'll always iterate to the root, and that's what we want the
region to match, right?
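
As a rough illustration of what that would compute (a sketch only, with
hypothetical names ileave_cfg and take_topmost that are not kernel symbols):
each step overwrites the result with the parent's settings, so whatever the
last decoder visited reports is what the region gets. Whether that holds for
every topology is exactly the open question here; the sketch only covers the
two configurations listed in this mail.

```c
#include <assert.h>

/* Hypothetical stand-in for one decoder's [iw, ig] settings. */
struct ileave_cfg {
	int ways;
	int granularity;
};

/*
 * Overwrite instead of accumulate: after walking from the endpoint's
 * parent up to the root, the root decoder's settings are what remain.
 */
static struct ileave_cfg take_topmost(const struct ileave_cfg *steps, int n)
{
	struct ileave_cfg out = { 0, 0 };

	assert(n > 0);
	for (int i = 0; i < n; i++)
		out = steps[i];
	return out;
}
```

For both walks above ({1,256},{2,256} and {1,512},{2,256}) this returns
[2,256], i.e. the decoder0.0 settings.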
Better yet, can we not just do this in the region creation code, rather
than having the endpoint carry it through to the region for some reason?
Avoid adding the duplicate ways/granularity fields altogether.
~Gregory