linux-kernel - Re: [PATCH v3 08/11] cxl/region: Implement endpoint decoder address translation


Message-ID: <aMsfWfwMhewTjHD3@gourry-fedora-PF4VCD3F>
Date: Wed, 17 Sep 2025 16:51:37 -0400
From: Gregory Price <gourry@...rry.net>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
Cc: Robert Richter <rrichter@....com>,
	Alison Schofield <alison.schofield@...el.com>,
	Vishal Verma <vishal.l.verma@...el.com>,
	Ira Weiny <ira.weiny@...el.com>,
	Dan Williams <dan.j.williams@...el.com>,
	Dave Jiang <dave.jiang@...el.com>,
	Davidlohr Bueso <dave@...olabs.net>, linux-cxl@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	"Fabio M. De Francesco" <fabio.m.de.francesco@...ux.intel.com>,
	Terry Bowman <terry.bowman@....com>,
	Joshua Hahn <joshua.hahnjy@...il.com>
Subject: Re: [PATCH v3 08/11] cxl/region: Implement endpoint decoder address
 translation

On Mon, Sep 15, 2025 at 11:46:14AM +0100, Jonathan Cameron wrote:
> > +	/*
> > +	 * Since translated addresses include the interleaving
> > +	 * offsets, align the range to 256 MB.
> 
> So we pass in an HPA range without interleaving offsets and get back
> one with them?  Is that unavoidable, or can we potentially push
> this bit into the callback?  Probably with separate callbacks to
> get the interleave details.
> 
> Overall I'm not really following what is going on here.  Maybe
> some ascii art would help?
>

The endpoints in this case are programmed with "normalized" (base-0)
addresses, with a size covering only the memory they provide. As a
result, the decoder interleave settings will always be passthrough
(iw=1, ig=ignored).

This chunk translates the normalized address region to the relevant SPA
region, and translates the IW/IG to what they actually are (i.e. what
they *would have* been on a "normal" system).

This took me a while to work out when I originally reviewed and tested
this set.

Example - this is how you'd expect a real system supported by this code
to be programmed:

region {
    .start = 0x20000000
    .end   = 0x3fffffff
    .iw    = 2
    .ig    = 256
}

endpoint1_decoder {
    .start = 0x0
    .end   = 0xfffffff
    .iw    = 1
    .ig    = 256
}

endpoint2_decoder {
    .start = 0x0
    .end   = 0xfffffff
    .iw    = 1
    .ig    = 256
}

When you do the translation from either decoder's HPA start/end,
you want the following output:

range {
    .start = 0x20000000
    .end   = 0x3fffffff
    .iw    = 2
    .ig    = 256
}

If you assume a "normal" system - these are the settings the decoders
would have been programmed with in the first place.

You have to do the alignment because the translation function may
only work on granularity-aligned addresses.

Example:
endpoint1->to_hpa(0)         => 0x0
endpoint1->to_hpa(0xfffffff) => 0xffffe00
endpoint2->to_hpa(0)         => 0x100
endpoint2->to_hpa(0xfffffff) => 0xfffff00

So this code applies the appropriate alignment and returns the
translated iw/ig for use elsewhere in the stack when validating the rest
of the decoders.
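
To make the alignment step concrete, here is a minimal userspace sketch
of the arithmetic in the quoted hunk. It is not the driver code:
align_down(), align_up(), range_len() and derive_ways() are hypothetical
stand-ins for the kernel's ALIGN_DOWN(), ALIGN(), range_len() and the
inline logic of the patch, and the translated start/end values used in
the note below are made up for illustration.

```c
#include <stdint.h>

#define SZ_256M 0x10000000ULL

/* Inclusive address range, mirroring the kernel's struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Stand-ins for ALIGN_DOWN()/ALIGN(); 'a' must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

static uint64_t range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/*
 * Align a translated range to 256 MB in place and derive the interleave
 * ways from the ratio of the aligned SPA length to the endpoint decoder
 * length. Returns 0 where the patch would warn and return -ENXIO.
 */
static unsigned int derive_ways(struct range *spa, uint64_t len)
{
	uint64_t spa_len;

	spa->start = align_down(spa->start, SZ_256M);
	spa->end = align_up(spa->end, SZ_256M) - 1;

	spa_len = range_len(spa);
	if (!len || !spa_len || spa_len % len)
		return 0;

	return (unsigned int)(spa_len / len);
}
```

Assuming the translation hands back something like 0x20000100 and
0x3ffffe00 for a 256 MB endpoint decoder, derive_ways() rounds the range
out to 0x20000000-0x3fffffff and returns 2, which matches the region in
the example above.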

(haven't gotten to later commits, but iirc it was eventually used)

~Gregory

> > +	 */
> > +	range.start = ALIGN_DOWN(range.start, SZ_256M);
> > +	range.end = ALIGN(range.end, SZ_256M) - 1;
> > +
> > +	spa_len = range_len(&range);
> > +	if (!len || !spa_len || spa_len % len) {
> > +		dev_warn(&port->dev,
> > +			"CXL address translation: HPA range not contiguous: %#llx-%#llx:%#llx-%#llx(%s)\n",
> > +			range.start, range.end, ctx->hpa_range.start,
> > +			ctx->hpa_range.end, dev_name(&cxld->dev));
> > +		return -ENXIO;
> > +	}
> > +
> > +	ways = spa_len / len;
> > +	gran = SZ_256;
> > +