Message-ID: <Z_Siq6JrfST1T7la@gourry-fedora-PF4VCD3F>
Date: Tue, 8 Apr 2025 00:14:35 -0400
From: Gregory Price <gourry@...rry.net>
To: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>
Cc: "lsf-pc@...ts.linux-foundation.org" <lsf-pc@...ts.linux-foundation.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: CXL Boot to Bash - Section 2a (Drivers): CXL Decoder Programming

On Tue, Apr 08, 2025 at 03:10:24AM +0000, Zhijian Li (Fujitsu) wrote:
> >> On 07/03/2025 07:56, Gregory Price wrote:
> >>> What if instead, we had two 256MB endpoints on the same host bridge?
> >>>
> >>> ```
> >>> CEDT
> >>>              Subtable Type : 01 [CXL Fixed Memory Window Structure]
> >>>                   Reserved : 00
> >>>                     Length : 002C
> >>>                   Reserved : 00000000
> >>>        Window base address : 0000000100000000   <- Memory Region
> >>>                Window size : 0000000020000000   <- 512MB
> >>> Interleave Members (2^n) : 00                 <- Not interleaved
> >>>
> >>> Memory Map:
> >>>     [mem 0x0000000100000000-0x0000000120000000] usable  <- SPA
> >>>
> >>> Decoders
> >>>                               decoder0.0
> >>>                     range=[0x100000000, 0x120000000]
> >>>                                   |
> >>>                               decoder1.0
> >>>                     range=[0x100000000, 0x120000000]
> >>>                     /                              \
> >>>               decoder2.0                        decoder3.0
> >>>     range=[0x100000000, 0x110000000]   range=[0x110000000, 0x120000000]
> >>> ```
> >>
> >> It reminds me that during construct_region(), it requires decoder range in the
> >> switch/host-bridge is exact same with the endpoint decoder. see
> >> match_switch_decoder_by_range()
> 
> 
>  From the code, we can infer this point. However, is this just a solution implemented in software,
> or is it explicitly mandated by the CXL SPEC or elsewhere? If you are aware, please let me know.
> 

The description you've quoted here is incorrect, as I didn't fully
understand the correct interleave configuration.  I plan on re-writing
this portion with correct configurations over the next month.

Linux does expect all decoders from root to endpoint to be programmed
with the same range*[2].
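
As a rough illustration of that expectation, here is a minimal Python
sketch that walks a root-to-endpoint decoder chain and reports any decoder
whose range differs from the root's. The `Decoder` type and the
`mismatched_decoders()` helper are hypothetical names made up for this
example; this is not the kernel's match_switch_decoder_by_range(), only
the same idea under the assumption that "same range" means identical
start and end. The ranges are taken from the diagram quoted above.

```python
from typing import List, NamedTuple


class Decoder(NamedTuple):
    """Hypothetical stand-in for a decoder's programmed HPA range."""
    name: str
    start: int
    end: int


def mismatched_decoders(chain: List[Decoder]) -> List[str]:
    """Return the names of decoders whose range differs from the root's.

    chain[0] is assumed to be the root decoder; the rest follow the path
    down to the endpoint.
    """
    root = chain[0]
    return [d.name for d in chain[1:]
            if (d.start, d.end) != (root.start, root.end)]


# Path root -> host bridge -> endpoint as drawn in the quoted diagram.
quoted_chain = [
    Decoder("decoder0.0", 0x100000000, 0x120000000),
    Decoder("decoder1.0", 0x100000000, 0x120000000),
    Decoder("decoder2.0", 0x100000000, 0x110000000),
]

# The same path with every decoder programmed to the root's range.
uniform_chain = [
    Decoder("decoder0.0", 0x100000000, 0x120000000),
    Decoder("decoder1.0", 0x100000000, 0x120000000),
    Decoder("decoder2.0", 0x100000000, 0x120000000),
]

print(mismatched_decoders(quoted_chain))
print(mismatched_decoders(uniform_chain))
```

The quoted layout fails this check at the endpoint decoder, while the
uniform chain passes.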

Please keep an eye on [1] for updates; I won't be updating this thread
with further edits.

> I have been trying for days to find documentary evidence to persuade our firmware team that,
> during device provisioning, the programming of the HDM decoder should adhere to this principle:
> The range in the HDM decoder should be exactly the same between the device and its upstream switch.
> 

In general, this guide does not concern itself with what the spec says
is possible - only with what Linux supports.  If a mechanism described
in the spec isn't supported, the expectation is that an interested
vendor will come along to help support it.

However, the current Linux driver absolutely expects the range in the
HDM decoders to be exactly the same from root to endpoint*.

My reading of the 3.1 spec suggests this is also implied by the
implementation notes at the end of section

8.2.4.20 CXL HDM Decoder Capability Structure

IMPLEMENTATION NOTE
CXL Host Bridge and Upstream Switch Port Decode Flow

IMPLEMENTATION NOTE
Device Decode Logic

The host bridge/USP implementation note describes extracting bits for
routing, while the device decode logic describes active translation from
HPA to DPA.
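
To make the distinction concrete, here is a small Python sketch of the
two roles for a power-of-2 interleave. It is a simplification based on my
reading, not the spec's text: granularity and ways are given as plain
numbers rather than the encoded IG/IW register fields, the decoder base is
assumed to be aligned to granularity * ways, and the function names are
hypothetical.

```python
def _check_pow2(value: int, what: str) -> None:
    """This sketch only covers power-of-2 granularity and ways."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{what} must be a power of two")


def interleave_position(hpa: int, base: int, granularity: int, ways: int) -> int:
    """Routing role (host bridge / USP): pick a target from HPA bits.

    The bits just above the granularity select which target the access is
    forwarded to; the address itself is not modified.
    """
    _check_pow2(granularity, "granularity")
    _check_pow2(ways, "ways")
    offset = hpa - base
    return (offset // granularity) % ways


def hpa_to_dpa_offset(hpa: int, base: int, granularity: int, ways: int) -> int:
    """Translation role (device): drop the interleave bits from the offset.

    The low bits inside one granule are kept, and the bits above the
    interleave selector are shifted down to sit directly above them.
    """
    _check_pow2(granularity, "granularity")
    _check_pow2(ways, "ways")
    offset = hpa - base
    return (offset // (granularity * ways)) * granularity + (offset % granularity)


base = 0x100000000
hpa = base + 0x310          # fourth 256-byte granule, 0x10 bytes in
print(interleave_position(hpa, base, 256, 2))     # routed to position 1
print(hex(hpa_to_dpa_offset(hpa, base, 256, 2)))  # device offset 0x110
```

Both functions work from the same base, which is consistent with every
decoder on the path being programmed with the same range.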

~Gregory

[1] https://gourryinverse.github.io/cxl-boot-to-bash/

* With the exception of Zen5 [2], which I don't recommend you replicate.
[2] https://lore.kernel.org/linux-cxl/20250218132356.1809075-1-rrichter@amd.com/T/#t
