Message-ID: <082de146-d5ed-4b49-ba0f-d6f018436e5b@fujitsu.com>
Date: Wed, 30 Apr 2025 03:24:42 +0000
From: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>
To: Gregory Price <gourry@...rry.net>
CC: "linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>, Jonathan Cameron
<jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>, Alison
Schofield <alison.schofield@...el.com>, Vishal Verma
<vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>, Dan Williams
<dan.j.williams@...el.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] cxl: Allow reprogramming misconfigured hdm decoders
On 30/04/2025 10:17, Gregory Price wrote:
> On Wed, Apr 30, 2025 at 09:29:15AM +0800, Li Zhijian wrote:
>> During kernel boot, the CXL drivers attempt to construct the CXL region
>> according to the pre-programmed (firmware-provisioned) HDM decoders.
>>
>> This construction process can fail for various reasons. In that case,
>> userspace CLIs such as ndctl/cxl can neither destroy nor create regions
>> on the existing decoders.
>>
>> Introduce a new flag, CXL_DECODER_F_NEED_RESET, that tells the driver to
>> reset the decoder during `cxl destroy-region regionN`, so that the region
>> can be created again afterwards.
>>
>
> My best understanding of why this is disallowed is that firmware/bios
> programmed decoders need to be locked because there is an assumption
> that the platform programmed it that way *for a reason* - and that
> changing the programming would break it (cause MCEs for other reasons,
> etc).
Hi Gregory,

Thank you for the feedback. Based on current CXL driver behavior, user-space tools
can already reprogram firmware-provisioned HDM decoders in practice.
For example, after a successful boot, one can destroy the auto-constructed region
via `cxl destroy-region` and then create a new, different region.
This indicates that the kernel does not inherently lock down these decoders.

As for the locking rationale you mentioned, platform vendors can enforce their policies
through mechanisms such as *Lock-On-Commit* in the CXL HDM Decoder n Control Register.

While platform vendors may have valid considerations (as you noted), from a driver and
end-user perspective, depending solely on firmware updates to fix transient failures
is not always practical :).
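
To illustrate the Lock-On-Commit point, here is a minimal standalone sketch of
how a decoder's control register value could be checked for a hardware lock.
It is not taken from the patch or from the driver: the helper names are
hypothetical, and the bit positions (bit 8 Lock On Commit, bit 9 Commit,
bit 10 Committed) are assumed from the CXL specification's HDM Decoder n
Control Register layout and should be verified against the spec revision in use.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions in the HDM Decoder n Control Register.
 * These are illustrative definitions, not the driver's own macros.
 */
#define HDM_CTRL_LOCK_ON_COMMIT (UINT32_C(1) << 8)
#define HDM_CTRL_COMMIT         (UINT32_C(1) << 9)
#define HDM_CTRL_COMMITTED      (UINT32_C(1) << 10)

/*
 * Hypothetical helper: the lock only takes effect once the decoder has
 * actually been committed with Lock-On-Commit set, so both bits must be 1.
 */
static bool hdm_decoder_is_locked(uint32_t ctrl)
{
	uint32_t mask = HDM_CTRL_LOCK_ON_COMMIT | HDM_CTRL_COMMITTED;

	return (ctrl & mask) == mask;
}

/*
 * Hypothetical policy check: a decoder that firmware committed without
 * Lock-On-Commit remains reprogrammable from the hardware's point of view.
 */
static bool hdm_decoder_may_reprogram(uint32_t ctrl)
{
	return !hdm_decoder_is_locked(ctrl);
}
```

Under these assumptions, a firmware-committed decoder with a control value of
0x400 (Committed only) would be treated as reprogrammable, while 0x500
(Committed plus Lock On Commit) would not.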
>
> So the appropriate solution here is for the platform vendor to fix their
> firmware.
>
> But I am not a platform person - so I will defer to them on whether my
> understanding is correct.
Yeah, it's still at the RFC stage, so let's hear from others.
Thanks
Zhijian
>
> ~Gregory
>
>