Message-ID: <038bc7b2-ae01-4d47-aa73-659b9cb8028d@fujitsu.com>
Date: Mon, 7 Apr 2025 02:30:24 +0000
From: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>
To: Dan Williams <dan.j.williams@...el.com>, Ira Weiny <ira.weiny@...el.com>,
	"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>
CC: Jonathan Cameron <jonathan.cameron@...wei.com>, Dave Jiang
	<dave.jiang@...el.com>, Alison Schofield <alison.schofield@...el.com>, Vishal
 Verma <vishal.l.verma@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cxl/acpi: Verify CHBS length for CXL2.0



On 05/04/2025 06:19, Dan Williams wrote:
> Zhijian Li (Fujitsu) wrote:
>>
>>
>> On 27/03/2025 21:36, Dan Williams wrote:
>>> Zhijian Li (Fujitsu) wrote:
>>>>
>>>>
>>>> On 27/03/2025 11:44, Ira Weiny wrote:
>>>>> Li Zhijian wrote:
>>>>>> Per CXL Spec r3.1 Table 9-21, both CXL1.1 and CXL2.0 have defined their
>>>>>> own length, verify it to avoid an invalid CHBS
>>>>>
>>>>>
>>>>> I think this looks fine.  But did a platform have issues with this?
>>>>
>>>> Not really, actually, I discovered it while reviewing the code and
>>>> CXL specification.
>>>>
>>>> Currently, this issue arises only when I inject an incorrect length
>>>> via QEMU environment. Our hardware does not experience this problem.
>>>>
>>>>
>>>>> Does this need to be backported?
>>>> I remain neutral :)
>>>
>>> What does the kernel do with this invalid CHBS from QEMU? I would be
>>> happy to let whatever bad effect from injecting a corrupted CHBS just
>>> happen because there are plenty of ways for QEMU to confuse the kernel
>>> even if the table lengths are correct.
>>>
>>> Unless it has real impact I would rather not touch the kernel for every
>>> possible way that QEMU can make a mistake.
>>
>>
>>
>> Thank you for the feedback.
>>
>> If your earlier comments were specifically about ***backporting*** this patch,
>> I agree there might not be an urgent need for that.
>>
>> However, regarding the discussion on whether this patch should be accepted
>> upstream, TBH, I believe it is necessary.
>>
>> 1. The **CXL Specification (r3.1, Table 9-21)** explicitly defines `length`
>> requirements for CHBS in both CXL 1.1 and CXL 2.0 cases. Failing to
>> validate this field against the spec risks misinterpretation of invalid
>> configurations.
> 
> The point is that the kernel has gotten by without this check and does
> not need to be aggressive. Anything more than this hunk below is
> overkill:


OK, I will update the patch as you suggested.

Thanks
Zhijian

> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index cb14829bb9be..fbcb93e5beb5 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -482,6 +482,10 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
>              chbs->length != CXL_RCRB_SIZE)
>                  return 0;
>   
> +       if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL20 &&
> +           chbs->length != ACPI_CEDT_CHBS_LENGTH_CXL20)
> +               return 0;
> +
>          if (!chbs->base)
>                  return 0;
>   
>> 2. As mentioned in section **2.13.8** of the *CXL Memory Device Software Guide (Rev 1.0)*,
>> it is recommended to verify the CHBS length.
>>
>> While the immediate impact might be limited to edge cases (e.g., incorrect QEMU configurations),
>> upstreaming this aligns the kernel with spec-mandated checks and improves
>> robustness for future use cases.
> 
> What set me off was that this patch was:
> 
>   1 file changed, 29 insertions(+), 16 deletions(-)
> 
> ...motivated by a buggy QEMU configuration that the kernel has been
> fine without defending against for years. So the check has
> literally not mattered in practice for a long time.
> 
> I think it is ok to do that minimal validation I suggest above to pair
> with the v1.1 length check, but in general there are more ways than the
> length to produce a broken CHBS and I do not want to encourage a
> cxl_chbs_verify() approach to gather more and more theoretical checks
> unless and until we start seeing these quirks impacting the kernel in
> production use cases. Buggy QEMU is not a suitable justification for
> code refactoring.
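
For reference, the minimal validation discussed above (the existing v1.1
length check paired with the proposed CXL 2.0 one) can be sketched as a
standalone userspace program. This is only an illustration, not the
kernel code: struct chbs_sketch, chbs_sketch_usable() and
chbs_sketch_selftest() are hypothetical names, and the constant values
are assumed to mirror include/acpi/actbl1.h, so check them against your
tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror include/acpi/actbl1.h; verify against your tree. */
#define ACPI_CEDT_CHBS_VERSION_CXL11 0
#define ACPI_CEDT_CHBS_VERSION_CXL20 1
#define ACPI_CEDT_CHBS_LENGTH_CXL11  0x2000u   /* assumed 8 KiB */
#define ACPI_CEDT_CHBS_LENGTH_CXL20  0x10000u  /* assumed 64 KiB */

/* Hypothetical stand-in for the CHBS entry, reduced to the fields checked. */
struct chbs_sketch {
	uint32_t cxl_version;
	uint64_t base;
	uint64_t length;
};

/*
 * Returns true when the entry passes the per-version length check and has
 * a non-zero base. Versions other than CXL 1.1 and CXL 2.0 are not
 * length-checked here, matching the minimal scope of the hunk above.
 */
bool chbs_sketch_usable(const struct chbs_sketch *chbs)
{
	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
	    chbs->length != ACPI_CEDT_CHBS_LENGTH_CXL11)
		return false;

	if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL20 &&
	    chbs->length != ACPI_CEDT_CHBS_LENGTH_CXL20)
		return false;

	return chbs->base != 0;
}

/* Small self-check: one valid entry and one with a corrupted length. */
void chbs_sketch_selftest(void)
{
	struct chbs_sketch good = {
		.cxl_version = ACPI_CEDT_CHBS_VERSION_CXL20,
		.base = 0x1000,
		.length = ACPI_CEDT_CHBS_LENGTH_CXL20,
	};
	struct chbs_sketch bad = good;

	bad.length = ACPI_CEDT_CHBS_LENGTH_CXL11; /* wrong size for CXL 2.0 */

	assert(chbs_sketch_usable(&good));
	assert(!chbs_sketch_usable(&bad));
}
```

In the kernel hunk, returning 0 from the iterator skips the entry and
lets iteration continue; the boolean here plays the same role.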
