Message-ID: <9a6f573c-1346-4173-a5f6-27678c011aa8@intel.com>
Date: Thu, 14 Aug 2025 15:59:25 -0700
From: Dave Jiang <dave.jiang@...el.com>
To: dan.j.williams@...el.com, linux-cxl@...r.kernel.org,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: gregkh@...uxfoundation.org, rafael@...nel.org, dakr@...nel.org,
dave@...olabs.net, jonathan.cameron@...wei.com, alison.schofield@...el.com,
vishal.l.verma@...el.com, ira.weiny@...el.com, marc.herbert@...ux.intel.com,
akpm@...ux-foundation.org, david@...hat.com
Subject: Re: [PATCH 3/4] cxl, acpi/hmat: Update CXL access coordinates
directly instead of through HMAT

On 8/14/25 3:33 PM, dan.j.williams@...el.com wrote:
> Dave Jiang wrote:
>> The current implementation of CXL memory hotplug notifier gets called
>> before the HMAT memory hotplug notifier. The CXL driver calculates the
>> access coordinates (bandwidth and latency values) for the CXL end to
>> end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL
>> memory hotplug notifier writes the access coordinates to the HMAT target
>> structs. Then the HMAT memory hotplug notifier is called and it creates
>> the access coordinates for the node sysfs attributes.
>
> Perhaps summarize quickly here the before and after of sysfs, so people
> know if they are impacted by this bug, and backporters can verify they
> fixed it?
ok
>
>> The original intent of the 'ext_updated' flag in HMAT handling code was to
>> stop HMAT memory hotplug callback from clobbering the access coordinates
>> after CXL has injected its calculated coordinates and replaced the generic
>> target access coordinates provided by the HMAT table in the HMAT target
>> structs. However the flag is hacky at best and blocks the updates from
>> other CXL regions that are onlined in the same node later on. Remove the
>> 'ext_updated' flag usage and just update the access coordinates for the
>> nodes directly without touching HMAT target data.
>>
>> The hotplug memory callback ordering is changed. Instead of changing CXL,
>> move HMAT back so there's room for the levels rather than have CXL share
>> the same level as SLAB_CALLBACK_PRI. The change will result in the CXL
>> callback being executed after the HMAT callback.
>>
>> With the change, the CXL hotplug memory notifier runs after the HMAT
>> callback. The HMAT callback will create the node sysfs attributes for
>> access coordinates. The CXL callback will write the access coordinates to
>> the now created node sysfs attributes directly and will not pollute the
>> HMAT target values.
>>
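For readers following the ordering argument, here is a minimal userspace
sketch of a priority-sorted notifier chain. It mirrors the kernel rule that
a callback with a higher priority value runs earlier, so giving HMAT a
higher value than CXL makes the HMAT callback run first. Every name and
both priority values (DEMO_HMAT_PRI, DEMO_CXL_PRI, demo_*) are hypothetical
and chosen for illustration; they are not the values or code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical priorities for illustration only; not the kernel's values. */
enum { DEMO_HMAT_PRI = 6, DEMO_CXL_PRI = 5 };

struct demo_notifier {
    const char *name;
    int priority;
    void (*call)(struct demo_notifier *nb, char *log, size_t len);
    struct demo_notifier *next;
};

/* Insert so the list stays sorted by descending priority. */
static void demo_chain_register(struct demo_notifier **head,
                                struct demo_notifier *nb)
{
    while (*head && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Walk the chain front to back, i.e. highest priority first. */
static void demo_chain_call(struct demo_notifier *head, char *log, size_t len)
{
    for (; head; head = head->next)
        head->call(head, log, len);
}

/* Append "<name> " to the log, truncating rather than overflowing. */
static void demo_log_name(struct demo_notifier *nb, char *log, size_t len)
{
    strncat(log, nb->name, len - strlen(log) - 1);
    strncat(log, " ", len - strlen(log) - 1);
}

/* Register CXL first, then HMAT, and record the order they are called in. */
static const char *demo_run(char *log, size_t len)
{
    struct demo_notifier cxl = { "cxl", DEMO_CXL_PRI, demo_log_name, NULL };
    struct demo_notifier hmat = { "hmat", DEMO_HMAT_PRI, demo_log_name, NULL };
    struct demo_notifier *head = NULL;

    assert(len > 0);
    log[0] = '\0';
    demo_chain_register(&head, &cxl);
    demo_chain_register(&head, &hmat);
    demo_chain_call(head, log, len);
    return log;
}
```

Calling demo_run() with a small buffer records the HMAT entry ahead of the
CXL entry even though CXL was registered first, which is the ordering the
commit message relies on: the node sysfs attributes exist by the time the
CXL callback writes to them.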
>> Fixes: debdce20c4f2 ("cxl/region: Deal with numa nodes not enumerated by SRAT")
>
> Why that one and not?
>
> 067353a46d8c cxl/region: Add memory hotplug notifier for cxl region
I think I grabbed the wrong line from 'git blame'.
>
> It is the ext_updated machinery that is the main problem that messes up
> sysfs, right?
>
> ...and per the backport concern this should be cc: stable as well.
>
> Other than that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@...el.com>