Message-ID: <a6e9ef76-cdb4-4dde-a7e9-955549f3a825@amd.com>
Date: Mon, 29 Sep 2025 21:01:39 -0700
From: "Koralahalli Channabasappa, Smita" <skoralah@....com>
To: "Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>
Cc: Davidlohr Bueso <dave@...olabs.net>,
Jonathan Cameron <jonathan.cameron@...wei.com>,
Dave Jiang <dave.jiang@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>,
Dan Williams <dan.j.williams@...el.com>, Matthew Wilcox
<willy@...radead.org>, Jan Kara <jack@...e.cz>,
"Rafael J . Wysocki" <rafael@...nel.org>, Len Brown <len.brown@...el.com>,
Pavel Machek <pavel@...nel.org>, Li Ming <ming.li@...omail.com>,
Jeff Johnson <jeff.johnson@....qualcomm.com>,
Ying Huang <huang.ying.caritas@...il.com>,
"Xingtao Yao (Fujitsu)" <yaoxt.fnst@...itsu.com>,
Peter Zijlstra <peterz@...radead.org>, Greg KH <gregkh@...uxfoundation.org>,
Nathan Fontenot <nathan.fontenot@....com>,
Terry Bowman <terry.bowman@....com>, Robert Richter <rrichter@....com>,
Benjamin Cheatham <benjamin.cheatham@....com>,
PradeepVineshReddy Kodamati <PradeepVineshReddy.Kodamati@....com>
Subject: Re: [PATCH 1/6] dax/hmem, e820, resource: Defer Soft Reserved registration until hmem is ready

Hi Zhijian,

Sorry for the delay here.

On 8/31/2025 7:59 PM, Zhijian Li (Fujitsu) wrote:
>
>
> On 22/08/2025 11:41, Smita Koralahalli wrote:
>> Insert Soft Reserved memory into a dedicated soft_reserve_resource tree
>> instead of the iomem_resource tree at boot.
>>
>> Publishing Soft Reserved ranges into iomem too early causes conflicts with
>> CXL hotplug and region assembly failure, especially when Soft Reserved
>> overlaps CXL regions.
>>
>> Re-inserting these ranges into iomem will be handled in follow-up patches,
>> after ensuring CXL window publication ordering is stabilized and when the
>> dax_hmem is ready to consume them.
>>
>> This avoids trimming or deleting resources later and provides a cleaner
>> handoff between EFI-defined memory and CXL resource management.
>>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>
>> Signed-off-by: Dan Williams <dan.j.williams@...el.com>
>> ---
>> arch/x86/kernel/e820.c | 2 +-
>> drivers/dax/hmem/device.c | 4 +--
>> drivers/dax/hmem/hmem.c | 8 +++++
>> include/linux/ioport.h | 24 +++++++++++++
>> kernel/resource.c | 73 +++++++++++++++++++++++++++++++++------
>> 5 files changed, 97 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
>> index c3acbd26408b..aef1ff2cabda 100644
>> --- a/arch/x86/kernel/e820.c
>> +++ b/arch/x86/kernel/e820.c
>> @@ -1153,7 +1153,7 @@ void __init e820__reserve_resources_late(void)
>> res = e820_res;
>> for (i = 0; i < e820_table->nr_entries; i++) {
>> if (!res->parent && res->end)
>> - insert_resource_expand_to_fit(&iomem_resource, res);
>> + insert_resource_late(res);
>> res++;
>> }
>>
>> diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
>> index f9e1a76a04a9..22732b729017 100644
>> --- a/drivers/dax/hmem/device.c
>> +++ b/drivers/dax/hmem/device.c
>> @@ -83,8 +83,8 @@ static __init int hmem_register_one(struct resource *res, void *data)
>>
>> static __init int hmem_init(void)
>> {
>> - walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
>> - IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
>> + walk_soft_reserve_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM, 0,
>> + -1, NULL, hmem_register_one);
>> return 0;
>> }
>>
>> diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
>> index c18451a37e4f..d5b8f06d531e 100644
>> --- a/drivers/dax/hmem/hmem.c
>> +++ b/drivers/dax/hmem/hmem.c
>> @@ -73,10 +73,18 @@ static int hmem_register_device(struct device *host, int target_nid,
>> return 0;
>> }
>>
>> +#ifdef CONFIG_EFI_SOFT_RESERVE
>
>
> Note that dax_hmem currently depends on CONFIG_EFI_SOFT_RESERVE, so this conditional check may be redundant.
Removed in v2.
>
>
>
>> + rc = region_intersects_soft_reserve(res->start, resource_size(res),
>> + IORESOURCE_MEM,
>> + IORES_DESC_SOFT_RESERVED);
>> + if (rc != REGION_INTERSECTS)
>> + return 0;
>> +#else
>> rc = region_intersects(res->start, resource_size(res), IORESOURCE_MEM,
>> IORES_DESC_SOFT_RESERVED);
>> if (rc != REGION_INTERSECTS)
>> return 0;
>> +#endif
>>
>
> Additionally, please add a TODO note here (e.g., "Add soft-reserved memory back to iomem").
Added.
>
>
>> id = memregion_alloc(GFP_KERNEL);
>> if (id < 0) {
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index e8b2d6aa4013..889bc4982777 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -232,6 +232,9 @@ struct resource_constraint {
>> /* PC/ISA/whatever - the normal PC address spaces: IO and memory */
>> extern struct resource ioport_resource;
>> extern struct resource iomem_resource;
>> +#ifdef CONFIG_EFI_SOFT_RESERVE
>> +extern struct resource soft_reserve_resource;
>> +#endif
>>
>> extern struct resource *request_resource_conflict(struct resource *root, struct resource *new);
>> extern int request_resource(struct resource *root, struct resource *new);
>> @@ -255,6 +258,22 @@ int adjust_resource(struct resource *res, resource_size_t start,
>> resource_size_t size);
>> resource_size_t resource_alignment(struct resource *res);
>>
>> +
>> +#ifdef CONFIG_EFI_SOFT_RESERVE
>> +static inline void insert_resource_late(struct resource *new)
>> +{
>> + if (new->desc == IORES_DESC_SOFT_RESERVED)
>> + insert_resource_expand_to_fit(&soft_reserve_resource, new);
>> + else
>> + insert_resource_expand_to_fit(&iomem_resource, new);
>> +}
>> +#else
>> +static inline void insert_resource_late(struct resource *new)
>> +{
>> + insert_resource_expand_to_fit(&iomem_resource, new);
>> +}
>> +#endif
>> +
>> /**
>> * resource_set_size - Calculate resource end address from size and start
>> * @res: Resource descriptor
>> @@ -409,6 +428,11 @@ walk_system_ram_res_rev(u64 start, u64 end, void *arg,
>> extern int
>> walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
>> void *arg, int (*func)(struct resource *, void *));
>> +int walk_soft_reserve_res_desc(unsigned long desc, unsigned long flags,
>> + u64 start, u64 end, void *arg,
>> + int (*func)(struct resource *, void *));
>> +int region_intersects_soft_reserve(resource_size_t start, size_t size,
>> + unsigned long flags, unsigned long desc);
>>
>> struct resource *devm_request_free_mem_region(struct device *dev,
>> struct resource *base, unsigned long size);
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index f9bb5481501a..8479a99441e2 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -321,13 +321,14 @@ static bool is_type_match(struct resource *p, unsigned long flags, unsigned long
>> }
>>
>> /**
>> - * find_next_iomem_res - Finds the lowest iomem resource that covers part of
>> - * [@start..@end].
>> + * find_next_res - Finds the lowest resource that covers part of
>> + * [@start..@end].
>> *
>> * If a resource is found, returns 0 and @*res is overwritten with the part
>> * of the resource that's within [@start..@end]; if none is found, returns
>> * -ENODEV. Returns -EINVAL for invalid parameters.
>> *
>> + * @parent: resource tree root to search
>> * @start: start address of the resource searched for
>> * @end: end address of same resource
>> * @flags: flags which the resource must have
>> @@ -337,9 +338,9 @@ static bool is_type_match(struct resource *p, unsigned long flags, unsigned long
>> * The caller must specify @start, @end, @flags, and @desc
>> * (which may be IORES_DESC_NONE).
>> */
>> -static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>> - unsigned long flags, unsigned long desc,
>> - struct resource *res)
>> +static int find_next_res(struct resource *parent, resource_size_t start,
>> + resource_size_t end, unsigned long flags,
>> + unsigned long desc, struct resource *res)
>> {
>> struct resource *p;
>>
>> @@ -351,7 +352,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>>
>> read_lock(&resource_lock);
>>
>> - for_each_resource(&iomem_resource, p, false) {
>> + for_each_resource(parent, p, false) {
>> /* If we passed the resource we are looking for, stop */
>> if (p->start > end) {
>> p = NULL;
>> @@ -382,16 +383,23 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>> return p ? 0 : -ENODEV;
>> }
>>
>> -static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>> - unsigned long flags, unsigned long desc,
>> - void *arg,
>> - int (*func)(struct resource *, void *))
>> +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
>> + unsigned long flags, unsigned long desc,
>> + struct resource *res)
>> +{
>> + return find_next_res(&iomem_resource, start, end, flags, desc, res);
>> +}
>> +
>> +static int walk_res_desc(struct resource *parent, resource_size_t start,
>> + resource_size_t end, unsigned long flags,
>> + unsigned long desc, void *arg,
>> + int (*func)(struct resource *, void *))
>> {
>> struct resource res;
>> int ret = -EINVAL;
>>
>> while (start < end &&
>> - !find_next_iomem_res(start, end, flags, desc, &res)) {
>> + !find_next_res(parent, start, end, flags, desc, &res)) {
>> ret = (*func)(&res, arg);
>> if (ret)
>> break;
>> @@ -402,6 +410,15 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>> return ret;
>> }
>>
>> +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
>> + unsigned long flags, unsigned long desc,
>> + void *arg,
>> + int (*func)(struct resource *, void *))
>> +{
>> + return walk_res_desc(&iomem_resource, start, end, flags, desc, arg, func);
>> +}
>> +
>> +
>> /**
>> * walk_iomem_res_desc - Walks through iomem resources and calls func()
>> * with matching resource ranges.
>> @@ -426,6 +443,26 @@ int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
>> }
>> EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>>
>> +#ifdef CONFIG_EFI_SOFT_RESERVE
>> +struct resource soft_reserve_resource = {
>> + .name = "Soft Reserved",
>> + .start = 0,
>> + .end = -1,
>> + .desc = IORES_DESC_SOFT_RESERVED,
>> + .flags = IORESOURCE_MEM,
>> +};
>> +EXPORT_SYMBOL_GPL(soft_reserve_resource);
>> +
>> +int walk_soft_reserve_res_desc(unsigned long desc, unsigned long flags,
>> + u64 start, u64 end, void *arg,
>> + int (*func)(struct resource *, void *))
>> +{
>> + return walk_res_desc(&soft_reserve_resource, start, end, flags, desc,
>> + arg, func);
>> +}
>> +EXPORT_SYMBOL_GPL(walk_soft_reserve_res_desc);
>> +#endif
>> +
>> /*
>> * This function calls the @func callback against all memory ranges of type
>> * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
>> @@ -648,6 +685,20 @@ int region_intersects(resource_size_t start, size_t size, unsigned long flags,
>> }
>> EXPORT_SYMBOL_GPL(region_intersects);
>>
>> +int region_intersects_soft_reserve(resource_size_t start, size_t size,
>> + unsigned long flags, unsigned long desc)
>
>
> Shouldn't this function be implemented under `#ifdef CONFIG_EFI_SOFT_RESERVE`? Otherwise it may cause compilation failures when the config is disabled.
Fixed it.
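
For readers following the thread, one common shape for this kind of guard is sketched below. It is a userspace illustration, not the actual v2 hunk: the typedef, the single hard-coded Soft Reserved range (SR_START/SR_END), the would_register() helper, and the choice of REGION_DISJOINT as the stub's return value are all assumptions. Only the function name, its signature, and the REGION_* names come from the patch and the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t resource_size_t;	/* stand-in for the kernel typedef */

/* Same names as the kernel's region_intersects() return values. */
enum { REGION_INTERSECTS, REGION_DISJOINT, REGION_MIXED };

/* Uncomment to model a build with CONFIG_EFI_SOFT_RESERVE=y. */
/* #define CONFIG_EFI_SOFT_RESERVE 1 */

#ifdef CONFIG_EFI_SOFT_RESERVE
/* Hypothetical stand-in for the Soft Reserved tree: one fixed range. */
#define SR_START 0x1000ULL
#define SR_END   0x1fffULL

static int region_intersects_soft_reserve(resource_size_t start, size_t size,
					  unsigned long flags,
					  unsigned long desc)
{
	resource_size_t end;

	(void)flags;
	(void)desc;
	assert(size > 0);	/* avoid underflow in the end computation */
	end = start + size - 1;

	if (end < SR_START || start > SR_END)
		return REGION_DISJOINT;
	if (start >= SR_START && end <= SR_END)
		return REGION_INTERSECTS;
	return REGION_MIXED;
}
#else
/* Assumed stub: with no Soft Reserved tree, nothing can intersect. */
static inline int region_intersects_soft_reserve(resource_size_t start,
						 size_t size,
						 unsigned long flags,
						 unsigned long desc)
{
	(void)start;
	(void)size;
	(void)flags;
	(void)desc;
	return REGION_DISJOINT;
}
#endif

/* Mirrors the caller's check in the patch: proceed only on REGION_INTERSECTS. */
static int would_register(resource_size_t start, size_t size)
{
	return region_intersects_soft_reserve(start, size, 0, 0) ==
	       REGION_INTERSECTS;
}
```

With a stub like this the caller needs no #ifdef of its own: when the option is off, the check simply never reports an intersection and the range is skipped.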
Thanks
Smita
>
> Thanks
> Zhijian