[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aMPluIk8EnOuIWbi@li-2b55cdcc-350b-11b2-a85c-a78bff51fc11.ibm.com>
Date: Fri, 12 Sep 2025 11:19:52 +0200
From: Sumanth Korikkar <sumanthk@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm <linux-mm@...ck.org>,
linux-s390@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Dan Williams <dan.j.williams@...el.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>
Subject: Re: [PATCH] resource: Improve child resource handling in
release_mem_region_adjustable()
> > lsmem
> > RANGE SIZE STATE REMOVABLE BLOCK
> > 0x0000000000000000-0x000000014fffffff 5.3G online yes 0-41
> > 0x0000000150000000-0x0000000157ffffff 128M offline 42
> > 0x0000000158000000-0x00000002ffffffff 6.6G online yes 43-95
> >
>
> First of all
>
> 1) How are you triggering this :)
>
> 2) Why do you say "and removing the range" when it's still listed in lsmem
> output.
>
> lsmem will only list present memory block devices. So if it's still listed
> there, nothing was removed. Or is that prior to actually removing it.
>
>
> Is this some powerpc dlpar use case?
Hi David,
I am working on the item related to the last discussion - dynamic
runtime (de)configuration of memory on s390:
https://lore.kernel.org/all/aCx-SJdHd-3Z12af@li-2b55cdcc-350b-11b2-a85c-a78bff51fc11.ibm.com/
I came across the problem when I tried to offline and remove the memory
via /sys/firmware/memory interface.
I have also modified lsmem (not yet upstream) to list deconfigured
memory, which currently appears as offline. An additional "configured"
column is also introduced to show the configuration state, but it is not
displayed here yet (without --output-all).
> > 0x0000000150000000-0x0000000157ffffff 128M offline 42
True, this will not be shown with the master lsmem, since the sysfs
entry is removed after deconfiguration.
> Do we need a Fixes: and CC stable?
It will reference commit 825f787bb496 ("resource: add
release_mem_region_adjustable()"). Since the commit already states
"enhance this logic when necessary," I did not add a Fixes tag.
> > Signed-off-by: Sumanth Korikkar <sumanthk@...ux.ibm.com>
> > ---
> > kernel/resource.c | 44 +++++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 39 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/resource.c b/kernel/resource.c
> > index f9bb5481501a..c329b8a4aa2f 100644
> > --- a/kernel/resource.c
> > +++ b/kernel/resource.c
> > @@ -1388,6 +1388,41 @@ void __release_region(struct resource *parent, resource_size_t start,
> > EXPORT_SYMBOL(__release_region);
> > #ifdef CONFIG_MEMORY_HOTREMOVE
> > +static void append_child_to_parent(struct resource *new_parent, struct resource *new_child)
> > +{
> > + struct resource *child;
> > +
> > + child = new_parent->child;
> > + if (child) {
> > + while (child->sibling)
> > + child = child->sibling;
> > + child->sibling = new_child;
>
> Shouldn't we take care of the address ordering here? I guess this works
> because we process them in left-to-right (lowest-to-highest) address.
__request_resource() adds the child resources in the increasing order.
With that, we dont need to check the ordering again here. True, here we
process the child resources from lowest to highest address.
> > + } else {
> > + new_parent->child = new_child;
> > + }
> > + new_child->parent = new_parent;
> > + new_child->sibling = NULL;
> > +}
> > +
> > +static void move_children_to_parent(struct resource *old_parent,
> > + struct resource *new_parent,
> > + resource_size_t split_addr)
>
> I'd call this "reparent_child_resources". But actually the function is
> weird. Because you only reparents some resources from old to now.
>
> Two questions:
>
> a) Is split_addr really required. Couldn't we derive that from "old_parent"
old_parent->end points to old end range before the split, so I think it
doesnt tell where the split boundary is, until __adjust_resource() is
called. Hence, split_addr was added.
> b) Could we somehow make it clearer (variable names etc) that we are only
> reparenting from "left" to "right" (maybe we can find better names).
>
> Something like
>
> /*
> * Reparent all child resources that no longer belong to "low" after
> * a split to "high". Note that "high" does not have any children,
> * because "low" is the adjusted original resource and "high" is a new
> * resource.
> */
> static void reparent_children_after_split(struct resource *low,
> struct resource *high)
Nice. I will use this convention with split_addr.
> > *
> > * Note:
> > * - Additional release conditions, such as overlapping region, can be
> > * supported after they are confirmed as valid cases.
> > - * - When a busy memory resource gets split into two entries, the code
> > - * assumes that all children remain in the lower address entry for
> > - * simplicity. Enhance this logic when necessary.
> > + * - When a busy memory resource gets split into two entries, its children is
>
> s/is/are/
Will correct it. Thank you
Powered by blists - more mailing lists