Message-ID: <fca5fe72-55a8-456c-a179-56776848091d@redhat.com>
Date: Tue, 20 May 2025 19:55:18 +0200
From: David Hildenbrand <david@...hat.com>
To: Sumanth Korikkar <sumanthk@...ux.ibm.com>
Cc: linux-mm <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
Oscar Salvador <osalvador@...e.de>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
linux-s390 <linux-s390@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 1/4] mm/memory_hotplug: Add interface for runtime
(de)configuration of memory
On 20.05.25 15:06, Sumanth Korikkar wrote:
>> Maybe "standby memory" might make it clearer. The concept is s390x specific,
>> and it will likely stay s390x specific.
>>
>> I like the idea (frontend/tool interface), all we need is a way for these
>> commands to detect ranges and turn them from standby into usable memory.
>>
>>>
>>> The user can still determine the available memory ranges and make them
>>> configurable using tools like lsmem or chmem with this approach, at least
>>> on s390.
>>>
>>>> My thinking was that s390x would expose the standby memory ranges somewhere
>>>> arch specific in sysfs. From there, one could simply trigger the adding
>>>> (maybe specifying e.g., memmap_on_memory) of selected ranges.
>
> Hi David,
Hi!
>
> Sorry for the late reply.
>
> Potential design approach for enabling dynamic configuration and
> deconfiguration of hotplug memory with support for both altmap and
> non-altmap usage.
>
> Introduces flexibility, allowing users to specify at runtime which
> memory ranges should utilize altmap, rather than relying on a static
> system-wide setting that applies uniformly to all hotplugged memory.
>
> Introduce new interface on s390 with the following attributes:
>
> 1) Attribute1:
> /sys/firmware/memory/block_size_bytes
I assume this will be the storage increment size.
>
> 2) Attribute2:
> /sys/firmware/memory/memoryX/config
> echo 0 > /sys/firmware/memory/memoryX/config -> deconfigure memoryX
> echo 1 > /sys/firmware/memory/memoryX/config -> configure memoryX
And these would configure individual storage increments, essentially
calling add_memory() and (if possible because we could offline the
memory) remove_memory().
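To make sure we mean the same semantics, here is a minimal user-space sketch, assuming that writing 1 maps to add_memory() and writing 0 first tries to offline the increment and only then maps to remove_memory(). The class, the "pinned" flag and the chosen errno values are hypothetical; this models behaviour and is not kernel code.

```python
import errno


class StorageIncrement:
    """Hypothetical model of one /sys/firmware/memory/memoryX entry."""

    def __init__(self):
        self.configured = False  # standby until configured
        self.online = False      # only meaningful while configured
        self.pinned = False      # hypothetical: offlining would fail

    def try_offline(self):
        """Model an offline attempt; fails while something pins the memory."""
        if self.pinned:
            return False
        self.online = False
        return True

    def write_config(self, value):
        """Model a write to 'config'; returns 0 or a negative errno."""
        if value == 1:
            self.configured = True    # stands in for add_memory()
            return 0
        if value == 0:
            # Deconfiguring is only possible if the memory can be offlined.
            if self.online and not self.try_offline():
                return -errno.EBUSY   # assumption: error code not settled
            self.configured = False   # stands in for remove_memory()
            return 0
        return -errno.EINVAL
```

In this model a failed deconfigure leaves the increment configured and online, so user space can simply retry later.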
>
> 3) Attribute3:
> /sys/firmware/memory/memoryX/altmap_required
> echo 0 > /sys/firmware/memory/memoryX/altmap_required -> noaltmap
> echo 1 > /sys/firmware/memory/memoryX/altmap_required -> altmap
> echo N > /sys/firmware/memory/memoryX/altmap_required -> variable size
> altmap grouping (possible future requirements),
> where N specifies the number of memory blocks for which the current
> memory block manages the altmap. There are two possibilities here:
> * If the altmap cannot fit entirely within memoryX, it can
> extend into memoryX+1, meaning the altmap metadata will span
> across multiple memory blocks.
> * If the altmap for memory range cannot fit within memoryX,
> then config will return -EINVAL.
Do we really still need this when we can configure/deconfigure?
I mean, on s390x, the most important use case for memmap-on-memory was
not wasting memory for offline memory blocks.
But with a configuration interface like this ... the only benefit is
being able to more-reliably add memory in low-memory conditions. An
unlikely scenario with standby storage IMHO.
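To put rough numbers on the memmap cost, a back-of-the-envelope sketch, assuming 4 KiB pages, a 64-byte struct page (typical, but configuration dependent) and a hypothetical 256 MiB storage increment:

```python
PAGE_SIZE = 4096          # assumption: 4 KiB base pages
STRUCT_PAGE_SIZE = 64     # assumption: typical sizeof(struct page)


def memmap_bytes(block_bytes):
    """Bytes of struct page metadata needed to describe block_bytes of memory."""
    return (block_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE


block = 256 * 1024 * 1024  # hypothetical storage increment size
print(memmap_bytes(block) // (1024 * 1024), "MiB of memmap per increment")
```

Under these assumptions that is 4 MiB per increment, roughly 1.6% of its size. Without memmap-on-memory, that metadata has to be allocated from memory that is already online at configuration time.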
Note that I dislike exposing "altmap" to the user :) Dax calls it
"memmap_on_memory", and it is a device attribute.
As soon as we go down that path we have the complexity of having to
group memory blocks etc, and if we can just not go down that path right
now it will make things a lot simpler.
(especially, as you document above, the semantics become *really* weird)
As yet another point, I am not sure if someone really needs a per-memory
block control of the memmap-on-memory feature.
If we could simplify here, that would be great ...
>
> NOTE: “altmap_required” attribute must be set before setting the block as
> configured via “config” attribute. (Dependency)
>
> 4) Additionally add the patch to check if the memory block is configured
> with altmap or not. Similar to [RFC PATCH 2/4] mm/memory_hotplug: Add
> memory block altmap sysfs attribute.
>
> Most of the code changes will be s390 specific with this interface.
>
> Request your inputs on the potential interface. Thank you.
>
> Other questions:
> 1. I’m just wondering how variable-sized altmap grouping will be
> structured in the future. Is it organized by grouping the memory blocks
> that require altmap, with the first memory block storing the altmap
> metadata for all of them? Or is it possible for the altmap metadata to
> span across multiple memory blocks?
Exactly that is unclear, which is why we should probably avoid doing
that for now. Also, with other developments happening (memdesc), and
the ongoing effort to shrink "struct page", maybe we will not even need
most of this in the future?
>
> 2. OR, will dedicated memory blocks be used exclusively for altmap
> metadata, which the memory blocks requiring altmap would then consume (to
> prevent fragmentation)?
One idea I had was that you would do the add_memory() in bigger granularity.
Then, the memory blocks hosting the memmap would have to get onlined
first. And offlining of them would fail until all dependent ones were
offlined.
That would at least limit the impact.
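A minimal sketch of that ordering rule, assuming each dependent block simply records which block hosts its memmap. All names here are hypothetical; this only models the online/offline constraints, not how the grouping would be set up.

```python
class Block:
    """Hypothetical memory block; 'host' is the block that stores its memmap."""

    def __init__(self, name, host=None):
        self.name = name
        self.host = host          # None: this block hosts its own memmap
        self.online = False
        self.dependents = []
        if host is not None:
            host.dependents.append(self)

    def set_online(self):
        # The block hosting the memmap has to be online first.
        if self.host is not None and not self.host.online:
            return False
        self.online = True
        return True

    def set_offline(self):
        # Offlining the host fails while any dependent block is still online.
        if any(d.online for d in self.dependents):
            return False
        self.online = False
        return True


host = Block("memory0")
dep = Block("memory1", host=host)
print(dep.set_online())   # False: the host is not online yet
print(host.set_online())  # True
```

The host block stays pinned for as long as a single dependent is online, which is the impact that a bigger add_memory() granularity would have to live with.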
Then, the question would be, how could you "group" these memory blocks
from your interface to do a single add_memory() etc.
But again, maybe we can leave that part out for now ...
--
Cheers,
David / dhildenb