[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <748d40cb-35f4-98d6-a940-055de88bbc8b@redhat.com>
Date: Thu, 12 Oct 2023 10:40:50 +0200
From: David Hildenbrand <david@...hat.com>
To: "Verma, Vishal L" <vishal.l.verma@...el.com>,
"Huang, Ying" <ying.huang@...el.com>
Cc: "Hocko, Michal" <mhocko@...e.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"osalvador@...e.de" <osalvador@...e.de>,
"linux-cxl@...r.kernel.org" <linux-cxl@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>,
"aneesh.kumar@...ux.ibm.com" <aneesh.kumar@...ux.ibm.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"jmoyer@...hat.com" <jmoyer@...hat.com>,
"Jonathan.Cameron@...wei.com" <Jonathan.Cameron@...wei.com>
Subject: Re: [PATCH v5 1/2] mm/memory_hotplug: split memmap_on_memory requests
across memblocks
On 12.10.23 07:53, Verma, Vishal L wrote:
> On Mon, 2023-10-09 at 17:04 +0200, David Hildenbrand wrote:
>> On 07.10.23 10:55, Huang, Ying wrote:
>>> Vishal Verma <vishal.l.verma@...el.com> writes:
>>>
>>>> @@ -2167,47 +2221,28 @@ static int __ref try_remove_memory(u64 start, u64 size)
>>>> if (rc)
>>>> return rc;
>>>>
>>>> + mem_hotplug_begin();
>>>> +
>>>> /*
>>>> - * We only support removing memory added with MHP_MEMMAP_ON_MEMORY in
>>>> - * the same granularity it was added - a single memory block.
>>>> + * For memmap_on_memory, the altmaps could have been added on
>>>> + * a per-memblock basis. Loop through the entire range if so,
>>>> + * and remove each memblock and its altmap.
>>>> */
>>>> if (mhp_memmap_on_memory()) {
>>>
>>> IIUC, even if mhp_memmap_on_memory() returns true, it's still possible
>>> that the memmap is put in DRAM after [2/2]. So that,
>>> arch_remove_memory() are called for each memory block unnecessarily. Can
>>> we detect this (via altmap?) and call remove_memory_block_and_altmap()
>>> for the whole range?
>>
>> Good point. We should handle memblock-per-memblock onny if we have to
>> handle the altmap. Otherwise, just call a separate function that doesn't
>> care about -- e.g., called remove_memory_blocks_no_altmap().
>>
>> We could simply walk all memory blocks and make sure either all have an
>> altmap or none has an altmap. If there is a mix, we should bail out with
>> WARN_ON_ONCE().
>>
> Ok I think I follow - based on both of these threads, here's my
> understanding in an incremental diff from the original patches (may not
> apply directly as I've already committed changes from the other bits of
> feedback - but this should provide an idea of the direction) -
>
> ---
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 507291e44c0b..30addcb063b4 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2201,6 +2201,40 @@ static void __ref remove_memory_block_and_altmap(u64 start, u64 size)
> }
> }
>
> +static bool memblocks_have_altmaps(u64 start, u64 size)
> +{
> + unsigned long memblock_size = memory_block_size_bytes();
> + u64 num_altmaps = 0, num_no_altmaps = 0;
> + struct memory_block *mem;
> + u64 cur_start;
> + int rc = 0;
> +
> + if (!mhp_memmap_on_memory())
> + return false;
Probably can remove that, checked by the caller. (or drop the one in the
caller)
> +
> + for (cur_start = start; cur_start < start + size;
> + cur_start += memblock_size) {
> + if (walk_memory_blocks(cur_start, memblock_size, &mem,
> + test_has_altmap_cb))
> + num_altmaps++;
> + else
> + num_no_altmaps++;
> + }
You should do that without the outer loop, by doing the counting in the
callback function instead.
> +
> + if (!num_altmaps && num_no_altmaps > 0)
> + return false;
> +
> + if (!num_no_altmaps && num_altmaps > 0)
> + return true;
> +
> + /*
> + * If there is a mix of memblocks with and without altmaps,
> + * something has gone very wrong. WARN and bail.
> + */
> + WARN_ONCE(1, "memblocks have a mix of missing and present altmaps");
It would be better if we could even make try_remove_memory() fail in
this case.
> + return false;
> +}
> +
> static int __ref try_remove_memory(u64 start, u64 size)
> {
> int rc, nid = NUMA_NO_NODE;
> @@ -2230,7 +2264,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
> * a per-memblock basis. Loop through the entire range if so,
> * and remove each memblock and its altmap.
> */
> - if (mhp_memmap_on_memory()) {
> + if (mhp_memmap_on_memory() && memblocks_have_altmaps(start, size)) {
> unsigned long memblock_size = memory_block_size_bytes();
> u64 cur_start;
>
> @@ -2239,7 +2273,8 @@ static int __ref try_remove_memory(u64 start, u64 size)
> remove_memory_block_and_altmap(cur_start,
> memblock_size);
^ probably cleaner move the loop into remove_memory_block_and_altmap()
and call it remove_memory_blocks_and_altmaps(start, size) instead.
> } else {
> - remove_memory_block_and_altmap(start, size);
> + remove_memory_block_devices(start, size);
> + arch_remove_memory(start, size, NULL);
> }
>
> if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
>
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists