Message-ID: <20251007214412.3832340-1-gourry@gourry.net>
Date: Tue, 7 Oct 2025 17:44:12 -0400
From: Gregory Price <gourry@...rry.net>
To: linux-mm@...ck.org
Cc: corbet@....net,
muchun.song@...ux.dev,
osalvador@...e.de,
david@...hat.com,
akpm@...ux-foundation.org,
hannes@...xchg.org,
laoar.shao@...il.com,
gourry@...rry.net,
brauner@...nel.org,
mclapinski@...gle.com,
joel.granados@...nel.org,
linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org,
Mel Gorman <mgorman@...e.de>,
Michal Hocko <mhocko@...e.com>,
Alexandru Moise <00moses.alexander00@...il.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
David Rientjes <rientjes@...gle.com>
Subject: [PATCH] Revert "mm, hugetlb: remove hugepages_treat_as_movable sysctl"
This reverts commit d6cb41cc44c63492702281b1d329955ca767d399.
This sysctl provides some flexibility between multiple requirements which
are difficult to square without adding significantly more complexity.
1) onlining memory in ZONE_MOVABLE to maintain hotplug compatibility
2) onlining memory in ZONE_MOVABLE to prevent GFP_KERNEL usage
3) passing NUMA structure through to a virtual machine (node0=vnode0,
node1=vnode1) so a guest can make good placement decisions.
4) utilizing 1GB hugepages for VM host memory to reduce TLB pressure
5) managing device memory after init-time to avoid incidental usage
at boot (due to being placed in ZONE_NORMAL), or to provide users
configuration flexibility.
When device-hotplugged memory does not require hot-unplug assurances,
there is no reason to keep otherwise non-migratable hugepages out of
this zone. This allows allocation of 1GB gigantic pages for VMs with
existing mechanisms.
Boot-time CMA is not possible for driver-managed hotplug memory, as CMA
requires the memory to be registered as SystemRAM at boot time.
The restored code lands in different locations than before the removal,
since the surrounding code has moved. The documentation now adds more
context on when this sysctl is useful.
Cc: David Hildenbrand <david@...hat.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Alexandru Moise <00moses.alexander00@...il.com>
Cc: Mike Kravetz <mike.kravetz@...cle.com>
Suggested-by: David Rientjes <rientjes@...gle.com>
Signed-off-by: Gregory Price <gourry@...rry.net>
Link: https://lore.kernel.org/all/20180201193132.Hk7vI_xaU%25akpm@linux-foundation.org/
---
Documentation/admin-guide/sysctl/vm.rst | 31 +++++++++++++++++++++++++
include/linux/hugetlb.h | 4 +++-
mm/hugetlb.c | 9 +++++++
3 files changed, 43 insertions(+), 1 deletion(-)
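Not part of the patch: a minimal user-space sketch of the conditional this
patch changes in htlb_alloc_mask(), for readers who want to see its truth
table in isolation. The SKETCH_* flag values and sketch_htlb_zone_mask() are
hypothetical stand-ins, not kernel definitions. Only the name
hugepages_treat_as_movable is taken from the patch itself.

```c
#include <stdbool.h>

/* Placeholder bits for this sketch only; NOT the kernel's real gfp_t values. */
enum sketch_gfp {
	SKETCH_GFP_HIGHUSER         = 1u << 0,
	SKETCH_GFP_HIGHUSER_MOVABLE = 1u << 1,
};

/* Stand-in for the sysctl-backed global the patch adds (default 0). */
static int hugepages_treat_as_movable;

/*
 * Models the zone-selecting part of htlb_alloc_mask(): "migratable" plays
 * the role of hugepage_movable_supported(h) for a given hstate.
 */
static unsigned int sketch_htlb_zone_mask(bool migratable)
{
	return (migratable || hugepages_treat_as_movable) ?
		SKETCH_GFP_HIGHUSER_MOVABLE : SKETCH_GFP_HIGHUSER;
}
```

With the sysctl at 0, only migratable hstates get the movable mask. With any
non-zero value, every hstate does, including non-migratable gigantic pages.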
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 4d71211fdad8..c9f26cd447d7 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -40,6 +40,7 @@ Currently, these files are in /proc/sys/vm:
- enable_soft_offline
- extfrag_threshold
- highmem_is_dirtyable
+- hugepages_treat_as_movable
- hugetlb_shm_group
- laptop_mode
- legacy_va_layout
@@ -356,6 +357,36 @@ only use the low memory and they can fill it up with dirty data without
any throttling.
+hugepages_treat_as_movable
+==========================
+
+This parameter controls whether otherwise immovable hugepages (e.g. 1GB
+gigantic pages) may be allocated from ZONE_MOVABLE. If set to non-zero,
+gigantic hugepages can be allocated from ZONE_MOVABLE. ZONE_MOVABLE memory
+may be created via the kernel boot parameter `kernelcore` or via memory
+hotplug as discussed in Documentation/admin-guide/mm/memory-hotplug.rst.
+
+Support may depend on specific architecture and/or the hugepage size. If
+a hugepage supports migration, allocation from ZONE_MOVABLE is always
+enabled (for example 2MB on x86) for the hugepage regardless of the value
+of this parameter. IOW, this parameter affects only non-migratable hugepages.
+
+Assuming that hugepages are not migratable in your system, one use case of
+this parameter is to make the hugepage pool more extensible by enabling
+allocation from ZONE_MOVABLE. Because page reclaim, migration, and
+compaction are more effective in ZONE_MOVABLE, contiguous memory is more
+likely to be available there. Note that using ZONE_MOVABLE for non-migratable
+hugepages can harm other features like memory hotremove (because memory
+hotremove expects that memory blocks in ZONE_MOVABLE are always removable),
+so the trade-off is the user's responsibility.
+
+One common use-case of this feature is to allocate 1GB gigantic pages for
+virtual machines from otherwise not-hotplugged memory which has been
+isolated from kernel allocations by being onlined into ZONE_MOVABLE.
+These pages tend to be allocated and released more explicitly, and so
+hotplug can still be achieved with appropriate orchestration.
+
+
hugetlb_shm_group
=================
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 526d27e88b3b..bbaa1b4908b6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -172,6 +172,7 @@ bool hugetlbfs_pagecache_present(struct hstate *h,
struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
+extern int hugepages_treat_as_movable;
extern int sysctl_hugetlb_shm_group;
extern struct list_head huge_boot_pages[MAX_NUMNODES];
@@ -926,7 +927,8 @@ static inline gfp_t htlb_alloc_mask(struct hstate *h)
{
gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
- gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
+ gfp |= (hugepage_movable_supported(h) || hugepages_treat_as_movable) ?
+ GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
return gfp;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 753f99b4c718..4b2213ccbb29 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -55,6 +55,8 @@
#include "hugetlb_cma.h"
#include <linux/page-isolation.h>
+int hugepages_treat_as_movable;
+
int hugetlb_max_hstate __read_mostly;
unsigned int default_hstate_idx;
struct hstate hstates[HUGE_MAX_HSTATE];
@@ -5195,6 +5197,13 @@ static const struct ctl_table hugetlb_table[] = {
.mode = 0644,
.proc_handler = hugetlb_overcommit_handler,
},
+ {
+ .procname = "hugepages_treat_as_movable",
+ .data = &hugepages_treat_as_movable,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
};
static void __init hugetlb_sysctl_init(void)
--
2.51.0