Message-ID: <402170e6-c49f-4d28-a010-eb253fc2f923@redhat.com>
Date: Wed, 8 Oct 2025 10:58:23 +0200
From: David Hildenbrand <david@...hat.com>
To: Gregory Price <gourry@...rry.net>, linux-mm@...ck.org
Cc: corbet@....net, muchun.song@...ux.dev, osalvador@...e.de,
 akpm@...ux-foundation.org, hannes@...xchg.org, laoar.shao@...il.com,
 brauner@...nel.org, mclapinski@...gle.com, joel.granados@...nel.org,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...e.com>,
 Alexandru Moise <00moses.alexander00@...il.com>,
 Mike Kravetz <mike.kravetz@...cle.com>, David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH] Revert "mm, hugetlb: remove hugepages_treat_as_movable
 sysctl"

On 07.10.25 23:44, Gregory Price wrote:
> This reverts commit d6cb41cc44c63492702281b1d329955ca767d399.
> 
> This sysctl provides some flexibility between multiple requirements which
> are difficult to square without adding significantly more complexity.
> 
> 1) onlining memory in ZONE_MOVABLE to maintain hotplug compatibility
> 2) onlining memory in ZONE_MOVABLE to prevent GFP_KERNEL usage
> 3) passing NUMA structure through to a virtual machine (node0=vnode0,
>     node1=vnode1) so a guest can make good placement decisions.
> 4) utilizing 1GB hugepages for VM host memory to reduce TLB pressure
> 5) Managing device memory after init-time to avoid incidental usage
>     at boot (due to being placed in ZONE_NORMAL), or to provide users
>     configuration flexibility.
> 
> When device-hotplugged memory does not require hot-unplug assurances,
> there is no reason to avoid allowing otherwise non-migratable hugepages
> in this zone.  This allows for allocation of 1GB gigantic pages for VMs
> with existing mechanisms.
> 
> Boot-time CMA is not possible for driver-managed hotplug memory, as CMA
> requires the memory to be registered as SystemRAM at boot time.
> 
> Updated the code to land in appropriate locations since it all moved.
> Updated the documentation to add more context when this is useful.
> 
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Mel Gorman <mgorman@...e.de>
> Cc: Michal Hocko <mhocko@...e.com>
> Cc: Alexandru Moise <00moses.alexander00@...il.com>
> Cc: Mike Kravetz <mike.kravetz@...cle.com>
> Suggested-by: David Rientjes <rientjes@...gle.com>
> Signed-off-by: Gregory Price <gourry@...rry.net>
> Link: https://lore.kernel.org/all/20180201193132.Hk7vI_xaU%25akpm@linux-foundation.org/
> ---
>   Documentation/admin-guide/sysctl/vm.rst | 31 +++++++++++++++++++++++++
>   include/linux/hugetlb.h                 |  4 +++-
>   mm/hugetlb.c                            |  9 +++++++
>   3 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index 4d71211fdad8..c9f26cd447d7 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -40,6 +40,7 @@ Currently, these files are in /proc/sys/vm:
>   - enable_soft_offline
>   - extfrag_threshold
>   - highmem_is_dirtyable
> +- hugepages_treat_as_movable
>   - hugetlb_shm_group
>   - laptop_mode
>   - legacy_va_layout
> @@ -356,6 +357,36 @@ only use the low memory and they can fill it up with dirty data without
>   any throttling.
>   
>   
> +hugepages_treat_as_movable
> +==========================
> +
> +This parameter controls whether otherwise immovable hugepages (e.g. 1GB
> +gigantic pages) may be allocated from ZONE_MOVABLE. If set to non-zero,
> +gigantic hugepages can be allocated from ZONE_MOVABLE. ZONE_MOVABLE memory
> +may be created via the kernel boot parameter `kernelcore` or via memory
> +hotplug as discussed in Documentation/admin-guide/mm/memory-hotplug.rst.
> +
> +Support may depend on specific architecture and/or the hugepage size. If
> +a hugepage supports migration, allocation from ZONE_MOVABLE is always
> +enabled (for example 2MB on x86) for the hugepage regardless of the value
> +of this parameter. IOW, this parameter affects only non-migratable hugepages.
> +
> +Assuming that hugepages are not migratable in your system, one usecase of
> +this parameter is that users can make hugepage pool more extensible by
> +enabling the allocation from ZONE_MOVABLE. This is because on ZONE_MOVABLE
> +page reclaim/migration/compaction work more and you can get contiguous
> +memory more likely. Note that using ZONE_MOVABLE for non-migratable
> +hugepages can do harm to other features like memory hotremove (because
> +memory hotremove expects that memory blocks on ZONE_MOVABLE are always
> +removable,) so it's a trade-off responsible for the users.
> +
> +One common use-case of this feature is allocate 1GB gigantic pages for
> +virtual machines from otherwise not-hotplugged memory which has been
> +isolated from kernel allocations by being onlined into ZONE_MOVABLE.
> +These pages tend to be allocated and released more explicitly, and so
> +hotplug can still be achieved with appropriate orchestration.
> +
> +
>   hugetlb_shm_group
>   =================
>   
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 526d27e88b3b..bbaa1b4908b6 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -172,6 +172,7 @@ bool hugetlbfs_pagecache_present(struct hstate *h,
>   
>   struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
>   
> +extern int hugepages_treat_as_movable;
>   extern int sysctl_hugetlb_shm_group;
>   extern struct list_head huge_boot_pages[MAX_NUMNODES];
>   
> @@ -926,7 +927,8 @@ static inline gfp_t htlb_alloc_mask(struct hstate *h)
>   {
>   	gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
>   
> -	gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
> +	gfp |= (hugepage_movable_supported(h) || hugepages_treat_as_movable) ?
> +	       GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;

I mean, this is as ugly as it gets.

Can't we just let that old approach RIP where it belongs? :)

If something is unmovable, it does not belong on ZONE_MOVABLE, as simple as that.

Something I could sympathize with is treating gigantic pages that are actually
migratable as movable.


Like

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 526d27e88b3b2..78da85b1308dd 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -896,37 +896,12 @@ static inline bool hugepage_migration_supported(struct hstate *h)
         return arch_hugetlb_migration_supported(h);
  }
  
-/*
- * Movability check is different as compared to migration check.
- * It determines whether or not a huge page should be placed on
- * movable zone or not. Movability of any huge page should be
- * required only if huge page size is supported for migration.
- * There won't be any reason for the huge page to be movable if
- * it is not migratable to start with. Also the size of the huge
- * page should be large enough to be placed under a movable zone
- * and still feasible enough to be migratable. Just the presence
- * in movable zone does not make the migration feasible.
- *
- * So even though large huge page sizes like the gigantic ones
- * are migratable they should not be movable because its not
- * feasible to migrate them from movable zone.
- */
-static inline bool hugepage_movable_supported(struct hstate *h)
-{
-       if (!hugepage_migration_supported(h))
-               return false;
-
-       if (hstate_is_gigantic(h))
-               return false;
-       return true;
-}
-
  /* Movability of hugepages depends on migration support. */
  static inline gfp_t htlb_alloc_mask(struct hstate *h)
  {
         gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
  
-       gfp |= hugepage_movable_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
+       gfp |= hugepage_migration_supported(h) ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER;
  
         return gfp;
  }


Assume you want to offline part of ZONE_MOVABLE: there might still be sufficient
space to allocate a 1 GiB area elsewhere and actually move the gigantic page.

IIRC, we do the same for memory offlining already.
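
To make the difference concrete, here is a rough userspace model of the two
policies. It is only a sketch, not kernel code: struct hstate_model, the
alloc_policy enum and both helper names are made-up stand-ins for struct
hstate, the GFP masks and hugepage_movable_supported().

```c
/*
 * Userspace model only; nothing here is a kernel API.
 * struct hstate_model, enum alloc_policy and the helper names are
 * hypothetical stand-ins used purely for illustration.
 */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GFP_HIGHUSER vs. GFP_HIGHUSER_MOVABLE. */
enum alloc_policy { ALLOC_HIGHUSER, ALLOC_HIGHUSER_MOVABLE };

struct hstate_model {
	bool migration_supported;	/* models hugepage_migration_supported() */
	bool gigantic;			/* models hstate_is_gigantic() */
};

/* Existing check: gigantic pages are never treated as movable. */
static inline bool movable_supported_current(const struct hstate_model *h)
{
	if (!h->migration_supported)
		return false;
	if (h->gigantic)
		return false;
	return true;
}

/* With the diff above: anything migratable is treated as movable. */
static inline bool movable_supported_proposed(const struct hstate_model *h)
{
	return h->migration_supported;
}

/* Models the mask selection done in htlb_alloc_mask(). */
static inline enum alloc_policy alloc_policy_for(bool movable)
{
	return movable ? ALLOC_HIGHUSER_MOVABLE : ALLOC_HIGHUSER;
}
```

For a migratable gigantic hstate, the first helper selects the non-movable
policy and the second selects the movable one. For an hstate without migration
support, both stay non-movable, which is the point: nothing unmovable ends up
on ZONE_MOVABLE.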


Now, maybe we want to make that configurable. But then, I would much rather tweak the
hstate_is_gigantic() check in hugepage_movable_supported(). And the parameter
would need a much better name than some "treat as movable".
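
Very roughly, and again only as a userspace model rather than kernel code, a
configurable variant could look something like the following. The knob name
model_gigantic_movable is a placeholder I made up for the sketch, not a
proposal for the real parameter name, and struct hstate_model is a stand-in
for struct hstate.

```c
/*
 * Userspace model only; nothing here is a kernel API.
 * model_gigantic_movable is a hypothetical placeholder for whatever
 * the tunable would end up being called.
 */
#include <assert.h>
#include <stdbool.h>

struct hstate_model {
	bool migration_supported;	/* models hugepage_migration_supported() */
	bool gigantic;			/* models hstate_is_gigantic() */
};

/* Hypothetical tunable: 0 keeps gigantic pages out of the movable zone. */
int model_gigantic_movable;

/*
 * Models hugepage_movable_supported() with only the gigantic check
 * made conditional. Migration support remains a hard requirement.
 */
static inline bool movable_supported_tunable(const struct hstate_model *h)
{
	if (!h->migration_supported)
		return false;
	if (h->gigantic && !model_gigantic_movable)
		return false;
	return true;
}
```

The difference from the reverted sysctl is that the knob only relaxes the
gigantic check. An hstate without migration support stays unmovable no matter
how the knob is set.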

-- 
Cheers

David / dhildenb

