Message-ID: <2cb55d76-4da7-4ebe-b23b-0abbc4d963f3@kernel.org>
Date: Mon, 9 Feb 2026 13:44:45 +0100
From: "David Hildenbrand (Arm)" <david@...nel.org>
To: Mike Rapoport <rppt@...nel.org>
Cc: Tianyou Li <tianyou.li@...el.com>, Oscar Salvador <osalvador@...e.de>,
 Wei Yang <richard.weiyang@...il.com>, Michal Hocko <mhocko@...e.com>,
 linux-mm@...ck.org, Yong Hu <yong.hu@...el.com>,
 Nanhai Zou <nanhai.zou@...el.com>, Yuan Liu <yuan1.liu@...el.com>,
 Tim Chen <tim.c.chen@...ux.intel.com>, Qiuxu Zhuo <qiuxu.zhuo@...el.com>,
 Yu C Chen <yu.c.chen@...el.com>, Pan Deng <pan.deng@...el.com>,
 Chen Zhang <zhangchen.kidd@...com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v9 2/2] mm/memory hotplug/unplug: Optimize
 zone->contiguous update when changes pfn range

On 2/9/26 11:52, David Hildenbrand (Arm) wrote:
> On 2/8/26 20:39, Mike Rapoport wrote:
>> On Sat, Feb 07, 2026 at 12:00:09PM +0100, David Hildenbrand (Arm) wrote:
>>>
>>> Thanks for all your work on this and sorry for being slower with
>>> review over the last month.
>>>
>>> While I was in the shower I was thinking about how much I hate
>>> zone->contiguous + the pageblock walking, and how we could just get
>>> rid of it.
>>>
>>> You know, just what you do while having a relaxing shower.
>>>
>>>
>>> And I was wondering:
>>>
>>> (a) in which case would we have zone_spanned_pages == zone_present_pages
>>> and the zone *not* be contiguous? I assume this just cannot happen,
>>> otherwise BUG.
>>>
>>> (b) in which case would we have zone_spanned_pages != zone_present_pages
>>> and the zone still *be* contiguous? I assume in some cases where we have
>>> small holes within a pageblock?
>>>
>>> Reading the doc of __pageblock_pfn_to_page(), there are some weird
>>> scenarios with holes in pageblocks.
>> It seems that "zone->contiguous" is really a bad name for what this thing
>> represents.
>>
>> tl;dr I don't think zone_spanned_pages == zone_present_pages is related
>> to zone->contiguous at all :)
> 
> My point in (a) was that with "zone_spanned_pages == zone_present_pages" 
> there are no holes so -> contiguous.
> 
> (b), and what I said further below, is exactly about memory holes where 
> we have a memmap, but it's not present memory.
> 
>>
>> If you look at pageblock_pfn_to_page() and __pageblock_pfn_to_page(), the
>> check for zone->contiguous should guarantee that the entire pageblock has
>> a valid memory map and that the entire pageblock fits within a single zone
>> and does not cross zone/node boundaries.
> 
> Right. But that must hold for each and every pageblock in the spanned
> zone range for it to be contiguous.
> 
> zone->contiguous tells you "pfn_to_page() is valid on the complete zone
> range".
> 
> That's why set_zone_contiguous() probes __pageblock_pfn_to_page() on each
> and every pageblock.
> 
>>
>> For coldplug memory the memory map is valid for every section that has
>> present memory, i.e. even if there is a hole in a section, its memory map
>> will be populated and will have struct pages.
> 
> There is this sub-section thing, and holes larger than a section might 
> not have a memmap (unless reserved I guess).
> 
>>
>> When zone->contiguous is false, the slow path in __pageblock_pfn_to_page()
>> essentially checks if the first page in a pageblock is online and if the
>> first and last pages are in the zone being compacted.
>> AFAIU, in the hotplug case an entire pageblock is always onlined to the
>> same zone, so zone->contiguous won't change after the hotplug is complete.
> 
> I think you are missing a point: hot(un)plug might create holes in the
> zone span. Then, pfn_to_page() is no longer valid to be called on
> arbitrary pageblocks within the zone.
> 
>>
>> We might set it to false at the beginning of the hotplug to avoid scanning
>> offline pages, although I'm not sure if it's possible.
>>
>> But at the end of hotplug we can simply restore the old value and move on.
> 
> No, you might create holes.
> 
>>
>> For the coldplug case I'm also not sure it's worth the hassle, we could
>> just let compaction scan a few more pfns for those rare weird pageblocks
>> and bail out on wrong page conditions.
> 
> To recap:
> 
> My idea is that "zone_spanned_pages == zone_present_pages" tells you 
> that the zone is contiguous because there are no holes.
> 
> To handle "non-memory with a struct page", you'd have to check
> 
>      "zone_spanned_pages == zone_present_pages +
>           zone_non_present_memmap_pages"
> 
> Or shorter
> 
>      "zone_spanned_pages == zone_pages_with_memmap"
> 
> Then, pfn_to_page() is valid within the complete zone.
> 
> The question is how to best calculate the "zone_pages_with_memmap" 
> during boot.
> 
> During hot(un)plug we only add/remove zone_present_pages. The 
> zone_non_present_memmap_pages will not change due to hot(un)plug later.
> 

The following hack does the trick. But

(a) I wish we could get rid of the pageblock walking in calc_online_pages().
(b) "online_pages" has weird semantics due to the pageblock handling.
     "online_pageblock_pages"? not sure.
(c) Calculating "online_pages" when we know there is a hole does not make sense,
     as we could just keep it 0 if there are holes and simply set
     zone->online_pageblock_pages to zone->spanned_pages in case all are
     online (rough sketch below).
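
For (c), the boot-time part could look roughly like this (untested sketch
only, reusing the calc_online_pages() naming and the online_pages field from
the hack below):

static void calc_online_pages(struct zone *zone)
{
	unsigned long block_start_pfn = zone->zone_start_pfn;
	unsigned long block_end_pfn = pageblock_end_pfn(block_start_pfn);

	for (; block_start_pfn < zone_end_pfn(zone);
	     block_start_pfn = block_end_pfn,
	     block_end_pfn += pageblock_nr_pages) {
		block_end_pfn = min(block_end_pfn, zone_end_pfn(zone));

		/* Any hole in the span: leave online_pages at 0 and bail out. */
		if (!__pageblock_pfn_to_page(block_start_pfn,
					     block_end_pfn, zone))
			return;
		cond_resched();
	}

	/* No holes: pfn_to_page() is valid on the complete zone span. */
	zone->online_pages = zone->spanned_pages;
}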


 From d4cb825e91a6363afc68fb994c5d9b29c38c5f42 Mon Sep 17 00:00:00 2001
From: "David Hildenbrand (Arm)" <david@...nel.org>
Date: Mon, 9 Feb 2026 13:40:24 +0100
Subject: [PATCH] tmp

Signed-off-by: David Hildenbrand (Arm) <david@...nel.org>
---
  include/linux/mmzone.h | 25 +++++++++++++++++++++++--
  mm/internal.h          |  8 +-------
  mm/memory_hotplug.c    | 20 ++++++--------------
  mm/mm_init.c           | 12 ++++++------
  4 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fc5d6c88d2f0..3f7d8d88c597 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -943,6 +943,11 @@ struct zone {
  	 * cma pages is present pages that are assigned for CMA use
  	 * (MIGRATE_CMA).
  	 *
+	 * online_pages is pages within the zone that have an online memmap.
+	 * online_pages includes present pages and memory holes that have a
+	 * memmap. When spanned_pages == online_pages, pfn_to_page() can be
+	 * performed without further checks on any pfn within the zone span.
+	 *
  	 * So present_pages may be used by memory hotplug or memory power
  	 * management logic to figure out unmanaged pages by checking
  	 * (present_pages - managed_pages). And managed_pages should be used
@@ -967,6 +972,7 @@ struct zone {
  	atomic_long_t		managed_pages;
  	unsigned long		spanned_pages;
  	unsigned long		present_pages;
+	unsigned long		online_pages;
  #if defined(CONFIG_MEMORY_HOTPLUG)
  	unsigned long		present_early_pages;
  #endif
@@ -1051,8 +1057,6 @@ struct zone {
  	bool			compact_blockskip_flush;
  #endif
  
-	bool			contiguous;
-
  	CACHELINE_PADDING(_pad3_);
  	/* Zone statistics */
  	atomic_long_t		vm_stat[NR_VM_ZONE_STAT_ITEMS];
@@ -1124,6 +1128,23 @@ static inline bool zone_spans_pfn(const struct zone *zone, unsigned long pfn)
  	return zone->zone_start_pfn <= pfn && pfn < zone_end_pfn(zone);
  }
  
+/**
+ * zone_is_contiguous - test whether a zone is contiguous
+ * @zone: the zone to test.
+ *
+ * In a contiguous zone, it is valid to call pfn_to_page() on any pfn in the
+ * spanned zone without requiring pfn_valid() or pfn_to_online_page() checks.
+ *
+ * Returns: true if contiguous, otherwise false.
+ */
+static inline bool zone_is_contiguous(const struct zone *zone)
+{
+	return READ_ONCE(zone->spanned_pages) == READ_ONCE(zone->online_pages);
+}
+
  static inline bool zone_is_initialized(const struct zone *zone)
  {
  	return zone->initialized;
diff --git a/mm/internal.h b/mm/internal.h
index f35dbcf99a86..6062f9b8ee62 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -716,21 +716,15 @@ extern struct page *__pageblock_pfn_to_page(unsigned long start_pfn,
  static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
  				unsigned long end_pfn, struct zone *zone)
  {
-	if (zone->contiguous)
+	if (zone_is_contiguous(zone))
  		return pfn_to_page(start_pfn);
  
  	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
  }
  
-void set_zone_contiguous(struct zone *zone);
  bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
  			   unsigned long nr_pages);
  
-static inline void clear_zone_contiguous(struct zone *zone)
-{
-	zone->contiguous = false;
-}
-
  extern int __isolate_free_page(struct page *page, unsigned int order);
  extern void __putback_isolated_page(struct page *page, unsigned int order,
  				    int mt);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a63ec679d861..76496c1039a9 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -492,11 +492,11 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
  		pfn = find_smallest_section_pfn(nid, zone, end_pfn,
  						zone_end_pfn(zone));
  		if (pfn) {
-			zone->spanned_pages = zone_end_pfn(zone) - pfn;
+			WRITE_ONCE(zone->spanned_pages, zone_end_pfn(zone) - pfn);
  			zone->zone_start_pfn = pfn;
  		} else {
  			zone->zone_start_pfn = 0;
-			zone->spanned_pages = 0;
+			WRITE_ONCE(zone->spanned_pages, 0);
  		}
  	} else if (zone_end_pfn(zone) == end_pfn) {
  		/*
@@ -508,10 +508,10 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
  		pfn = find_biggest_section_pfn(nid, zone, zone->zone_start_pfn,
  					       start_pfn);
  		if (pfn)
-			zone->spanned_pages = pfn - zone->zone_start_pfn + 1;
+			WRITE_ONCE(zone->spanned_pages, pfn - zone->zone_start_pfn + 1);
  		else {
  			zone->zone_start_pfn = 0;
-			zone->spanned_pages = 0;
+			WRITE_ONCE(zone->spanned_pages, 0);
  		}
  	}
  }
@@ -565,18 +565,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
  
  	/*
  	 * Zone shrinking code cannot properly deal with ZONE_DEVICE. So
-	 * we will not try to shrink the zones - which is okay as
-	 * set_zone_contiguous() cannot deal with ZONE_DEVICE either way.
+	 * we will not try to shrink the zones.
  	 */
  	if (zone_is_zone_device(zone))
  		return;
  
-	clear_zone_contiguous(zone);
-
  	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
  	update_pgdat_span(pgdat);
-
-	set_zone_contiguous(zone);
  }
  
  /**
@@ -753,8 +748,6 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
  	struct pglist_data *pgdat = zone->zone_pgdat;
  	int nid = pgdat->node_id;
  
-	clear_zone_contiguous(zone);
-
  	if (zone_is_empty(zone))
  		init_currently_empty_zone(zone, start_pfn, nr_pages);
  	resize_zone_range(zone, start_pfn, nr_pages);
@@ -782,8 +775,6 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
  	memmap_init_range(nr_pages, nid, zone_idx(zone), start_pfn, 0,
  			 MEMINIT_HOTPLUG, altmap, migratetype,
  			 isolate_pageblock);
-
-	set_zone_contiguous(zone);
  }
  
  struct auto_movable_stats {
@@ -1079,6 +1070,7 @@ void adjust_present_page_count(struct page *page, struct memory_group *group,
  	if (early_section(__pfn_to_section(page_to_pfn(page))))
  		zone->present_early_pages += nr_pages;
  	zone->present_pages += nr_pages;
+	WRITE_ONCE(zone->online_pages, zone->online_pages + nr_pages);
  	zone->zone_pgdat->node_present_pages += nr_pages;
  
  	if (group && movable)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2a809cd8e7fa..e33caa6fb6fc 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,9 +2263,10 @@ void __init init_cma_pageblock(struct page *page)
  }
  #endif
  
-void set_zone_contiguous(struct zone *zone)
+static void calc_online_pages(struct zone *zone)
  {
  	unsigned long block_start_pfn = zone->zone_start_pfn;
+	unsigned long online_pages = 0;
  	unsigned long block_end_pfn;
  
  	block_end_pfn = pageblock_end_pfn(block_start_pfn);
@@ -2277,12 +2278,11 @@ void set_zone_contiguous(struct zone *zone)
  
  		if (!__pageblock_pfn_to_page(block_start_pfn,
  					     block_end_pfn, zone))
-			return;
+			continue;
  		cond_resched();
+		online_pages += block_end_pfn - block_start_pfn;
  	}
-
-	/* We confirm that there is no hole */
-	zone->contiguous = true;
+	zone->online_pages = online_pages;
  }
  
  /*
@@ -2348,7 +2348,7 @@ void __init page_alloc_init_late(void)
  		shuffle_free_memory(NODE_DATA(nid));
  
  	for_each_populated_zone(zone)
-		set_zone_contiguous(zone);
+		calc_online_pages(zone);
  
  	/* Initialize page ext after all struct pages are initialized. */
  	if (deferred_struct_pages)
-- 
2.43.0


-- 
Cheers,

David
