Message-ID: <127930dd-3dd8-4b18-9bdd-9b8dea9496ae@intel.com>
Date: Fri, 19 Dec 2025 09:31:22 +0800
From: "Li, Tianyou" <tianyou.li@...el.com>
To: Oscar Salvador <osalvador@...e.de>, David Hildenbrand <david@...hat.com>,
	Wei Yang <richard.weiyang@...il.com>, Mike Rapoport <rppt@...nel.org>
CC: <linux-mm@...ck.org>, Yong Hu <yong.hu@...el.com>, Nanhai Zou
	<nanhai.zou@...el.com>, Yuan Liu <yuan1.liu@...el.com>, Tim Chen
	<tim.c.chen@...ux.intel.com>, Qiuxu Zhuo <qiuxu.zhuo@...el.com>, Yu C Chen
	<yu.c.chen@...el.com>, Pan Deng <pan.deng@...el.com>, Chen Zhang
	<zhangchen.kidd@...com>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 1/2] mm/memory hotplug/unplug: Optimize
 zone->contiguous update when changes pfn range

Thank you all for your time and effort. We have submitted patch v6,
which we tested over the last weekend. Any comments or suggestions are
welcome.


Regards,

Tianyou


On 12/12/2025 1:27 PM, Li, Tianyou wrote:
>
> On 12/11/2025 1:07 PM, Oscar Salvador wrote:
>> On Mon, Dec 08, 2025 at 11:25:43PM +0800, Tianyou Li wrote:
>>> When move_pfn_range_to_zone() or remove_pfn_range_from_zone() is
>>> invoked, it updates zone->contiguous by checking the new zone's pfn
>>> range from beginning to end, regardless of the previous state of the
>>> old zone. When the zone's pfn range is large, the cost of traversing
>>> the pfn range to update zone->contiguous can be significant.
>>>
>>> Add fast paths to quickly detect cases where the zone is definitely
>>> not contiguous, without scanning the new zone. The cases are: if the
>>> new range does not overlap the previous range, contiguous must be
>>> false; if the new range is adjacent to the previous range, only the
>>> new range needs to be checked; if the newly added pages cannot fill
>>> the holes of the previous zone, contiguous must be false.
>>>
>>> The following test cases of memory hotplug for a VM [1], tested in the
>>> environment [2], show that this optimization can significantly reduce
>>> the memory hotplug time [3].
>>>
>>> +---------------+------+---------------+--------------+----------------+
>>> |               | Size | Time (before) | Time (after) | Time Reduction |
>>> |               +------+---------------+--------------+----------------+
>>> | Plug Memory   | 256G |      10s      |      2s      |      80%       |
>>> |               +------+---------------+--------------+----------------+
>>> |               | 512G |      33s      |      6s      |      81%       |
>>> +---------------+------+---------------+--------------+----------------+
>>>
>>> +---------------+------+---------------+--------------+----------------+
>>> |               | Size | Time (before) | Time (after) | Time Reduction |
>>> |               +------+---------------+--------------+----------------+
>>> | Unplug Memory | 256G |      10s      |      2s      |      80%       |
>>> |               +------+---------------+--------------+----------------+
>>> |               | 512G |      34s      |      6s      |      82%       |
>>> +---------------+------+---------------+--------------+----------------+
>>>
>>> [1] Qemu commands to hotplug 256G/512G memory for a VM:
>>>      object_add memory-backend-ram,id=hotmem0,size=256G/512G,share=on
>>>      device_add virtio-mem-pci,id=vmem1,memdev=hotmem0,bus=port1
>>>      qom-set vmem1 requested-size 256G/512G (Plug Memory)
>>>      qom-set vmem1 requested-size 0G (Unplug Memory)
>>>
>>> [2] Hardware     : Intel Icelake server
>>>      Guest Kernel : v6.18-rc2
>>>      Qemu         : v9.0.0
>>>
>>>      Launch VM    :
>>>      qemu-system-x86_64 -accel kvm -cpu host \
>>>      -drive file=./Centos10_cloud.qcow2,format=qcow2,if=virtio \
>>>      -drive file=./seed.img,format=raw,if=virtio \
>>>      -smp 3,cores=3,threads=1,sockets=1,maxcpus=3 \
>>>      -m 2G,slots=10,maxmem=2052472M \
>>>      -device pcie-root-port,id=port1,bus=pcie.0,slot=1,multifunction=on \
>>>      -device pcie-root-port,id=port2,bus=pcie.0,slot=2 \
>>>      -nographic -machine q35 \
>>>      -nic user,hostfwd=tcp::3000-:22
>>>
>>>      Guest kernel auto-onlines newly added memory blocks:
>>>      echo online > /sys/devices/system/memory/auto_online_blocks
>>>
>>> [3] The time from typing the QEMU commands in [1] to when the output
>>>     of 'grep MemTotal /proc/meminfo' on the Guest reflects that all
>>>     hotplugged memory is recognized.
>>>
>>> Reported-by: Nanhai Zou <nanhai.zou@...el.com>
>>> Reported-by: Chen Zhang <zhangchen.kidd@...com>
>>> Tested-by: Yuan Liu <yuan1.liu@...el.com>
>>> Reviewed-by: Tim Chen <tim.c.chen@...ux.intel.com>
>>> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
>>> Reviewed-by: Yu C Chen <yu.c.chen@...el.com>
>>> Reviewed-by: Pan Deng <pan.deng@...el.com>
>>> Reviewed-by: Nanhai Zou <nanhai.zou@...el.com>
>>> Reviewed-by: Yuan Liu <yuan1.liu@...el.com>
>>> Signed-off-by: Tianyou Li <tianyou.li@...el.com>
>> Overall this looks good to me, thanks Tianyou Li for working on this.
>> Just some minor comments below:
>
>
> Thank you very much, Oscar, for taking the time to review the patch!
> Yuan and I will work on patch v6 to address all your comments and
> suggestions.
>
> We will test patch v6 over the weekend. If further comments come in
> during those days, we will probably need a few more days. Thanks.
>
>
>>> ---
>>>   mm/internal.h       |  8 +++++-
>>>   mm/memory_hotplug.c | 64 ++++++++++++++++++++++++++++++++++++++++++---
>>>   mm/mm_init.c        | 13 +++++++--
>>>   3 files changed, 79 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index 1561fc2ff5b8..1b5bba6526d4 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -730,7 +730,13 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
>>>       return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
>>>   }
>>>  
>>> -void set_zone_contiguous(struct zone *zone);
>>> +enum zone_contig_state {
>>> +    ZONE_CONTIG_YES,
>>> +    ZONE_CONTIG_NO,
>>> +    ZONE_CONTIG_MAYBE,
>>> +};
>>> +
>>> +void set_zone_contiguous(struct zone *zone, enum zone_contig_state state);
>>>   bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
>>>                  unsigned long nr_pages);
>>>  
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 0be83039c3b5..d711f6e2c87f 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -544,6 +544,28 @@ static void update_pgdat_span(struct pglist_data *pgdat)
>>>       pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>>>   }
>>>  
>>> +static enum zone_contig_state __meminit zone_contig_state_after_shrinking(
>>> +        struct zone *zone, unsigned long start_pfn, unsigned long nr_pages)
>> Why do we need the __meminit? These functions are only used from
>> memory-hotplug code, so we should not need it?
>
>
> Got it. Will remove the __meminit in patch v6.
>
>
>>> +{
>>> +    const unsigned long end_pfn = start_pfn + nr_pages;
>>> +
>>> +    /*
>>> +     * If the removed pfn range is inside the original zone span, the
>>> +     * contiguous property is surely false.
>>> +     */
>>> +    if (start_pfn > zone->zone_start_pfn && end_pfn < zone_end_pfn(zone))
>>> +        return ZONE_CONTIG_NO;
>>> +
>>> +    /* If the removed pfn range is at the beginning or end of the
>>> +     * original zone span, the contiguous property is preserved when
>>> +     * the original zone is contiguous.
>>> +     */
>>> +    if (start_pfn == zone->zone_start_pfn || end_pfn == zone_end_pfn(zone))
>>> +        return zone->contiguous ? ZONE_CONTIG_YES : ZONE_CONTIG_MAYBE;
>>> +
>>> +    return ZONE_CONTIG_MAYBE;
>>> +}
>>> +
>>>   void remove_pfn_range_from_zone(struct zone *zone,
>>>                         unsigned long start_pfn,
>>>                         unsigned long nr_pages)
>>> @@ -551,6 +573,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>>       const unsigned long end_pfn = start_pfn + nr_pages;
>>>       struct pglist_data *pgdat = zone->zone_pgdat;
>>>       unsigned long pfn, cur_nr_pages;
>>> +    enum zone_contig_state contiguous_state = ZONE_CONTIG_MAYBE;
>> I think that new_contiguous_state is clearer, but I do not have a strong
>> opinion here.
>
>
> Will do in the patch v6.
>
>
>>>       /* Poison struct pages because they are now uninitialized again. */
>>>       for (pfn = start_pfn; pfn < end_pfn; pfn += cur_nr_pages) {
>>> @@ -571,12 +594,13 @@ void remove_pfn_range_from_zone(struct zone *zone,
>>>       if (zone_is_zone_device(zone))
>>>           return;
>>>  
>>> +    contiguous_state = zone_contig_state_after_shrinking(zone, start_pfn, nr_pages);
>>>       clear_zone_contiguous(zone);
>>>  
>>>       shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
>>>       update_pgdat_span(pgdat);
>>>  
>>> -    set_zone_contiguous(zone);
>>> +    set_zone_contiguous(zone, contiguous_state);
>>>   }
>> ...
>>> @@ -752,7 +809,8 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>>>   {
>>>       struct pglist_data *pgdat = zone->zone_pgdat;
>>>       int nid = pgdat->node_id;
>>> -
>>> +    const enum zone_contig_state contiguous_state =
>>> +        zone_contig_state_after_growing(zone, start_pfn, nr_pages);
>> Same comment from remove_pfn_range_from_zone.
>
>
> Will do in the patch v6.
>
>
>>>       clear_zone_contiguous(zone);
>>>  
>>>       if (zone_is_empty(zone))
>>> @@ -783,7 +841,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>>>                MEMINIT_HOTPLUG, altmap, migratetype,
>>>                isolate_pageblock);
>>>  
>>> -    set_zone_contiguous(zone);
>>> +    set_zone_contiguous(zone, contiguous_state);
>>>   }
>>>  
>>>   struct auto_movable_stats {
>>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>>> index 7712d887b696..e296bd9fac9e 100644
>>> --- a/mm/mm_init.c
>>> +++ b/mm/mm_init.c
>>> @@ -2263,11 +2263,19 @@ void __init init_cma_pageblock(struct page *page)
>>>   }
>>>   #endif
>>>  
>>> -void set_zone_contiguous(struct zone *zone)
>>> +void set_zone_contiguous(struct zone *zone, enum zone_contig_state state)
>>>   {
>>>       unsigned long block_start_pfn = zone->zone_start_pfn;
>>>       unsigned long block_end_pfn;
>>>  
>>> +    if (state == ZONE_CONTIG_YES) {
>>> +        zone->contiguous = true;
>>> +        return;
>>> +    }
>>> +
>>> +    if (state == ZONE_CONTIG_NO)
>>> +        return;
>>> +
>>>       block_end_pfn = pageblock_end_pfn(block_start_pfn);
>>>       for (; block_start_pfn < zone_end_pfn(zone);
>>>               block_start_pfn = block_end_pfn,
>>> @@ -2283,6 +2291,7 @@ void set_zone_contiguous(struct zone *zone)
>>>  
>>>       /* We confirm that there is no hole */
>>>       zone->contiguous = true;
>>> +
>> Not needed?
>
>
> My mistake, thanks. Will do in the patch v6.
>
>
> Regards,
>
> Tianyou
>
>
>>>   }
>>
>
>
