Message-ID: <20251207231056.71294-1-swarajgaikwad1925@gmail.com>
Date: Sun, 7 Dec 2025 23:10:56 +0000
From: Swaraj Gaikwad <swarajgaikwad1925@...il.com>
To: david@...nel.org
Cc: akpm@...ux-foundation.org,
david.hunter.linux@...il.com,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
osalvador@...e.de,
skhan@...uxfoundation.org,
swarajgaikwad1925@...il.com
Subject: Re: [PATCH] mm/memory_hotplug: Cache auto_movable stats to optimize online check

Hi David,

Thank you for the feedback. I've conducted benchmarking to measure the
performance impact of caching the global statistics.

Test Setup:
I created a QEMU VM with the following configuration to simulate
a multi-NUMA environment:
- 8 NUMA nodes, 1GB each
- movablecore=2G kernel parameter
- Additional hotpluggable memory region via memmap=1G!0xC00000000

Benchmark Method:
I added a debugfs interface to directly invoke auto_movable_can_online_movable()
with NUMA_NO_NODE (the code path that walks all zones), and then
triggered it via: `echo 1 > /sys/kernel/debug/movable_benchmark`

Results:
Without patch (walks all zones): 2402 ns
With patch (uses cached values): 453 ns

While the absolute time difference is small in this test setup,
the improvement becomes more significant with:
- More NUMA nodes/zones in the system
- Frequent memory hotplug operations
- Systems with many populated zones

If this performance improvement is not considered significant enough to
justify the patch, I'm happy to send an updated patch that simply clarifies
the TODO comment to "cache values if walking all zones becomes a
performance problem", as you suggested, for future reference.
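
For reference, the NUMA_NO_NODE case that the benchmark exercises looks
roughly like this in auto_movable_can_online_movable() (paraphrased from
mm/memory_hotplug.c; the exact code differs between kernel versions):

	if (nid == NUMA_NO_NODE) {
		/* TODO: cache values */
		for_each_populated_zone(zone)
			auto_movable_stats_account_zone(&stats, zone);
	} else {
		/* per-node case: only the zones of that node are walked */
		...
	}

So the cost of the uncached path grows with the number of populated zones
in the system, which is why the numbers above should scale with node/zone
count.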

Testing code:

static int benchmark_set(void *data, u64 val)
{
	ktime_t start, end;
	s64 duration;
	bool result;
	int nid = NUMA_NO_NODE;
	unsigned long nr_pages = 32768;

	/* The written value is ignored; any write triggers one measurement. */
	start = ktime_get();
	result = auto_movable_can_online_movable(nid, NULL, nr_pages);
	end = ktime_get();
	duration = ktime_to_ns(ktime_sub(end, start));

	pr_info("BENCHMARK_RESULT: Result=%d | Time=%lld ns\n", result, duration);
	return 0;
}
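
In case it is useful, here is a minimal sketch of how such a debugfs entry
can be wired up (illustrative only; the attribute macro, fops name and
initcall level here are not necessarily exactly what my tree uses):

DEFINE_DEBUGFS_ATTRIBUTE(movable_benchmark_fops, NULL, benchmark_set, "%llu\n");

static int __init movable_benchmark_init(void)
{
	/* Creates the write-only /sys/kernel/debug/movable_benchmark file. */
	debugfs_create_file_unsafe("movable_benchmark", 0200, NULL, NULL,
				   &movable_benchmark_fops);
	return 0;
}
late_initcall(movable_benchmark_init);

After writing to the file, the measurement shows up in the kernel log and
can be read with `dmesg | grep BENCHMARK_RESULT`.
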
QEMU Configuration:
qemu-system-x86_64 \
-m 8G,slots=16,maxmem=16G \
-smp 2 \
-netdev user,id=net0,hostfwd=tcp::10022-:22 \
-device virtio-net-pci,netdev=net0 \
-enable-kvm \
-cpu host \
-initrd "${DEFAULT_INITRD}" \
-kernel "${DEFAULT_KERNEL}" \
-object memory-backend-ram,id=mem0,size=1G \
-object memory-backend-ram,id=mem1,size=1G \
-object memory-backend-ram,id=mem2,size=1G \
-object memory-backend-ram,id=mem3,size=1G \
-object memory-backend-ram,id=mem4,size=1G \
-object memory-backend-ram,id=mem5,size=1G \
-object memory-backend-ram,id=mem6,size=1G \
-object memory-backend-ram,id=mem7,size=1G \
-numa node,nodeid=0,memdev=mem0 \
-numa node,nodeid=1,memdev=mem1 \
-numa node,nodeid=2,memdev=mem2 \
-numa node,nodeid=3,memdev=mem3 \
-numa node,nodeid=4,memdev=mem4 \
-numa node,nodeid=5,memdev=mem5 \
-numa node,nodeid=6,memdev=mem6 \
-numa node,nodeid=7,memdev=mem7 \
-append "loglevel=8 root=/dev/vda3 rootwait console=ttyS0 idle=poll movablecore=2G memmap=1G!0xC00000000" \
-drive if=none,file="${DEFAULT_DISK}",format=qcow2,id=hd \
-device virtio-blk-pci,drive=hd \
-nographic \
-machine q35 \
-snapshot \
-s

Thanks,
Swaraj