Message-ID: <666ce353-0fc4-4f5f-9e5d-9bb95464e939@airmail.cc>
Date: Thu, 23 Oct 2025 09:47:00 +0000
From: craftfever <craftfever@...mail.cc>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Pedro Demarchi Gomes <pedrodemargomes@...il.com>
Cc: Xu Xin <xu.xin16@....com.cn>, Chengming Zhou <chengming.zhou@...ux.dev>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] ksm: use range-walk function to jump over holes in
scan_get_next_rmap_item
David Hildenbrand wrote:
> On 22.10.25 22:31, Andrew Morton wrote:
>> On Wed, 22 Oct 2025 12:30:59 -0300 Pedro Demarchi Gomes
>> <pedrodemargomes@...il.com> wrote:
>>
>>> Currently, scan_get_next_rmap_item() walks every page address in a VMA
>>> to locate mergeable pages. This becomes highly inefficient when scanning
>>> large virtual memory areas that contain mostly unmapped regions.
>>>
>>> This patch replaces the per-address lookup with a range walk using
>>> walk_page_range(). The range walker allows KSM to skip over entire
>>> unmapped holes in a VMA, avoiding unnecessary lookups.
>>> This problem was previously discussed in [1].
>>>
>>> [1] https://lore.kernel.org/linux-mm/423de7a3-1c62-4e72-8e79-19a6413e420c@...hat.com/
>>>
>>
>> Thanks. It would be helpful if the changelog were to tell people how
>> significant this change is for our users.
>>
>>> Reported-by: craftfever <craftfever@...mail.cc>
>>> Closes: https://lkml.kernel.org/r/020cf8de6e773bb78ba7614ef250129f11a63781@...ena.io
>>
>> Buried in here is a claim that a large amount of CPU is being used, but
>> nothing quantitative.
>>
>> So is there something we can tell people who are looking at this patch
>> in Feb 2026 and wondering "hm, should I add that to our kernel"?
>>
>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>> Co-developed-by: David Hildenbrand <david@...hat.com>
>>> Signed-off-by: David Hildenbrand <david@...hat.com>
>>> Fixes: 31dbd01f3143 ("ksm: Kernel SamePage Merging")
>>
>> If the observed runtime problem is bad enough then a cc:stable might be
>> justified. But a description of that observed runtime behavior would
>> be needed for that, please.
>
> Agreed.
>
> With the following simple program
>
> #include <unistd.h>
> #include <stdio.h>
> #include <sys/mman.h>
>
> /* 32 TiB */
> const size_t size = 32ul * 1024 * 1024 * 1024 * 1024;
>
> int main() {
> 	char *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
> 			  MAP_NORESERVE | MAP_PRIVATE | MAP_ANON, -1, 0);
>
> 	if (area == MAP_FAILED) {
> 		perror("mmap() failed\n");
> 		return -1;
> 	}
>
> 	/* Populate a single page such that we get an anon_vma. */
> 	*area = 0;
>
> 	/* Enable KSM. */
> 	madvise(area, size, MADV_MERGEABLE);
> 	pause();
> 	return 0;
> }
>
> $ ./ksm-sparse &
> $ echo 1 > /sys/kernel/mm/ksm/run
>
> ksmd goes to 100% for quite a long time.
>
> Now imagine if a cloud user spins up a couple of these programs.
>
> KSM in the system is essentially deadlocked, unable to deduplicate
> anything of value.
>
> @Pedro, can you incorporate all that in the patch description?
>
Thanks for the example and explanation, that's exactly what I meant. Big
datacenters and servers are a primary use case for Linux, for example
when many VMs are running and KSM has to be robust enough to deal with
such huge amounts of memory. And, as mentioned, the bug also hits
consumer apps such as Chromium/Electron (and things built on it, like VS
Code) when a user decides to apply KSM to their apps. This patch really
needs to go into the master branch, and very preferably be backported to
the 6.17-stable branch; it is of high importance. By the way, I have
tested v4, and it is fine, stable, effective and very light. Will v5,
with the more comprehensive description, be the final version?
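
For anyone reading this later in the archive, here is a rough sketch of
the range-walk idea as I understand it. This is my own illustration with
made-up names (ksm_find_next_page, ksm_next_page_pte and friends), not
code taken from the patch. The point is that a pte_entry callback is only
invoked for page tables that actually exist, so the walker never touches
the unmapped holes that the old per-address loop had to step through:

/*
 * Hypothetical sketch only, not the actual patch: find the next present,
 * anonymous page in [start, end) of a VMA by letting the page walker skip
 * unmapped holes instead of probing every single address.
 */
#include <linux/pagewalk.h>
#include <linux/mm.h>

struct ksm_next_page_ctx {		/* made-up helper struct */
	struct page *page;		/* first suitable page found */
	unsigned long addr;		/* address it was found at */
};

static int ksm_next_page_pte(pte_t *pte, unsigned long addr,
			     unsigned long next, struct mm_walk *walk)
{
	struct ksm_next_page_ctx *ctx = walk->private;
	pte_t ptent = ptep_get(pte);
	struct page *page;

	if (!pte_present(ptent))
		return 0;		/* keep walking */

	page = vm_normal_page(walk->vma, addr, ptent);
	if (!page || !PageAnon(page))
		return 0;

	ctx->page = page;
	ctx->addr = addr;
	return 1;			/* positive value stops the walk */
}

static const struct mm_walk_ops ksm_next_page_ops = {
	/*
	 * No .pte_hole callback: ranges without page tables are skipped
	 * outright, which is the whole point for sparse VMAs. THP would
	 * need an additional .pmd_entry in a real implementation.
	 */
	.pte_entry	= ksm_next_page_pte,
	.walk_lock	= PGWALK_RDLOCK,
};

/* Caller is expected to hold mmap_read_lock(vma->vm_mm). */
static struct page *ksm_find_next_page(struct vm_area_struct *vma,
				       unsigned long start, unsigned long end,
				       unsigned long *addrp)
{
	struct ksm_next_page_ctx ctx = {};

	if (walk_page_range(vma->vm_mm, start, end, &ksm_next_page_ops, &ctx) <= 0)
		return NULL;

	*addrp = ctx.addr;
	return ctx.page;
}

The real patch in mm/ksm.c is of course more involved, so please treat
the above purely as an illustration of why skipping holes in the page
walker is so much cheaper for sparse VMAs like the 32 TiB mapping in the
reproducer.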