Message-ID: <26e65878-f214-4890-8bcb-24a45122bfd6@kernel.org>
Date: Thu, 18 Dec 2025 10:29:18 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Vernon Yang <vernon2gm@...il.com>, akpm@...ux-foundation.org,
lorenzo.stoakes@...cle.com
Cc: ziy@...dia.com, npache@...hat.com, baohua@...nel.org,
lance.yang@...ux.dev, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Vernon Yang <yanglincheng@...inos.cn>
Subject: Re: [PATCH 2/4] mm: khugepaged: remove mm when all memory has been
collapsed
On 12/15/25 10:04, Vernon Yang wrote:
> The following data is traced by bpftrace on a desktop system. After
> the system has been left idle for 10 minutes upon booting, a lot of
> SCAN_PMD_MAPPED or SCAN_PMD_NONE are observed during a full scan by
> khugepaged.
>
> @scan_pmd_status[1]: 1 ## SCAN_SUCCEED
> @scan_pmd_status[4]: 158 ## SCAN_PMD_MAPPED
> @scan_pmd_status[3]: 174 ## SCAN_PMD_NONE
> total progress size: 701 MB
> Total time : 440 seconds ## include khugepaged_scan_sleep_millisecs
>
> The khugepaged_scan list saves all tasks that support collapsing memory
> into hugepages; as long as a task is not destroyed, khugepaged will not
> remove it from the khugepaged_scan list. This leads to a situation where
> a task has already collapsed all of its memory regions into hugepages,
> but khugepaged keeps scanning it, which wastes CPU time to no effect.
> Combined with khugepaged_scan_sleep_millisecs (default 10s), scanning a
> large number of such useless tasks delays the scan of tasks that
> actually have collapsible memory.
>
> After applying this patch, when all memory is either SCAN_PMD_MAPPED or
> SCAN_PMD_NONE, the mm is automatically removed from khugepaged's scan
> list. If the task later takes a page fault or calls MADV_HUGEPAGE again,
> it is added back to khugepaged.
I don't like that, as it assumes that memory within such a process would
be rather static, which is easily not the case (e.g., allocators just
doing MADV_DONTNEED to free memory).
If most stuff is collapsed to PMDs already, can't we just skip over
these regions a bit faster?
--
Cheers
David