linux-kernel - Re: [PATCH] mm/khugepaged: Fix skipping of alloc sleep after second failure

Message-ID: <20251125042619.GA119062@i85a15111.eu95sqa>
Date: Tue, 25 Nov 2025 12:26:19 +0800
From: Zhiheng Tao <junchuan.tzh@...group.com>
To: Lance Yang <lance.yang@...ux.dev>
Cc: ziy@...dia.com, baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
	npache@...hat.com, ryan.roberts@....com, dev.jain@....com,
	baohua@...nel.org, shy828301@...il.com, zokeefe@...gle.com,
	peterx@...hat.com, akpm@...ux-foundation.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, lorenzo.stoakes@...cle.com,
	"David Hildenbrand (Red Hat)" <david@...nel.org>
Subject: Re: [PATCH] mm/khugepaged: Fix skipping of alloc sleep after second
 failure

On Mon, Nov 24, 2025 at 05:27:23PM +0800, Lance Yang wrote:
> 
> 
> On 2025/11/24 17:14, David Hildenbrand (Red Hat) wrote:
> >On 11/24/25 07:19, Zhiheng Tao wrote:
> >>In khugepaged_do_scan(), two consecutive allocation failures cause
> >>the logic to skip the dedicated 60s throttling sleep
> >>(khugepaged_alloc_sleep_millisecs), forcing a fallback to the
> >>shorter 10s scanning interval via the outer loop.
> >>
> >>Since fragmentation is unlikely to resolve in 10s, this results in
> >>wasted CPU cycles on immediate retries.
> >
> >Why shouldn't memory compaction be able to compact a single THP in 10s?
> >
> >Why should it resolve in 60s?
> >
> >>
> >>Reorder the failure logic to ensure khugepaged_alloc_sleep() is
> >>always called on each allocation failure.
> >>
> >>Fixes: c6a7f445a272 ("mm: khugepaged: don't carry huge page to
> >>the next loop for !CONFIG_NUMA")
> >
> >What are we fixing here? This sounds like a change that might be
> >better on some systems, but worse on others?
> 
> Seems like we're not honoring khugepaged_alloc_sleep_millisecs on the
> second allocation failure... but is that actually a problem?
> 
Isn't it more appropriate to honor khugepaged_alloc_sleep_millisecs on
the second allocation failure as well?
It was a problem before commit c6a7f445a272 when
khugepaged_pages_to_scan=512.
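
For reference, the control flow under discussion can be modelled in
userspace roughly as below. This is only a sketch, not the kernel code:
do_scan_model(), ALLOC_SLEEP_MS and SCAN_SLEEP_MS are hypothetical names,
the 60s/10s values are the defaults quoted above, and the "sleep_first"
branch is my reading of what the reordering would do.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two tunables (60s and 10s as quoted above). */
enum { ALLOC_SLEEP_MS = 60000, SCAN_SLEEP_MS = 10000 };

/*
 * Simplified model of one scan pass in which every allocation attempt
 * fails. Returns the total time (ms) spent in the allocation-failure
 * sleep during that pass.
 *
 * sleep_first == false: bail out on the second failure before sleeping.
 * sleep_first == true:  sleep on every failure, then bail out on the
 *                       second one (assumed effect of the reordering).
 */
static int do_scan_model(int failures, bool sleep_first)
{
	bool wait = true;
	int slept_ms = 0;

	for (int i = 0; i < failures; i++) {
		if (sleep_first) {
			slept_ms += ALLOC_SLEEP_MS;	/* always throttle */
			if (!wait)
				break;
			wait = false;
		} else {
			if (!wait)
				break;			/* second failure: no sleep */
			wait = false;
			slept_ms += ALLOC_SLEEP_MS;
		}
	}
	return slept_ms;
}
```

With two consecutive failures, do_scan_model(2, false) returns 60000 and
do_scan_model(2, true) returns 120000. In the first case the only delay
after the second failure is the regular scan sleep (SCAN_SLEEP_MS) taken
by the outer loop.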
> >
> >We really need more information on when/how an issue was hit, and
> >how this patch here really moves the needle in any way.
> 
> +1
>