Message-ID: <j5pf6l745hp4r56fndlshzcjpyi3nttgywouowhmfiewx6p56j@b64l6tmupykt>
Date: Wed, 17 Sep 2025 11:52:34 +0100
From: Kiryl Shutsemau <kirill@...temov.name>
To: Zach O'Keefe <zokeefe@...gle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>,
Barry Song <baohua@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2] mm/khugepaged: Do not fail collapse_pte_mapped_thp()
on SCAN_PMD_NULL
On Tue, Sep 16, 2025 at 11:06:30AM -0700, Zach O'Keefe wrote:
> So, since we are trying to aim for consistency here, I think we ought
> to also support the anonymous case.
>
> I don't have a patch, but can spot at least two things we'd need to adjust:
>
> First, we are defeated by the check in __thp_vma_allowable_orders();
>
> /*
> * THPeligible bit of smaps should show 1 for proper VMAs even
> * though anon_vma is not initialized yet.
> *
> * Allow page fault since anon_vma may be not initialized until
> * the first page fault.
> */
> if (!vma->anon_vma)
> return (smaps || in_pf) ? orders : 0;
>
> I think we can probably just delete that check, but would need to confirm.
Do you want MADV_COLLAPSE to work on VMAs that never got a page fault?
I think it should be fine as long as we agree that MADV_COLLAPSE implies
memory population. I think it should, but I want to be sure we are on
the same page.
It also brings up a question about holes in files on MADV_COLLAPSE. We
might want to populate them too. But that means the logic in
MADV_COLLAPSE and khugepaged will diverge, which would require a larger
refactoring.
> And second, madvise_collapse() doesn't route SCAN_PMD_NULL to
> collapse_pte_mapped_thp(). I think we just need to audit places where
> we return this code, to make sure it's faithfully describing a
> situation where we can go ahead and install a new pmd. As a hasty
> check, the return codes in check_pmd_state() don't look to follow
> that, with !present and pmd_bad() returning SCAN_PMD_NULL. Likewise,
> there are many underlying failure reasons for
> pte_offset_map_ro_nolock()=>___pte_offset_map() that aren't "no PMD
> entry".
Sounds like a plan :)
--
Kiryl Shutsemau / Kirill A. Shutemov