Message-ID: <j5pf6l745hp4r56fndlshzcjpyi3nttgywouowhmfiewx6p56j@b64l6tmupykt>
Date: Wed, 17 Sep 2025 11:52:34 +0100
From: Kiryl Shutsemau <kirill@...temov.name>
To: Zach O'Keefe <zokeefe@...gle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>, Zi Yan <ziy@...dia.com>, 
	Baolin Wang <baolin.wang@...ux.alibaba.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Nico Pache <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>, Dev Jain <dev.jain@....com>, 
	Barry Song <baohua@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2] mm/khugepaged: Do not fail collapse_pte_mapped_thp()
 on SCAN_PMD_NULL

On Tue, Sep 16, 2025 at 11:06:30AM -0700, Zach O'Keefe wrote:
> So, since we are trying to aim for consistency here, I think we ought
> to also support the anonymous case.
> 
> I don't have a patch, but can spot at least two things we'd need to adjust:
> 
> First, we are defeated by the check in __thp_vma_allowable_orders();
> 
>         /*
>          * THPeligible bit of smaps should show 1 for proper VMAs even
>          * though anon_vma is not initialized yet.
>          *
>          * Allow page fault since anon_vma may be not initialized until
>          * the first page fault.
>          */
>         if (!vma->anon_vma)
>                 return (smaps || in_pf) ? orders : 0;
> 
> I think we can probably just delete that check, but would need to confirm.

Do you want MADV_COLLAPSE to work on VMAs that never got a page fault?

I think it should be fine as long as we agree that MADV_COLLAPSE implies
memory population. I think it should, but I want to be sure we are on
the same page.
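
For illustration, the case in question can be sketched from userspace
roughly as follows. This is a minimal sketch, not a selftest: the helper
names align_up() and collapse_untouched_anon() are made up for the
example, the 2 MiB PMD size is an assumption (x86-64), and the fallback
MADV_COLLAPSE value is the one from the Linux uapi headers. The mapping
is never touched before the madvise() call, so the VMA typically has no
anon_vma yet.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* value from the Linux uapi headers (6.1+) */
#endif

/* Assumption: 2 MiB PMD size, as on x86-64 with 4 KiB base pages. */
#define PMD_SIZE_ASSUMED (2UL << 20)

/* Round addr up to the next multiple of align (align is a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: map an anonymous region, never touch it, and ask
 * for a collapse. Returns 0 on success or a negative errno value.
 */
static int collapse_untouched_anon(void)
{
	/* Map twice the PMD size so a PMD-aligned range always fits. */
	size_t len = 2 * PMD_SIZE_ASSUMED;
	char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *aligned;
	int ret;

	if (map == MAP_FAILED)
		return -errno;

	aligned = (char *)align_up((uintptr_t)map, PMD_SIZE_ASSUMED);

	/* No page fault has happened in this range before the call. */
	ret = madvise(aligned, PMD_SIZE_ASSUMED, MADV_COLLAPSE) ? -errno : 0;

	munmap(map, len);
	return ret;
}
```

Whether collapse_untouched_anon() returns 0 depends on the kernel: with
the anon_vma check quoted above in place, the call is expected to fail;
if MADV_COLLAPSE implies population, it would succeed and leave the
range populated.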

It also raises the question of holes in files under MADV_COLLAPSE. We
might want to populate them too. But that means the logic of
MADV_COLLAPSE and khugepaged will diverge, which requires a larger
refactoring.

> And second, madvise_collapse() doesn't route SCAN_PMD_NULL to
> collapse_pte_mapped_thp(). I think we just need to audit places where
> we return this code, to make sure it's faithfully describing a
> situation where we can go ahead and install a new pmd. As a hasty
> check, the return codes in check_pmd_state() don't look to follow
> that, with !present and pmd_bad() returning SCAN_PMD_NULL. Likewise,
> there are many underlying failure reasons for
> pte_offset_map_ro_nolock()=>___pte_offset_map() that aren't "no PMD
> entry".

Sounds like a plan :)

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
