Message-ID: <3c572565-0b21-4136-b0e0-59a5ed858104@redhat.com>
Date: Wed, 8 Oct 2025 10:18:09 +0200
From: David Hildenbrand <david@...hat.com>
To: Lance Yang <lance.yang@...ux.dev>, akpm@...ux-foundation.org
Cc: lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, baohua@...nel.org,
 baolin.wang@...ux.alibaba.com, dev.jain@....com, hughd@...gle.com,
 ioworker0@...il.com, kirill@...temov.name, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, mpenttil@...hat.com, npache@...hat.com,
 ryan.roberts@....com, ziy@...dia.com, richard.weiyang@...il.com
Subject: Re: [PATCH mm-new v3 1/1] mm/khugepaged: abort collapse scan on
 non-swap entries

On 08.10.25 05:26, Lance Yang wrote:
> From: Lance Yang <lance.yang@...ux.dev>
> 
> Currently, special non-swap entries (like PTE markers) are not caught
> early in hpage_collapse_scan_pmd(), leading to failures deep in the
> swap-in logic.
> 
> A function that is called __collapse_huge_page_swapin() and documented
> to "Bring missing pages in from swap" will handle other types as well.
> 
> As analyzed by David[1], we could have ended up with the following
> entry types right before do_swap_page():
> 
>    (1) Migration entries. We would have waited.
>        -> Maybe worth it to wait, maybe not. We suspect we don't stumble
>           into that frequently such that we don't care. We could always
>           unlock this separately later.
> 
>    (2) Device-exclusive entries. We would have converted to non-exclusive.
>        -> See make_device_exclusive(), we cannot tolerate PMD entries and
>           have to split them through FOLL_SPLIT_PMD. As popped up during
>           a recent discussion, collapsing here is actually
>           counter-productive, because the next conversion will PTE-map
>           it again.
>        -> Ok to not collapse.
> 
>    (3) Device-private entries. We would have migrated to RAM.
>        -> Device-private still does not support THPs, so collapsing right
>           now just means that the next device access would split the
>           folio again.
>        -> Ok to not collapse.
> 
>    (4) HWPoison entries
>        -> Cannot collapse
> 
>    (5) Markers
>        -> Cannot collapse
> 
> First, this patch adds an early check for these non-swap entries. If
> any one is found, the scan is aborted immediately with the
> SCAN_PTE_NON_PRESENT result, as Lorenzo suggested[2], avoiding wasted
> work. While at it, convert pte_swp_uffd_wp_any() to pte_swp_uffd_wp()
> since we are in the swap pte branch.
> 
> Second, as Wei pointed out[3], we may have a chance to get a non-swap
> entry, since we will drop and re-acquire the mmap lock before
> __collapse_huge_page_swapin(). To handle this, we also add a
> non_swap_entry() check there.
> 
> Note that we can unlock later what we really need, and not account it
> towards max_swap_ptes.
> 
> [1] https://lore.kernel.org/linux-mm/09eaca7b-9988-41c7-8d6e-4802055b3f1e@redhat.com
> [2] https://lore.kernel.org/linux-mm/7df49fe7-c6b7-426a-8680-dcd55219c8bd@lucifer.local
> [3] https://lore.kernel.org/linux-mm/20251005010511.ysek2nqojebqngf3@master
> 
> Acked-by: David Hildenbrand <david@...hat.com>
> Reviewed-by: Wei Yang <richard.weiyang@...il.com>
> Reviewed-by: Dev Jain <dev.jain@....com>
> Suggested-by: David Hildenbrand <david@...hat.com>
> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> Signed-off-by: Lance Yang <lance.yang@...ux.dev>
> ---

Sorry for not replying earlier to your other mail.

LGTM.

We can always handle migration entries later if this turns out to be a
problem (this time, in a clean way ...) and not count them towards
actual "swap" entries.

-- 
Cheers

David / dhildenb
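
For readers following the thread, the early check described in the quoted commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: the `pte_kind` enum, `is_non_swap_entry()` and `scan_ptes()` are hypothetical names, the entry kinds are flattened into one enum instead of being decoded from a swap PTE, and only the result names SCAN_PTE_NON_PRESENT and SCAN_EXCEED_SWAP_PTE are borrowed from khugepaged.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for what a PTE can hold during the
 * scan. The kernel decodes this from the PTE and its swap entry; this
 * model just tags each slot with a kind.
 */
enum pte_kind {
    PTE_PRESENT,
    PTE_NONE,
    PTE_SWAP,             /* genuine swap entry */
    PTE_MIGRATION,        /* (1) in the commit message */
    PTE_DEVICE_EXCLUSIVE, /* (2) */
    PTE_DEVICE_PRIVATE,   /* (3) */
    PTE_HWPOISON,         /* (4) */
    PTE_MARKER,           /* (5) */
};

/* Subset of scan results, modeled after the names used in the thread. */
enum scan_result {
    SCAN_SUCCEED,
    SCAN_EXCEED_SWAP_PTE,
    SCAN_PTE_NON_PRESENT,
};

/* Hypothetical helper: true for the five non-swap entry types listed. */
static int is_non_swap_entry(enum pte_kind kind)
{
    switch (kind) {
    case PTE_MIGRATION:
    case PTE_DEVICE_EXCLUSIVE:
    case PTE_DEVICE_PRIVATE:
    case PTE_HWPOISON:
    case PTE_MARKER:
        return 1;
    default:
        return 0;
    }
}

/*
 * Model of the scan: abort as soon as a non-swap entry is seen, and
 * count only genuine swap entries against max_swap_ptes.
 */
static enum scan_result scan_ptes(const enum pte_kind *ptes, size_t n,
                                  unsigned int max_swap_ptes)
{
    unsigned int unmapped = 0;

    assert(ptes != NULL);

    for (size_t i = 0; i < n; i++) {
        if (is_non_swap_entry(ptes[i]))
            return SCAN_PTE_NON_PRESENT; /* early abort, no swap-in attempt */
        if (ptes[i] == PTE_SWAP && ++unmapped > max_swap_ptes)
            return SCAN_EXCEED_SWAP_PTE;
    }
    return SCAN_SUCCEED;
}
```

In this model, a range containing a marker or migration entry is rejected before any swap accounting happens for later slots, which mirrors the "abort immediately" behaviour the commit message describes. It does not model the second check after the mmap lock is dropped and re-acquired.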

