lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+EESO4Z6wtX7ZMdDHQRe5jAAS_bQ-POq5+4aDx5jh2DvY6UHg@mail.gmail.com>
Date: Fri, 22 Aug 2025 10:29:52 -0700
From: Lokesh Gidra <lokeshgidra@...gle.com>
To: David Hildenbrand <david@...hat.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, 
	Andrew Morton <akpm@...ux-foundation.org>
Cc: Harry Yoo <harry.yoo@...cle.com>, Zi Yan <ziy@...dia.com>, Barry Song <21cnbao@...il.com>, 
	"open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>, 
	Suren Baghdasaryan <surenb@...gle.com>, Kalesh Singh <kaleshsingh@...gle.com>, 
	android-mm <android-mm@...gle.com>, linux-kernel <linux-kernel@...r.kernel.org>, 
	Jann Horn <jannh@...gle.com>, Rik van Riel <riel@...riel.com>, Vlastimil Babka <vbabka@...e.cz>, 
	"Liam R. Howlett" <Liam.Howlett@...cle.com>
Subject: [DISCUSSION] Unconditionally lock folios when calling rmap_walk()

Hi all,

Currently, some callers of rmap_walk() conditionally avoid try-locking
non-ksm anon folios. This necessitates serialization through anon_vma
write-lock elsewhere when folio->mapping and/or folio->index (fields
involved in rmap_walk()) are to be updated. This hurts scalability due
to coarse granularity of the lock. For instance, when multiple threads
invoke userfaultfd’s MOVE ioctl simultaneously to move distinct pages
from the same src VMA, they all contend for the corresponding
anon_vma’s lock. Field traces for arm64 android devices reveal over
30ms of uninterruptible sleep in the main UI thread, leading to janky
user interactions.

Among all rmap_walk() callers that don’t lock anon folios,
folio_referenced() is the most critical (others are
page_idle_clear_pte_refs(), damon_folio_young(), and
damon_folio_mkold()). The relevant code in folio_referenced() is:

if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
        we_locked = folio_trylock(folio);
        if (!we_locked)
                return 1;
}

It’s unclear why locking anon_vma exclusively (when updating
folio->mapping, like in uffd MOVE) is beneficial over walking rmap
with folio locked. It’s in the reclaim path, so should not be a
critical path that necessitates some special treatment, unless I’m
missing something.

Therefore, I propose simplifying the locking mechanism by ensuring the
folio is locked before calling rmap_walk(). This helps avoid locking
anon_vma when updating folio->mapping, which, for instance, will help
eliminate the uninterruptible sleep observed in the field traces
mentioned earlier. Furthermore, it enables us to simplify the code in
folio_lock_anon_vma_read() by removing the re-check to ensure that the
field hasn’t changed under us.

Thanks,
Lokesh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ