Message-ID: <CA+EESO6dR5=4zaecmYqQqOX4702wwGSTX=4+Ani_Q9+o+WUnQA@mail.gmail.com>
Date: Wed, 20 Aug 2025 21:23:24 -0700
From: Lokesh Gidra <lokeshgidra@...gle.com>
To: linux-mm@...ck.kernel.org
Cc: Peter Xu <peterx@...hat.com>, Barry Song <21cnbao@...il.com>, 
	David Hildenbrand <david@...hat.com>, Suren Baghdasaryan <surenb@...gle.com>, 
	Kalesh Singh <kaleshsingh@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	android-mm <android-mm@...gle.com>, linux-kernel <linux-kernel@...r.kernel.org>, 
	Jann Horn <jannh@...gle.com>
Subject: [RFC] Unconditionally lock folios when calling rmap_walk()

Hi all,

Currently, some callers of rmap_walk() conditionally avoid try-locking
non-KSM anon folios. This necessitates serialization through the
anon_vma write-lock whenever folio->mapping and/or folio->index (fields
involved in rmap_walk()) are to be updated. This hurts scalability
because the lock is coarse-grained. For instance, when multiple threads
invoke userfaultfd’s MOVE ioctl simultaneously to move distinct pages
from the same src VMA, they all contend for the corresponding
anon_vma’s lock. Field traces from arm64 Android devices reveal over
30ms of uninterruptible sleep in the main UI thread, leading to janky
user interactions.

Among all rmap_walk() callers that don’t lock anon folios,
folio_referenced() is the most critical (others are
page_idle_clear_pte_refs(), damon_folio_young(), and
damon_folio_mkold()). The relevant code in folio_referenced() is:

if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
        we_locked = folio_trylock(folio);
        if (!we_locked)
                return 1;
}

It’s unclear to me why taking the anon_vma lock (when updating
folio->mapping) is preferable to locking the folio here.
folio_referenced() runs in the reclaim path, so it should not be
performance-critical enough to need special treatment, unless I’m
missing something.

Therefore, I propose simplifying the locking mechanism by
unconditionally try-locking the folio in such cases. This avoids
taking the anon_vma lock when updating folio->mapping, which, for
instance, will help eliminate the uninterruptible sleep observed in the
field traces mentioned earlier. Furthermore, it enables us to simplify
the code in folio_lock_anon_vma_read() by removing the re-check that
folio->mapping hasn’t changed under us.
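
The difference between the current and the proposed rule can be
sketched in a small userspace model. This is only an illustration, not
kernel code and not a patch: struct model_folio, the model_* helpers
and the two *_needs_trylock() predicates are hypothetical names, the
folio lock is stood in for by an atomic flag, and the rmap walk itself
is left out.

```c
/* Userspace model only: none of these types or helpers exist in the kernel. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_folio {
	atomic_bool locked;	/* stands in for the folio lock */
	bool anon;		/* models folio_test_anon() */
	bool ksm;		/* models folio_test_ksm() */
};

static bool model_folio_trylock(struct model_folio *f)
{
	/* atomic_exchange() returns the old value: false means we took the lock */
	return !atomic_exchange(&f->locked, true);
}

static void model_folio_unlock(struct model_folio *f)
{
	assert(atomic_load(&f->locked));	/* must be held when unlocking */
	atomic_store(&f->locked, false);
}

/* Current rule: non-KSM anon folios are walked without the folio lock. */
static bool current_needs_trylock(const struct model_folio *f, bool is_locked)
{
	return !is_locked && (!f->anon || f->ksm);
}

/* Proposed rule: any folio the caller has not already locked is try-locked. */
static bool proposed_needs_trylock(bool is_locked)
{
	return !is_locked;
}

/*
 * Returns 1 when the trylock fails, mirroring the early "return 1" in the
 * snippet quoted above. Returns 0 otherwise, as a placeholder for the
 * result of the walk, which this model does not perform.
 */
static int model_referenced_proposed(struct model_folio *f, bool is_locked)
{
	bool we_locked = false;

	if (proposed_needs_trylock(is_locked)) {
		we_locked = model_folio_trylock(f);
		if (!we_locked)
			return 1;
	}

	/* the rmap walk would run here, with the folio lock held */

	if (we_locked)
		model_folio_unlock(f);
	return 0;
}
```

In this model a non-KSM anon folio is skipped by
current_needs_trylock() but covered by proposed_needs_trylock(), so an
updater of folio->mapping that holds the folio lock would be enough to
keep the walk out.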

Thanks,
Lokesh
