Message-ID: <CAGsJ_4xccre0rz5zgRTA=NbFzF4FLS-ZUohgLFnfTGY9Jdequg@mail.gmail.com>
Date: Thu, 21 Aug 2025 20:01:54 +0800
From: Barry Song <21cnbao@...il.com>
To: Lokesh Gidra <lokeshgidra@...gle.com>
Cc: "open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
David Hildenbrand <david@...hat.com>, Suren Baghdasaryan <surenb@...gle.com>,
Kalesh Singh <kaleshsingh@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>,
android-mm <android-mm@...gle.com>, linux-kernel <linux-kernel@...r.kernel.org>,
Jann Horn <jannh@...gle.com>
Subject: Re: [RFC] Unconditionally lock folios when calling rmap_walk()
On Thu, Aug 21, 2025 at 12:29 PM Lokesh Gidra <lokeshgidra@...gle.com> wrote:
>
> Adding linux-mm mailing list. Mistakenly used the wrong email address.
>
> On Wed, Aug 20, 2025 at 9:23 PM Lokesh Gidra <lokeshgidra@...gle.com> wrote:
> >
> > Hi all,
> >
> > Currently, some callers of rmap_walk() conditionally avoid try-locking
> > non-ksm anon folios. This necessitates serialization through anon_vma
> > write-lock when folio->mapping and/or folio->index (fields involved in
> > rmap_walk()) are to be updated. This hurts scalability due to coarse
> > granularity of the lock. For instance, when multiple threads invoke
> > userfaultfd’s MOVE ioctl simultaneously to move distinct pages from
> > the same src VMA, they all contend for the corresponding anon_vma’s
> > lock. Field traces for arm64 android devices reveal over 30ms of
> > uninterruptible sleep in the main UI thread, leading to janky user
> > interactions.
> >
> > Among all rmap_walk() callers that don’t lock anon folios,
> > folio_referenced() is the most critical (others are
> > page_idle_clear_pte_refs(), damon_folio_young(), and
> > damon_folio_mkold()). The relevant code in folio_referenced() is:
> >
> >         if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
> >                 we_locked = folio_trylock(folio);
> >                 if (!we_locked)
> >                         return 1;
> >         }
> >
> > It’s unclear why locking anon_vma (when updating folio->mapping) is
> > beneficial over locking the folio here. It’s in the reclaim path, so
> > should not be a critical path that necessitates some special
> > treatment, unless I’m missing something.
> >
> > Therefore, I propose simplifying the locking mechanism by
> > unconditionally try-locking the folio in such cases. This helps avoid
> > locking anon_vma when updating folio->mapping, which, for instance,
> > will help eliminate the uninterruptible sleep observed in the field
> > traces mentioned earlier. Furthermore, it enables us to simplify the
> > code in folio_lock_anon_vma_read() by removing the re-check to ensure
> > that the field hasn’t changed under us.

Thanks, I’m personally quite interested in this topic and will take a
closer look as well. Beyond the userfaultfd MOVE case, we’ve also observed
severe anon_vma lock contention among fork, unmap (process exit), and
memory reclaim. This has caused noticeable UI stutters, especially
when many VMAs share the same anon_vma root.
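
For reference, the difference between the quoted folio_referenced() check
and the proposed unconditional try-lock can be sketched as a small
userspace model. This is only an illustration under stated assumptions:
struct folio here is a hypothetical stand-in with three fields, and
lock_for_walk_current(), lock_for_walk_proposed(), and check_model() are
made-up names, not kernel functions. Only the condition in
lock_for_walk_current() is taken from the quoted snippet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct folio. */
struct folio {
	bool anon;          /* models folio_test_anon() */
	bool ksm;           /* models folio_test_ksm() */
	atomic_bool locked; /* models the folio lock bit */
};

/* Returns true if we took the lock, false if it was already held. */
static bool folio_trylock(struct folio *f)
{
	return !atomic_exchange(&f->locked, true);
}

static void folio_unlock(struct folio *f)
{
	atomic_store(&f->locked, false);
}

/*
 * Quoted behaviour: only file and KSM folios are try-locked; non-KSM
 * anon folios proceed to the walk without the folio lock.
 * Returns false when the caller should bail out (the "return 1" path).
 */
static bool lock_for_walk_current(struct folio *f, bool is_locked,
				  bool *we_locked)
{
	*we_locked = false;
	if (!is_locked && (!f->anon || f->ksm)) {
		*we_locked = folio_trylock(f);
		if (!*we_locked)
			return false;
	}
	return true;
}

/*
 * Proposed behaviour (as I read the RFC): try-lock every folio the
 * caller has not already locked, regardless of its type.
 */
static bool lock_for_walk_proposed(struct folio *f, bool is_locked,
				   bool *we_locked)
{
	*we_locked = false;
	if (!is_locked) {
		*we_locked = folio_trylock(f);
		if (!*we_locked)
			return false;
	}
	return true;
}

/* A non-KSM anon folio whose lock is held by someone else. */
static void check_model(void)
{
	struct folio f = { .anon = true, .ksm = false };
	bool we_locked, proceed;

	atomic_init(&f.locked, true);

	/* Current: the walk proceeds without holding the folio lock. */
	proceed = lock_for_walk_current(&f, false, &we_locked);
	assert(proceed && !we_locked);

	/* Proposed: the try-lock fails, so the walk is skipped. */
	proceed = lock_for_walk_proposed(&f, false, &we_locked);
	assert(!proceed && !we_locked);

	folio_unlock(&f);
}
```

In this model, a writer that updates folio->mapping while holding the
folio lock is excluded from the walk only in the proposed variant, which
is what would let it skip the anon_vma write-lock.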
Thanks
Barry