lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aD9I-y6lDHM92zUm@x1.local>
Date: Tue, 3 Jun 2025 15:11:55 -0400
From: Peter Xu <peterx@...hat.com>
To: David Hildenbrand <david@...hat.com>
Cc: Oscar Salvador <osalvador@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Muchun Song <muchun.song@...ux.dev>,
	James Houghton <jthoughton@...gle.com>,
	Gavin Guo <gavinguo@...lia.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/3] mm, hugetlb: Clean up locking in hugetlb_fault
 and hugetlb_wp

On Tue, Jun 03, 2025 at 07:19:13PM +0200, David Hildenbrand wrote:
> > > As stated elsewhere, the mapcount check + folio_move_anon_rmap need the
> > > folio lock.
> > 
> > Could you elaborate what would go wrong if we do folio_move_anon_rmap()
> > without folio lock here?  Just to make sure we're on the same page: we
> > already have pgtable lock held, and we decided to reuse an anonymous
> > hugetlb page.
> 
> For now we have
> 
> VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
> 
> right at the beginning of folio_move_anon_rmap().
> 
> That dates back to
> 
> commit c44b674323f4a2480dbeb65d4b487fa5f06f49e0
> Author: Rik van Riel <riel@...hat.com>
> Date:   Fri Mar 5 13:42:09 2010 -0800
> 
>     rmap: move exclusively owned pages to own anon_vma in do_wp_page()
>     When the parent process breaks the COW on a page, both the original which
>     is mapped at child and the new page which is mapped parent end up in that
>     same anon_vma.  Generally this won't be a problem, but for some workloads
>     it could preserve the O(N) rmap scanning complexity.
>     A simple fix is to ensure that, when a page which is mapped child gets
>     reused in do_wp_page, because we already are the exclusive owner, the page
>     gets moved to our own exclusive child's anon_vma.
> 
> 
> My recollection is that the folio lock protects folio->mapping. So relevant rmap walks

Yes, I had similar impression but only for file, as the comment discussed
in rmap_walk_file().  For anonymous, it was always not clear to me, as at
least rmap walk anon doesn't seem to need folio lock in some special paths
like damon/page_idle/folio_referenced.

[1]

> that hold the folio lock can assume that folio->mapping and
> thereby folio_anon_vma() cannot change.
> 
> folio_lock_anon_vma_read() documents something regarding the folio lock protecting the
> anon_vma.

Right, I remember that change, though I was expecting the comment was
referring to the assert(locked) above.  Unfortunately we didn't have more
clue on the folio lock, even though the change itself makes perfect sense
regardless, to double check anon_vma from changing (after UFFDIO_MOVE).

> 
> I can only speculate that locking the folio is cheaper than locking the relevant anon_vma, and
> that rmap code depends on that.

I see this as two separate things to protect: folio->mapping, and the
anon_vma tree inside of it.

For now it looks like we're "almost" using folio lock to protect
folio->mapping for anon, however we could still read folio->mapping without
folio lock, per discussed above [1].  Below commit should be another sign
of that, where Alex mentioned the WRITE_ONCE needed for page_idle.

IOW, even with folio lock held, something can be reading folio->mapping,
and further walking the anon_vma..

I had a long-standing feeling that it works out only because anon_vma
updates can only happen within parent/child processes, so they're
internally holding the same anon_vma lock (anon_vma_trylock_read, taking
the root lock).  Hence even if a race happened it'll still take the same
lock.

I think it means as long as we decided to reuse an anon page (hugetlb or
not, as long as holding the pgtable lock), updating folio->mapping
with/without the lock should keep working.. But maybe I missed something.
And it may not be extremely important either so far; taking the lock
doesn't seem to be bad anyway.  It's only some confusion I never figured
out myself.

Thanks,

> 
> 
> I'll note that in the introducing commit we didn't use the WRITE_ONCE, though. That was added in
> 
> commit 16f5e707d6f6f7644ff07e583b8f18c3dcc5499f
> Author: Alex Shi <alexs@...nel.org>
> Date:   Tue Dec 15 12:33:42 2020 -0800
> 
>     mm/rmap: stop store reordering issue on page->mapping
> 
> But I don't think that the folio lock was a replacement to that WRITE_ONCE.

-- 
Peter Xu


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ