[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aEychl8ZkJDG1-5K@localhost.localdomain>
Date: Fri, 13 Jun 2025 23:47:50 +0200
From: Oscar Salvador <osalvador@...e.de>
To: David Hildenbrand <david@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>,
James Houghton <jthoughton@...gle.com>,
Peter Xu <peterx@...hat.com>, Gavin Guo <gavinguo@...lia.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in
the faulting path
On Fri, Jun 13, 2025 at 09:57:23PM +0200, David Hildenbrand wrote:
> What I meant is:
>
> Assume we have a pagecache page mapped into our page tables R/O (MAP_PRIVATE
> mapping).
>
> During a write fault on such a pagecache page, we end up in
> do_wp_page()->wp_page_copy() we perform the copy via __wp_page_copy_user()
> without the folio lock.
Yes, it would be similar to doing
hugetlb_fault()->hugetlb_no_page() which would map it R/O.
Then, if we write to it, we will go to hugetlb_wp().
Since it is a private mapping, we would only need to lock the folio to
see if we can re-use it (the wp_can_reuse_anon_folio() analog to
hugetlb).
> In wp_page_copy(), we retake the pt lock, to make sure that the page is
> still mapped (pte_same). If the page is no longer mapped, we retry the
> fault.
>
> In that case, we only want to make sure that the folio is still mapped after
> possibly dropping the page table lock in between.
>
> As we are holding an additional folio reference in
> do_wp_page()->wp_page_copy(), the folio cannot get freed concurrently.
>
>
> There is indeed the do_cow_fault() path where we avoid faulting in the
> pagecache page in the first place. So no page table reference, an I can
> understand why we would need the folio lock there.
But do_cow_fault() does take a reference via __do_fault()->filemap_fault().
> Regarding hugetlb_no_page(): I think we could drop the folio lock for a
> pagecache folio after inserting the folio into the page table. Just like
> do_wp_page()->wp_page_copy(), we would have to verify again under PTL if the
> folio is still mapped
>
> ... which we already do through pte_same() checks?
But that is somewhat similar what we do in the generic faulting path.
Assume you fault in a file for a private mapping and do COW.
So, do_pte_missing()->do_fault()->do_cow_fault().
do_cow_fault()->__do_fault() will a) get a reference and b) lock the folio.
And then we will proceed with copying the file to the page we will map privately.
This would be something like hugetlb_fault()->hugetlb_no_page()->hugetlb_wp().
So we have to hold the lock throughout hugetlb_wp() for file pages we are copying
to private mappings.
Now, let us assume you map the file R/O. And after a while you write-fault to it.
In the generic faulting path, that will go through:
do_pte_missing()->do_fault()->do_read_fault()
do_wp_page()->wp_page_copy()
wp_page_copy(), which indeed doesn't hold the lock (but takes a reference).
Maybe it's because it's Friday, but I'm confused as to why
do_pte_missing()->do_fault()->do_cow_fault() holds the lock while do_wp_page() doesn't
although it might the file's page we have to copy.
--
Oscar Salvador
SUSE Labs
Powered by blists - more mailing lists