Message-ID: <ffeeb3d2-0e45-43d1-b2e1-a55f09b160f5@redhat.com>
Date: Fri, 13 Jun 2025 21:57:23 +0200
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>, James Houghton <jthoughton@...gle.com>,
Peter Xu <peterx@...hat.com>, Gavin Guo <gavinguo@...lia.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in
the faulting path
On 13.06.25 16:23, Oscar Salvador wrote:
> On Fri, Jun 13, 2025 at 03:56:15PM +0200, David Hildenbrand wrote:
>> On 12.06.25 15:46, Oscar Salvador wrote:
>>> - /* hugetlb_wp() requires page locks of pte_page(vmf.orig_pte) */
>>> + /*
>>> + * We need to lock the folio before calling hugetlb_wp().
>>> + * Either the folio is in the pagecache and we need to copy it over
>>> + * to another file, so it must remain stable throughout the operation,
>>
>> But as discussed, why is that the case? We don't need that for ordinary
>> pages, and existing folio mappings can already concurrently modify the page?
>
> The normal faulting path takes the lock when we fault in a file read-only
> or map it privately.
> That is done via __do_fault or cow_fault, in __do_fault()->vma->vm_ops->fault().
> E.g. filemap_fault() will locate the page and lock it.
> And it will hold it during the entire operation, note that we unlock it
> after we have called finish_fault().
>
> The page can't go away because filemap_fault also gets a reference on
> it, so I guess it's to hold it stable.
>
What I meant is:
Assume we have a pagecache page mapped into our page tables R/O
(MAP_PRIVATE mapping).
During a write fault on such a pagecache page, we end up in
do_wp_page()->wp_page_copy(), where we perform the copy via
__wp_page_copy_user() without the folio lock.
In wp_page_copy(), we retake the page table lock (PTL) to make sure
that the page is still mapped (pte_same). If the page is no longer
mapped, we retry the fault.
In that case, we only want to make sure that the folio is still mapped
after possibly dropping the page table lock in between.
As we are holding an additional folio reference in
do_wp_page()->wp_page_copy(), the folio cannot get freed concurrently.
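
To make the sequence above concrete, here is a toy userspace model of
"pin, copy without the folio lock, recheck the PTE". It is only a sketch
and not kernel code: every toy_* name is hypothetical, the PTL is implied
by comments rather than modelled, and the optional racer callback stands
in for a concurrent unmap happening while the PTL is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Hypothetical stand-ins for a folio and a PTE; not kernel types. */
struct toy_folio {
	int refcount;
	unsigned char data[TOY_PAGE_SIZE];
};

struct toy_pte {
	struct toy_folio *folio;	/* NULL means "not mapped" */
	bool writable;
};

enum toy_result { TOY_DONE, TOY_RETRY };

/* Analogue of a pte_same() check: did the entry change under us? */
static bool toy_pte_same(struct toy_pte a, struct toy_pte b)
{
	return a.folio == b.folio && a.writable == b.writable;
}

/* Simulated concurrent unmap: drops the mapping's folio reference. */
static void toy_unmap(struct toy_pte *ptep)
{
	ptep->folio->refcount--;
	ptep->folio = NULL;
	ptep->writable = false;
}

/*
 * Toy write-fault copy. 'orig' is the PTE value sampled at fault time.
 * 'racer', if non-NULL, runs in the window where the PTL would be dropped.
 */
static enum toy_result toy_wp_copy(struct toy_pte *ptep, struct toy_pte orig,
				   struct toy_folio *new_folio,
				   void (*racer)(struct toy_pte *))
{
	struct toy_folio *old = orig.folio;
	enum toy_result res = TOY_RETRY;

	assert(old != NULL && old->refcount > 0);

	/* "PTL held": take an extra reference so 'old' cannot be freed. */
	old->refcount++;

	/* "PTL dropped": copy the contents without any folio lock. */
	if (racer)
		racer(ptep);
	memcpy(new_folio->data, old->data, TOY_PAGE_SIZE);

	/* "PTL retaken": install the copy only if the PTE is unchanged. */
	if (toy_pte_same(*ptep, orig)) {
		ptep->folio = new_folio;
		ptep->writable = true;
		new_folio->refcount++;
		old->refcount--;	/* the old mapping's reference */
		res = TOY_DONE;
	}

	old->refcount--;		/* drop the extra reference */
	return res;
}
```

Calling toy_wp_copy() with racer == NULL models the uncontended case and
installs the copy; passing toy_unmap as the racer makes the recheck fail,
so the caller gets TOY_RETRY while the old folio stayed pinned throughout.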
There is indeed the do_cow_fault() path where we avoid faulting in the
pagecache page in the first place. So no page table reference, and I can
understand why we would need the folio lock there.
Regarding hugetlb_no_page(): I think we could drop the folio lock for a
pagecache folio after inserting the folio into the page table. Just like
do_wp_page()->wp_page_copy(), we would have to verify again under PTL if
the folio is still mapped
... which we already do through pte_same() checks?
--
Cheers,
David / dhildenb