Message-ID: <ffeeb3d2-0e45-43d1-b2e1-a55f09b160f5@redhat.com>
Date: Fri, 13 Jun 2025 21:57:23 +0200
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Muchun Song <muchun.song@...ux.dev>, James Houghton <jthoughton@...gle.com>,
 Peter Xu <peterx@...hat.com>, Gavin Guo <gavinguo@...lia.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/5] mm,hugetlb: Document the reason to lock the folio in
 the faulting path

On 13.06.25 16:23, Oscar Salvador wrote:
> On Fri, Jun 13, 2025 at 03:56:15PM +0200, David Hildenbrand wrote:
>> On 12.06.25 15:46, Oscar Salvador wrote:
>>> -	/* hugetlb_wp() requires page locks of pte_page(vmf.orig_pte) */
>>> +	/*
>>> +	 * We need to lock the folio before calling hugetlb_wp().
>>> +	 * Either the folio is in the pagecache and we need to copy it over
>>> +	 * to another file, so it must remain stable throughout the operation,
>>
>> But as discussed, why is that the case? We don't need that for ordinary
>> pages, and existing folio mappings can already concurrently modify the page?
> 
> Normal faulting path takes the lock when we fault in a file read-only or
> to map it privately.
> That is done via __do_fault or cow_fault, in __do_fault()->vma->vm_ops->fault().
> E.g. filemap_fault() will locate the page and lock it.
> And it will hold it during the entire operation, note that we unlock it
> after we have called finish_fault().
> 
> The page can't go away because filemap_fault also gets a reference on
> it, so I guess it's to hold it stable.
> 

What I meant is:

Assume we have a pagecache page mapped into our page tables R/O 
(MAP_PRIVATE mapping).

During a write fault on such a pagecache page, we end up in 
do_wp_page()->wp_page_copy(), where we perform the copy via 
__wp_page_copy_user() without the folio lock.

In wp_page_copy(), we retake the pt lock, to make sure that the page is 
still mapped (pte_same). If the page is no longer mapped, we retry the 
fault.

In that case, we only want to make sure that the folio is still mapped 
after possibly dropping the page table lock in between.

As we are holding an additional folio reference in 
do_wp_page()->wp_page_copy(), the folio cannot get freed concurrently.
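
To make the sequence above concrete, here is a minimal userspace sketch of 
the pattern: take an extra reference, copy without any folio lock, then 
recheck the mapping under the page table lock and retry if it changed. 
This is not kernel code. Every sketch_* name is hypothetical, the "lock" is 
a single-threaded stand-in for the PTL, and the pointer comparison only 
plays the role of the pte_same() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64

/* Hypothetical stand-in for a folio: a refcount plus its contents. */
struct sketch_folio {
	int refcount;
	char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical stand-in for one PTE guarded by a page table lock. */
struct sketch_pte {
	bool ptl_held;              /* models the PTL, single-threaded only */
	struct sketch_folio *folio; /* mapped folio, NULL if unmapped */
	bool writable;
};

enum sketch_result { SKETCH_DONE, SKETCH_RETRY };

static void sketch_ptl_lock(struct sketch_pte *pte)
{
	assert(!pte->ptl_held);
	pte->ptl_held = true;
}

static void sketch_ptl_unlock(struct sketch_pte *pte)
{
	assert(pte->ptl_held);
	pte->ptl_held = false;
}

/* Simulated concurrent unmap; drops the reference held by the mapping. */
static void sketch_unmap(struct sketch_pte *pte)
{
	sketch_ptl_lock(pte);
	if (pte->folio) {
		pte->folio->refcount--;
		pte->folio = NULL;
	}
	sketch_ptl_unlock(pte);
}

/*
 * Copy-on-write without a folio lock. "race" is an optional hook that runs
 * while the PTL is dropped, to simulate a racing change to the mapping.
 */
static enum sketch_result sketch_wp_copy(struct sketch_pte *pte,
					 struct sketch_folio *new_folio,
					 void (*race)(struct sketch_pte *))
{
	struct sketch_folio *old;

	sketch_ptl_lock(pte);
	old = pte->folio;
	if (!old) {
		sketch_ptl_unlock(pte);
		return SKETCH_RETRY;
	}
	old->refcount++;	/* extra reference: old cannot be freed */
	sketch_ptl_unlock(pte);

	/* The copy itself runs with neither the PTL nor a folio lock held. */
	memcpy(new_folio->data, old->data, SKETCH_PAGE_SIZE);
	if (race)
		race(pte);
	assert(old->refcount > 0);	/* our reference kept it alive */

	sketch_ptl_lock(pte);
	if (pte->folio != old) {	/* analogue of a failed pte_same() */
		old->refcount--;	/* drop only our extra reference */
		sketch_ptl_unlock(pte);
		return SKETCH_RETRY;
	}
	pte->folio = new_folio;
	pte->writable = true;
	new_folio->refcount++;		/* reference held by the new mapping */
	old->refcount -= 2;		/* our extra ref plus the old mapping's */
	sketch_ptl_unlock(pte);
	return SKETCH_DONE;
}
```

Calling sketch_wp_copy() with a NULL hook models the uncontended fault and 
installs the copy. Passing sketch_unmap as the hook models the folio being 
unmapped while the PTL is dropped: the recheck fails and the caller is told 
to retry, which is the only thing the recheck has to guarantee here.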


There is indeed the do_cow_fault() path where we avoid faulting in the 
pagecache page in the first place. So no page table reference, and I can 
understand why we would need the folio lock there.


Regarding hugetlb_no_page(): I think we could drop the folio lock for a 
pagecache folio after inserting the folio into the page table. Just like 
do_wp_page()->wp_page_copy(), we would have to verify again under PTL if 
the folio is still mapped

... which we already do through pte_same() checks?

-- 
Cheers,

David / dhildenb

