Message-ID: <35b03911-74c3-4626-aaa8-4c331c086f8f@redhat.com>
Date: Mon, 23 Jun 2025 16:11:38 +0200
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Muchun Song <muchun.song@...ux.dev>, Peter Xu <peterx@...hat.com>,
Gavin Guo <gavinguo@...lia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/5] mm,hugetlb: Sort out folio locking in the faulting path

On 20.06.25 14:30, Oscar Salvador wrote:
> Recent conversations showed that there was a misunderstanding about why we
> were locking the folio prior to calling hugetlb_wp().
> In fact, as soon as we have the folio mapped into the pagetables, we no longer
> need to hold it locked, because we know that no concurrent truncation could have
> happened.
> There is only one case where the folio needs to be locked, and that is when we
> are handling an anonymous folio, because hugetlb_wp() will check whether it can
> re-use it exclusively for the process that is faulting it in.
>
> So, pass the folio locked to hugetlb_wp() when that is the case.
>
> Suggested-by: David Hildenbrand <david@...hat.com>
> Signed-off-by: Oscar Salvador <osalvador@...e.de>
> ---
> mm/hugetlb.c | 43 +++++++++++++++++++++++++++++++++----------
> 1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 175edafeec67..1a5f713c1e4c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6437,6 +6437,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
> pte_t new_pte;
> bool new_folio, new_pagecache_folio = false;
> u32 hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
> + bool folio_locked = true;
>
> /*
> * Currently, we are forced to kill the process in the event the
> @@ -6602,6 +6603,11 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
>
> hugetlb_count_add(pages_per_huge_page(h), mm);
> if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
> + /* No need to lock file folios. See comment in hugetlb_fault() */
> + if (!anon_rmap) {
> + folio_locked = false;
> + folio_unlock(folio);
> + }
> /* Optimization, do the COW without a second fault */
> ret = hugetlb_wp(vmf);
> }
> @@ -6616,7 +6622,8 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
> if (new_folio)
> folio_set_hugetlb_migratable(folio);
>
> - folio_unlock(folio);
> + if (folio_locked)
> + folio_unlock(folio);
> out:
> hugetlb_vma_unlock_read(vma);
>
> @@ -6636,7 +6643,8 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
> if (new_folio && !new_pagecache_folio)
> restore_reserve_on_error(h, vma, vmf->address, folio);
>
> - folio_unlock(folio);
> + if (folio_locked)
> + folio_unlock(folio);
> folio_put(folio);
> goto out;
> }
> @@ -6670,7 +6678,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> {
> vm_fault_t ret;
> u32 hash;
> - struct folio *folio;
> + struct folio *folio = NULL;
> struct hstate *h = hstate_vma(vma);
> struct address_space *mapping;
> struct vm_fault vmf = {
> @@ -6687,6 +6695,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> * be hard to debug if called functions make assumptions
> */
> };
> + bool folio_locked = false;
>
> /*
> * Serialize hugepage allocation and instantiation, so that we don't
> @@ -6801,13 +6810,24 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> /* Fallthrough to CoW */
> }
>
> - /* hugetlb_wp() requires page locks of pte_page(vmf.orig_pte) */
> - folio = page_folio(pte_page(vmf.orig_pte));
> - folio_lock(folio);
> - folio_get(folio);
> -
> if (flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) {
> if (!huge_pte_write(vmf.orig_pte)) {
> + /*
> + * Anonymous folios need to be locked since hugetlb_wp()
> + * checks whether we can re-use the folio exclusively
> + * for us in case we are the only user of it.
> + */
Should we move that comment to hugetlb_wp() instead? And if we are
already doing this PTL unlock dance now, why not do it in hugetlb_wp()
instead so we can simplify this code?
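
To make the suggestion concrete, here is a rough userspace sketch of what
"take the lock inside hugetlb_wp()" could look like. It is not kernel code
and not taken from the patch: struct folio_model, model_folio_lock(),
model_folio_unlock() and model_hugetlb_wp() are hypothetical stand-ins. The
model also leaves out the step that real code would need around a sleeping
folio lock, namely dropping the page table lock and revalidating the PTE
afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct folio. */
struct folio_model {
	bool anon;	/* anonymous folio (true) or file/pagecache folio (false) */
	bool locked;	/* models the folio lock */
	int mapcount;	/* number of mappings of this folio */
};

static void model_folio_lock(struct folio_model *folio)
{
	assert(!folio->locked);
	folio->locked = true;
}

static void model_folio_unlock(struct folio_model *folio)
{
	assert(folio->locked);
	folio->locked = false;
}

/*
 * Model of a write-protect fault handler that owns the locking decision.
 * Callers pass the folio unlocked. Only anonymous folios are locked,
 * because only they go through the "can we reuse it exclusively?" check.
 *
 * Returns true if the folio can be reused in place, false if a copy
 * would be needed.
 */
static bool model_hugetlb_wp(struct folio_model *folio)
{
	bool reuse;

	/* File folios: already mapped, no lock needed, always copy. */
	if (!folio->anon)
		return false;

	/* Real code would drop the PTL here and recheck the PTE after. */
	model_folio_lock(folio);
	reuse = (folio->mapcount == 1);
	model_folio_unlock(folio);

	return reuse;
}
```

With the lock handled inside the handler, callers in this model would not
need a folio_locked flag at all, which is the simplification being asked
about.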
--
Cheers,
David / dhildenb