[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aDc4uO_Vq-q7ks5h@x1.local>
Date: Wed, 28 May 2025 12:24:24 -0400
From: Peter Xu <peterx@...hat.com>
To: James Houghton <jthoughton@...gle.com>
Cc: Oscar Salvador <osalvador@...e.de>,
David Hildenbrand <david@...hat.com>,
Gavin Guo <gavinguo@...lia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, muchun.song@...ux.dev,
akpm@...ux-foundation.org, mike.kravetz@...cle.com,
kernel-dev@...lia.com, stable@...r.kernel.org,
Hugh Dickins <hughd@...gle.com>, Florent Revest <revest@...gle.com>,
Gavin Shan <gshan@...hat.com>
Subject: Re: [PATCH v3] mm/hugetlb: fix a deadlock with pagecache_folio and
hugetlb_fault_mutex_table
On Wed, May 28, 2025 at 12:14:28PM -0400, James Houghton wrote:
[...]
> > > For 2) I am also not sure if we need need the pagecache folio locked; I
> > > doubt it ... but this code is not the easiest to follow.
> >
> > I have been staring at that code and thinking about potential scenarios
> > for a few days now, and I cannot convice myself that we need
> > pagecache_folio's lock when pagecache_folio != old_folio because as a
> > matter of fact I cannot think of anything it protects us against.
>
> Hi Oscar,
Hey, James,
>
> Have you thought about the UFFDIO_CONTINUE case (hugetlb_mfill_atomic_pte())?
>
> I'm slightly concerned that, if you aren't holding pagecache_folio's
> lock, there might be issues where hugetlb_mfill_atomic_pte() proceeds
> to map a hugetlb page that it is not supposed to. (For example, if the
> fault handler does not generally hold pagecache_folio's lock,
> hugetlb_mfill_atomic_pte() will see a page in the pagecache and map
> it, even though it may not have been zeroed yet.)
>
> I haven't had enough time to fully think through this case, but just
> want to make sure it has been considered.
AFAIU we're talking about two separate code paths. IIUC you're talking
about a fresh new hugetlb folio being allocated, but then that's what
hugetlb_no_page() does. Folio lock required there.
Here IIUC Oscar's context is only in hugetlb_wp() where there's a niche use
case to compare whether a VM_PRIVATE has already CoWed once from a
pagecache, and whether we need the folio lock for the pagecache lookup.
Aka, this one:
if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) &&
!(vma->vm_flags & VM_MAYSHARE) && !huge_pte_write(vmf.orig_pte)) {
if (vma_needs_reservation(h, vma, vmf.address) < 0) {
ret = VM_FAULT_OOM;
goto out_mutex;
}
/* Just decrements count, does not deallocate */
vma_end_reservation(h, vma, vmf.address);
pagecache_folio = filemap_lock_hugetlb_folio(h, mapping, <---
vmf.pgoff);
if (IS_ERR(pagecache_folio))
pagecache_folio = NULL;
}
Thanks,
--
Peter Xu
Powered by blists - more mailing lists