[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250630144212.156938-1-osalvador@suse.de>
Date: Mon, 30 Jun 2025 16:42:07 +0200
From: Oscar Salvador <osalvador@...e.de>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: David Hildenbrand <david@...hat.com>,
Muchun Song <muchun.song@...ux.dev>,
Peter Xu <peterx@...hat.com>,
Gavin Guo <gavinguo@...lia.com>,
linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Oscar Salvador <osalvador@...e.de>
Subject: [PATCH v4 0/5] Misc rework on hugetlb faulting path
v3 -> v4:
- Fix the deadlock for real in patch#1 (Kudos to Galving for testing out v3)
- Keep trylock as the original code does
- Add reproducer in the cover letter
v2 -> v3:
- Addressed issue folio_lock when holding spinlock (per David)
- Simplify new_anon_folio (per David)
- Slightly rework patch#2 to make it simpler
v1 -> v2:
- Addressed feedback from David
- Settle ideas wrt. locking in faulting path after
discussion with David
- Add Acks-by
RFC -> v1:
- Stop looking up folio in the pagecache for detecting a COW
on a private mapping.
- Document the locks
This patchset aims to give some love to the hugetlb faulting path, doing so
by removing obsolete comments that are no longer true, sorting out the folio
lock, and changing the mechanism we use to determine whether we are COWing a
private mapping already.
The most important patch of the series is #1, as it fixes a deadlock that
was described in [1], where two processes were holding the same lock
for the folio in the pagecache, and then deadlocked in the mutex.
Note that this can also happen for anymous folios. This has been tested using
this reproducer [2].
Looking up and locking the folio in the pagecache was done to check whether
that folio was the same folio we had mapped in our pagetables, meaning that if it
was different we knew that we already mapped that folio privately, so any
further CoW would be made on a private mapping, which lead us to the question:
__Was the reservation for that address consumed?__
That is all we care about, because if it was indeed consumed and we are the
owner and we cannot allocate more folios, we need to unmap the folio from the
processes pagetables and make it exclusive for us.
We figured we do not need to look up the folio at all, and it is just enough to
check whether the folio we have mapped is anonymous, which means we mapped it
privately, so the reservation was indeed consumed.
Patch#2 sorts out folio locking in the faulting path, reducing the scope of it
,only taking it when we are dealing with an anonymous folio and document it.
More details in the patch.
Patch#3-5 are cleanups.
[1] https://lore.kernel.org/lkml/20250513093448.592150-1-gavinguo@igalia.com/
[2] Here is the reproducer:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>
#define PROTECTION (PROT_READ | PROT_WRITE)
#define LENGTH (2UL*1024*1024)
#define ADDR (void *)(0x0UL)
#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
void __read(char *addr)
{
int i = 0;
printf("a[%d]: %c\n", i, addr[i]);
}
void fill(char *addr)
{
addr[0] = 'd';
printf("addr: %c\n", addr[0]);
}
int main(void)
{
void *addr;
pid_t pid, wpid;
int status;
addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, -1, 0);
if (addr == MAP_FAILED) {
perror("mmap");
return -1;
}
printf("Parent faulting in RO\n");
__read(addr);
sleep (10);
printf("Forking\n");
pid = fork();
switch (pid) {
case -1:
perror("fork");
break;
case 0:
sleep (4);
printf("Child: Faulting in\n");
fill(addr);
exit(0);
break;
default:
printf("Parent: Faulting in\n");
fill(addr);
while((wpid = wait(&status)) > 0);
if (munmap(addr, LENGTH))
perror("munmap");
}
return 0;
}
You will also have to add a delay in hugetlb_wp, after releasing the mutex and
before unmapping, so the window is large enough to reproduce it reliably.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fda6b748e985..5601a9cf819b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -38,6 +38,7 @@
#include <linux/memory.h>
#include <linux/mm_inline.h>
#include <linux/padata.h>
+#include <linux/delay.h>
#include <asm/page.h>
#include <asm/pgalloc.h>
@@ -6261,6 +6262,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
hugetlb_vma_unlock_read(vma);
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ mdelay(8000);
+
unmap_ref_private(mm, vma, old_folio, vmf->address);
mutex_lock(&hugetlb_fault_mutex_table[hash]);
Oscar Salvador (5):
mm,hugetlb: change mechanism to detect a COW on private mapping
mm,hugetlb: sort out folio locking in the faulting path
mm,hugetlb: rename anon_rmap to new_anon_folio and make it boolean
mm,hugetlb: drop obsolete comment about non-present pte and second
faults
mm,hugetlb: drop unlikelys from hugetlb_fault
mm/hugetlb.c | 132 +++++++++++++++++++++++----------------------------
1 file changed, 60 insertions(+), 72 deletions(-)
--
2.50.0
Powered by blists - more mailing lists