Message-Id: <20230104225207.1066932-1-peterx@redhat.com>
Date: Wed, 4 Jan 2023 17:52:04 -0500
From: Peter Xu <peterx@...hat.com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: Mike Kravetz <mike.kravetz@...cle.com>,
Muchun Song <songmuchun@...edance.com>, peterx@...hat.com,
Nadav Amit <nadav.amit@...il.com>,
Andrea Arcangeli <aarcange@...hat.com>,
David Hildenbrand <david@...hat.com>,
James Houghton <jthoughton@...gle.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: [PATCH 0/3] mm/uffd: Fix missing markers on hugetlb

When James was developing the vma split fix for hugetlb pmd sharing, he
found that hugetlb uffd-wp is broken with the test case he developed [1]:

https://lore.kernel.org/r/CADrL8HWSym93=yNpTUdWebOEzUOTR2ffbfUk04XdK6O+PNJNoA@mail.gmail.com

Missing hugetlb pgtable pages caused uffd-wp to lose messages when a vma
split happened across a shared huge pmd range in the test.

The issue is that pgtable pre-allocation on the hugetlb path was
overlooked. That is fixed in patch 1.

Meanwhile there's another issue with proper reporting of pgtable allocation
failures during UFFDIO_WRITEPROTECT. When pgtable allocation fails during
ioctl(UFFDIO_WRITEPROTECT), we silently drop the error so the user cannot
detect it (even if it is extremely rare). This issue can happen not only on
hugetlb but also on shmem. Anon is not affected because anon doesn't
require pgtable allocation during wr-protection. Patch 2 prepares for such
a change, then patch 3 allows the error to be reported to the users.
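
For illustration, this is roughly how a userspace caller could observe the
failure once patch 3 is applied. It is a minimal sketch, not code from this
series: uffd_wp_range() is a hypothetical helper name, it assumes uffd is a
userfaultfd already registered in write-protect mode over the range, and it
assumes the failure surfaces as ENOMEM, as in the faked memory pressure
tests.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper (not part of this series): write-protect or
 * un-protect [addr, addr + len) through an existing userfaultfd.
 * Returns 0 on success or a negative errno value, so the caller can
 * tell -ENOMEM apart from other failures.
 */
static int uffd_wp_range(int uffd, void *addr, size_t len, bool protect)
{
	struct uffdio_writeprotect wp = {
		.range = {
			.start = (uint64_t)(uintptr_t)addr,
			.len = len,
		},
		/* MODE_WP sets the protection, mode 0 resolves it */
		.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0,
	};

	if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) == -1)
		return -errno;

	return 0;
}
```

A caller would check the return value, for example treating -ENOMEM as a
transient failure to retry or report, instead of assuming the range is
protected whenever the ioctl returns.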

This set only marks patch 1 to copy stable, because it's a real bug to be
fixed for all kernels 5.19+.

Patches 2-3 are an enhancement to handle pgtable allocation errors. The
failure was hardly ever hit in my past tests, even during heavy workloads,
but reporting it should make the interface clearer. Not copying stable for
patches 2-3 because of that. I'll prepare a man page update after patches
2-3 land.

Tested with:

  - James's reproducer above [1], which starts to pass once combined with
    the vma split fix:

    https://lore.kernel.org/r/20230101230042.244286-1-jthoughton@google.com

  - Faked memory pressure to make sure -ENOMEM is returned with both shmem
    and hugetlbfs

  - Some general uffd routines

Peter Xu (3):
  mm/hugetlb: Pre-allocate pgtable pages for uffd wr-protects
  mm/mprotect: Use long for page accountings and retval
  mm/uffd: Detect pgtable allocation failures

 include/linux/hugetlb.h       |  4 +-
 include/linux/mm.h            |  2 +-
 include/linux/userfaultfd_k.h |  2 +-
 mm/hugetlb.c                  | 21 +++++++--
 mm/mempolicy.c                |  4 +-
 mm/mprotect.c                 | 89 ++++++++++++++++++++++-------------
 mm/userfaultfd.c              | 16 +++++--
 7 files changed, 88 insertions(+), 50 deletions(-)

--
2.37.3