Message-ID: <20251014231501.2301398-1-peterx@redhat.com>
Date: Tue, 14 Oct 2025 19:14:57 -0400
From: Peter Xu <peterx@...hat.com>
To: linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Cc: Mike Rapoport <rppt@...nel.org>,
Muchun Song <muchun.song@...ux.dev>,
Nikita Kalyazin <kalyazin@...zon.com>,
Vlastimil Babka <vbabka@...e.cz>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
James Houghton <jthoughton@...gle.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
David Hildenbrand <david@...hat.com>,
Hugh Dickins <hughd@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Ujwal Kundur <ujwal.kundur@...il.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
peterx@...hat.com,
Oscar Salvador <osalvador@...e.de>,
Suren Baghdasaryan <surenb@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>
Subject: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
[based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
v4:
- Some cleanups within vma_can_userfault() [David]
- Rename uffd_get_folio() to minor_get_folio() [David]
- Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
This series is an alternative to the initial three patches of what Nikita
proposed here:
https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
It does not add any guest-memfd support yet, but it paves the way for it.
Here, the major goal is to let kernel modules, like guest-memfd, opt in to
any form of userfaultfd support. This alternative should hopefully be
cleaner, and it avoids leaking userfault details into vm_ops.fault().
It also means this series does not depend on anything. It's a pure
refactoring of userfaultfd internals to provide a generic API, so that
other types of files, especially RAM based, can support userfaultfd without
touching mm/ at all.
To achieve this, the series introduces a file operation called vm_uffd_ops.
The ops must be provided when a file type supports any form of userfaultfd.
With that, I moved both hugetlbfs and shmem over wherever possible. So far,
due to concerns about exposing a uffd_copy() API, MISSING faults are still
processed separately and can only be handled within mm/. Hugetlbfs keeps
its special paths untouched.
An example of shmem uffd_ops:
  static const struct vm_uffd_ops shmem_uffd_ops = {
          .supported_ioctls = BIT(_UFFDIO_COPY) |
                              BIT(_UFFDIO_ZEROPAGE) |
                              BIT(_UFFDIO_WRITEPROTECT) |
                              BIT(_UFFDIO_CONTINUE) |
                              BIT(_UFFDIO_POISON),
          .minor_get_folio  = shmem_uffd_get_folio,
  };
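To make the shape of the API a bit more concrete, here is a minimal
userspace mock of how a file type might fill in such an ops table and how a
caller could test for a supported ioctl. This is only a sketch, not code
from the series: the struct layout, the minor_get_folio signature, the
demo_* names and the uffd_ops_supports*() helpers are all assumptions made
for illustration, and the ioctl numbers are intended to mirror
include/uapi/linux/userfaultfd.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace mock only; none of this is the kernel's actual definition. */
#define BIT(n) (1UL << (n))

/* ioctl numbers, intended to mirror include/uapi/linux/userfaultfd.h */
enum {
	_UFFDIO_COPY		= 0x03,
	_UFFDIO_ZEROPAGE	= 0x04,
	_UFFDIO_WRITEPROTECT	= 0x06,
	_UFFDIO_CONTINUE	= 0x07,
	_UFFDIO_POISON		= 0x08,
};

struct inode;	/* opaque in this mock */
struct folio;	/* opaque in this mock */

/* Assumed layout: a bitmask of ioctls plus one hook for minor faults. */
struct vm_uffd_ops {
	unsigned long supported_ioctls;
	/* Hypothetical signature, chosen for this sketch. */
	int (*minor_get_folio)(struct inode *inode, unsigned long pgoff,
			       struct folio **foliop);
};

/* Hypothetical helper: does this file type advertise the given ioctl? */
static bool uffd_ops_supports(const struct vm_uffd_ops *ops,
			      unsigned int ioctl_nr)
{
	return ops && (ops->supported_ioctls & BIT(ioctl_nr));
}

/*
 * Hypothetical helper: treat minor-fault support as implied by
 * UFFDIO_CONTINUE plus a folio getter, rather than a separate feature flag.
 */
static bool uffd_ops_supports_minor(const struct vm_uffd_ops *ops)
{
	return uffd_ops_supports(ops, _UFFDIO_CONTINUE) &&
	       ops->minor_get_folio != NULL;
}

/* A made-up file type that has no page cache to look anything up in. */
static int demo_minor_get_folio(struct inode *inode, unsigned long pgoff,
				struct folio **foliop)
{
	(void)inode;
	(void)pgoff;
	*foliop = NULL;
	return -2;	/* stands in for -ENOENT */
}

/* The made-up file type opts in to minor faults only. */
static const struct vm_uffd_ops demo_uffd_ops = {
	.supported_ioctls = BIT(_UFFDIO_CONTINUE),
	.minor_get_folio  = demo_minor_get_folio,
};
```

With this mock, uffd_ops_supports(&demo_uffd_ops, _UFFDIO_CONTINUE) is
true while the same check for _UFFDIO_COPY is false, and a NULL ops pointer
reports no support at all.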
To show another sample, this is the patch that Nikita posted to implement
minor fault for guest-memfd (on top of older versions of this series):
https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/
No functional change is expected after the whole series is applied. There
might be some slightly stricter checks on uffd ops here and there in the
last patch, but those really shouldn't stand out to anyone.
For testing: besides the cross-compilation tests, I also ran uffd-stress in
a VM to measure any perf difference before/after the change, since the
static call becomes a pointer now. I really cannot measure any difference,
which is more or less expected.
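For readers less familiar with the trade-off being measured, the difference
between a static call and a call through an ops pointer can be sketched in
plain C as below. Everything here is hypothetical (demo_lookup, struct
demo_ops, lookup_direct, lookup_indirect); it only illustrates the two
dispatch styles and is not taken from the patches.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-type hook such as a folio lookup. */
static int demo_lookup(unsigned long pgoff)
{
	return (int)(pgoff % 7);
}

/* Hypothetical ops table holding the hook as a function pointer. */
struct demo_ops {
	int (*lookup)(unsigned long pgoff);
};

static const struct demo_ops demo_ops_table = {
	.lookup = demo_lookup,
};

/* Before: the caller names the callee, so the call target is static. */
static int lookup_direct(unsigned long pgoff)
{
	return demo_lookup(pgoff);
}

/* After: the caller only knows the ops table, so the call is indirect. */
static int lookup_indirect(const struct demo_ops *ops, unsigned long pgoff)
{
	if (!ops || !ops->lookup)
		return -1;	/* this type did not opt in */
	return ops->lookup(pgoff);
}
```

Both paths return the same value for the same input; the indirect one adds
a pointer load and an indirect branch, which is the kind of overhead the
uffd-stress comparison was looking for.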
Comments welcomed, thanks.
Peter Xu (4):
  mm: Introduce vm_uffd_ops API
  mm/shmem: Support vm_uffd_ops API
  mm/hugetlb: Support vm_uffd_ops API
  mm: Apply vm_uffd_ops API to core mm

 include/linux/mm.h            |   9 +++
 include/linux/userfaultfd_k.h |  75 +++++++++++----------
 mm/hugetlb.c                  |  18 +++++
 mm/shmem.c                    |  24 +++++++
 mm/userfaultfd.c              | 120 +++++++++++++++++++++++++++-------
 5 files changed, 189 insertions(+), 57 deletions(-)
--
2.50.1