[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aOVEDii4HPB6outm@x1.local>
Date: Tue, 7 Oct 2025 12:47:10 -0400
From: Peter Xu <peterx@...hat.com>
To: David Hildenbrand <david@...hat.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Axel Rasmussen <axelrasmussen@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
James Houghton <jthoughton@...gle.com>,
Nikita Kalyazin <kalyazin@...zon.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Ujwal Kundur <ujwal.kundur@...il.com>,
Mike Rapoport <rppt@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Michal Hocko <mhocko@...e.com>, Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>, Hugh Dickins <hughd@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
On Tue, Oct 07, 2025 at 06:14:01PM +0200, David Hildenbrand wrote:
> > > If so, I'd prefer that rather than introducing feature-backend flags,
> > > because I want to avoid introducing another different feature set to uffd.
> > >
> >
> > I was talking about uffd_features. I thought it was being renamed to
> > flags, not modes_supported. It was pretty late when I responded.
> >
> > FWIU, David was saying we don't need both of modes and ioctl listed in
> > the uffd_ops?
>
> Right, I would have abstracted the features to clean it up and avoid using
> VM_ flags in this interface.
>
> >
> > I was thinking that we could just put the features directly as function
> > pointers in the uffd_ops and check if they are NULL or not for
> > 'support'.
> >
> > ie:
> >
> > struct vm_uffd_ops hugetlb_uffd_ops = {
> > .missing = hugetlb_handle_userfault,
This is not needed because the logic to handle userfault isn't very type
sensitive. Hugetlb is the only one that differs very lightly, but again I
think we should better take hugetlbfs as special always as of now, per all
the previous discussions on hugetlb unifications.
> > .write_protect = mwriteprotect_range,
This is not needed. WP processing is the same.
> > .minor = hugetlb_handle_userfault_minor,
We can do that, but ultimately we do almost exactly the same for all memory
types except a fetch on the page cache. IMHO this will make it awkward..
because 99% of the minor hooks will do the same thing.. OTOH, it makes
more sense to me that the hook is defined to cover what is different on the
memory type.
I'll stop commenting on the rest.
> >
> > .mfill_atomic = hugetlb_mfill_atomic_pte,
> > .mfill_atomic_continue = ...
> > .mfill_zeropage = ...
> > .mfill_poison = ...
> > .mfill_copy = NULL, /* For example */
> > };
> >
> > Then mfill_atomic_copy() becomes:
> > {
> > /*
> > * Maybe some setup, used for all mfill operations from
> > * mfill_atomic()
> > */
> >
> > ...
> >
> > dst_vma = uffd_mfill_lock()
> > uffd_ops = vma_get_uffd_ops(vma);
> > if (!uffd_ops)
> > return false;
> >
> > if (!uffd_ops->mfill_copy) /* unlikely? */
> > return false;
> >
> > return uffd_ops->mfill_copy(dst_vma,..);
> > }
> >
> > This way is_vm_hugetlb_page() never really needs to be used because the
> > function pointer already makes that distinction.
> >
> > Right now, we have checks for hugetlb through other functions that "pass
> > off to appropriate routine", and we end up translating the
> > ioctl_supports into the function call eventually, anyways.
>
> Right, it would be great to get rid of that. I recall I asked for such a
> cleanup in RFC (or was it v1).
I didn't send RFC, likely you meant this reply in v1?
https://lore.kernel.org/all/0126fa5f-b5aa-4a17-80d6-d428105e45c7@redhat.com/
I agree that another special-purpose file (like implemented by
guest_memfd) would need that. But if we could get rid of
"hugetlb"/"shmem" special-casing in userfaultfd, it would be a
rasonable independent cleanup.
Get rid of hugetlbfs is still not my goal as of in this series.
OTOH, I generalized shmem and removed shmem.h header from userfaultfd, but
that was prior versions when with uffd_copy() and it was rejected.
What should I do now to move this series forward? Could anyone provide a
solid answer?
Thanks,
--
Peter Xu
Powered by blists - more mailing lists