Message-ID: <jdy364dcag4nabmtjvykvxd5xxxi37yz5a6hknnqjtea65jz3y@powzev34t255>
Date: Wed, 17 Sep 2025 11:29:01 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Peter Xu <peterx@...hat.com>
Cc: Nikita Kalyazin <kalyazin@...zon.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
David Hildenbrand <david@...hat.com>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Vlastimil Babka <vbabka@...e.cz>,
Muchun Song <muchun.song@...ux.dev>, Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
James Houghton <jthoughton@...gle.com>, Michal Hocko <mhocko@...e.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Oscar Salvador <osalvador@...e.de>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Ujwal Kundur <ujwal.kundur@...il.com>
Subject: Re: [PATCH v2 1/4] mm: Introduce vm_uffd_ops API
* Peter Xu <peterx@...hat.com> [250916 16:05]:
> On Mon, Sep 08, 2025 at 12:53:37PM -0400, Liam R. Howlett wrote:
> > What we don't want is non-mm code specifying values for the function
> > pointer and doing what they want, or a function pointer that returns a
> > core mm resource (in the old example this was a vma, here it is a
> > folio).
> >
> > From this patch set:
> > + * Return: zero if succeeded, negative for errors.
> > + */
> > + int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
> > + struct folio **folio);
> >
> > This is one of the contention points in the current scenario as the
> > folio would be returned.
>
> OK, I didn't see this one previously. It partly answers one of my questions
> in the other reply, in a way I wished it hadn't.
>
> Could you elaborate why an API returning an folio pointer would be
> dangerous?
I did [1], and Lorenzo did [2].
This is an anti-pattern: a bad design that has gotten us into trouble
before, and one we don't need to repeat.
>
> OTOH, would you think alloc_pages() or folio_alloc() be dangerous too?
>
> They return a folio from the mm core to drivers, hence it's not the same
> direction of folio sharing, but it also means the driver can manipulate
> the folio / memmap as much as it wants, so sabotaging everything is
> similarly possible. Why do we worry about that?
Are you expecting the same number of memory types as drivers? I'm not
sure I want to live in that reality.
>
> Are we going to unexport alloc_pages() someday?
I guess, over a long enough timeline all functions will be unexported.
And this is exactly why a 'two phase' approach to 'revisit if necessary'
[3] is a problem. When we tried to remove the use of mm pointers in
drivers, we were stuck looking at hundreds of lines of code in a single
driver trying to figure out what was going on.
Seriously, I added locking, it introduced a regression, and so they
removed it [4]. It took years to get that driver to a more sensible
state, and I'm really happy that Carlos Llamas did all that work!
We regularly had to fight with people to stop caching a pointer
internally. You know why they needed the vma pointer? To modify the
vma flags, even though only a few modifications were supposed to be
allowed, and that spawned years of cleanup.
And you're asking us to do it again.
Why can't we use an enum to figure out what to do [5], with one of the
values covering the new functionality for guest_memfd?
There are many ways to do this with a limited amount of code living in
mm that are safer, more maintainable, and more testable than handing
out pointers with locking hell [6] to anyone with a source file.
Thanks,
Liam
[1]. https://lore.kernel.org/linux-mm/5ixvg4tnwj53ixh2fx26dxlctgqtayydqryqxhns6bwj3q3ies@6sttjti5dxt7/
[2]. https://lore.kernel.org/linux-mm/982f4f94-f0bf-45dd-9003-081b76e57027@lucifer.local/
[3]. https://lore.kernel.org/linux-mm/e9235b88-e2be-4358-a6fb-507c5cad6fd9@lucifer.local/
[4]. https://lore.kernel.org/all/20220621140212.vpkio64idahetbyf@revolver/T/#m9d9c8911447e395a73448700d7f06a4366b5ae02
[5]. https://lore.kernel.org/linux-mm/54bb09fc-144b-4d61-8bc2-1eca4d278382@lucifer.local/
[6]. https://elixir.bootlin.com/linux/v6.16.7/source/mm/rmap.c#L21