[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cda7c46b-c474-48f4-b703-e2f988470f3b@amazon.com>
Date: Wed, 2 Jul 2025 18:08:44 +0100
From: Nikita Kalyazin <kalyazin@...zon.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Suren Baghdasaryan
<surenb@...gle.com>
CC: Peter Xu <peterx@...hat.com>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, Vlastimil Babka <vbabka@...e.cz>, Muchun Song
<muchun.song@...ux.dev>, Mike Rapoport <rppt@...nel.org>, Hugh Dickins
<hughd@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>, James Houghton
<jthoughton@...gle.com>, "Liam R . Howlett" <Liam.Howlett@...cle.com>,
"Michal Hocko" <mhocko@...e.com>, David Hildenbrand <david@...hat.com>,
"Andrea Arcangeli" <aarcange@...hat.com>, Oscar Salvador <osalvador@...e.de>,
"Axel Rasmussen" <axelrasmussen@...gle.com>, Ujwal Kundur
<ujwal.kundur@...il.com>
Subject: Re: [PATCH v2 1/4] mm: Introduce vm_uffd_ops API
On 02/07/2025 16:56, Lorenzo Stoakes wrote:
> On Tue, Jul 01, 2025 at 10:04:28AM -0700, Suren Baghdasaryan wrote:
>> On Mon, Jun 30, 2025 at 3:16 AM Lorenzo Stoakes
>> <lorenzo.stoakes@...cle.com> wrote:
>>> This feels like you're trying to put mm functionality outside of mm?
>>
>> To second that, two things stick out for me here:
>> 1. uffd_copy and uffd_get_folio seem to be at different abstraction
>> levels. uffd_copy is almost the entire copy operation for VM_SHARED
>> VMAs while uffd_get_folio is a small part of the continue operation.
>> 2. shmem_mfill_atomic_pte which becomes uffd_copy for shmem in the
>> last patch is quite a complex function which itself calls some IMO
>> pretty internal functions like mfill_atomic_install_pte(). Expecting
>> modules to implement such functionality seems like a stretch to me but
>> maybe this is for some specialized modules which are written by mm
>> experts only?
>
> To echo what Liam said - I don't think we can truly rely on expertise here
> (we make enough mistakes in core mm for that to be a dubious proposition
> even tere :) and even if experts were involved, having core mm
> functionality outside of core mm carries significant risk - we are
> constantly changing things, including assumptions around sensitive topics
> such as locking (think VMA locking) - having code elsewhere significantly
> increases the risk of missing things.
>
> I am also absolutely, to be frank, not going to accept us EXPORT()'ing
> anything core.
>
> Page table manipulation really must rely in core mm and arch code only, it
> is easily some of the most subtle, confusing and dangerous code in mm (I
> have spent subtantial hours banging my head against it recently), and again
> - subject to constant change.
>
> But to come back to Liam's comments and to reiterate what I was referring
> to earlier, even permitting drivers to have access to VMAs is _highly_
> problematic and has resulted in very real bugs and subtle issues that took
> many hours, much stress + gnashing of teeth to adress.
The main target of this change is the implementation of UFFD for
KVM/guest_memfd (examples: [1], [2]) to avoid bringing KVM-specific code
into the mm codebase. We usually mean KVM by the "drivers" in this
context, and it is already somewhat "knowledgeable" of the mm. I don't
think there are existing use cases for other drivers to implement this
at the moment.
Although I can't see new exports in this series, there is now a way to
limit exports to particular modules [3]. Would it help if we only do it
for KVM initially (if/when actually needed)?
[1]
https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/
[2]
https://lore.kernel.org/all/7666ee96-6f09-4dc1-8cb2-002a2d2a29cf@amazon.com/
[3]
https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git/commit/?h=kbuild&id=707f853d7fa3ce323a6875487890c213e34d81a0
Thanks,
Nikita
>
> The very thing of:
>
> xxx
> <hand off sensitive mutable state X, Y, Z to driver>
> yyy
>
> Means that between xxx and yyy we can make literally no assumptions about
> what just happened to all handed off state. A single instance of this has
> caused mayhem, if we did this in such a way as to affect the _many_ uffd
> hooks we could have a realy serious problem.
>
> So - what seems really positive about this series is the _generalisation_
> and _abstraction_ of uffd functionality.
>
> That is something I appreciate and I think uffd sorely needs, in fact if we
> could find a way to not need to do:
>
> if (some_uffd_predicate())
> some_uffd_specific_fn();
>
> That'd be incredible.
>
> So I think the answer here is to do something like this, and to keep all
> the mm-specific code in core mm.
>
> Thanks, Lorenzo
Powered by blists - more mailing lists