Message-ID: <aNxxJodpbHceb3rF@google.com>
Date: Wed, 1 Oct 2025 00:09:10 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Fuad Tabba <tabba@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Yan Zhao <yan.y.zhao@...el.com>, Fuad Tabba <tabba@...gle.com>,
Binbin Wu <binbin.wu@...ux.intel.com>, Michael Roth <michael.roth@....com>,
Ira Weiny <ira.weiny@...el.com>, Rick P Edgecombe <rick.p.edgecombe@...el.com>,
Vishal Annapurve <vannapurve@...gle.com>, David Hildenbrand <david@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce
KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
On Tue, Sep 30, 2025, Sean Christopherson wrote:
> Trimmed Cc again.
And of course I forgot to actually Cc folks...
> On Thu, May 22, 2025, Sean Christopherson wrote:
> > On Thu, May 22, 2025, Fuad Tabba wrote:
> > > From a conceptual point of view, I understand that the in-place conversion is
> > > a property of guest_memfd. But that doesn't necessarily mean that the
> > > interface between kvm <-> guest_memfd is a userspace IOCTL.
>
> ...
>
> > A decent comparison is vCPUs. KVM _could_ route all ioctls through the VM, but
> > that's unpleasant for all parties, as it'd be cumbersome for userspace, and
> > unnecessarily complex and messy for KVM. Similarly, routing guest_memfd state
> > changes through KVM_SET_MEMORY_ATTRIBUTES is awkward from both design and mechanical
> > perspectives.
> >
> > Even if we disagree on how ugly/pretty routing conversions through kvm would be,
> > which I'll allow is subjective, the bigger problem is that bouncing through
> > KVM_SET_MEMORY_ATTRIBUTES would create an unholy mess of an ABI.
> >
> > Today, KVM_SET_MEMORY_ATTRIBUTES is handled entirely within kvm, and any changes
> > take effect irrespective of any memslot bindings. And that didn't happen by
> > chance; preserving and enforcing attribute changes independently of memslots was
> > a key design requirement, precisely because memslots are ephemeral to a certain
> > extent.
> >
> > Adding support for in-place guest_memfd conversion will require new ABI, and so
> > will be a "breaking" change for KVM_SET_MEMORY_ATTRIBUTES no matter what. E.g.
> > KVM will need to reject KVM_MEMORY_ATTRIBUTE_PRIVATE for VMs that elect to use
> > in-place guest_memfd conversions. But very critically, KVM can crisply enumerate
> > the lack of KVM_MEMORY_ATTRIBUTE_PRIVATE via KVM_CAP_MEMORY_ATTRIBUTES, the
> > behavior will be very straightforward to document (e.g. CAP X is mutually exclusive
> > with KVM_MEMORY_ATTRIBUTE_PRIVATE), and it will be opt-in, i.e. won't truly be a
> > breaking change.
> >
> > If/when we move shareability to guest_memfd, routing state changes through
> > KVM_SET_MEMORY_ATTRIBUTES will gain a subtle dependency on userspace having to
> > create memslots in order for state changes to take effect. That wrinkle would be
> > weird and annoying to document, e.g. "if CAP X is enabled, the ioctl ordering is
> > A => B => C, otherwise the ordering doesn't matter", and would create many more
> > conundrums:
> >
> >  - If a memslot needs to exist in order for KVM_SET_MEMORY_ATTRIBUTES to take effect,
> >    what should happen if that memslot is deleted?
> >  - If a memslot isn't found, should KVM_SET_MEMORY_ATTRIBUTES fail and report
> >    an error, or silently do nothing?
> >  - If KVM_SET_MEMORY_ATTRIBUTES affects multiple memslots that are bound to
> >    multiple guest_memfd instances, how does KVM guarantee atomicity? What happens
> >    if one guest_memfd conversion succeeds, but a later one fails?
>
> Note, my above objections were purely about routing updates through a VM, e.g. due
> to having to resolve memslots and whatnot. I.e. I swear I'm not contradicting
> myself by suggesting we reuse KVM_SET_MEMORY_ATTRIBUTES itself on the gmem file
> descriptor. I'm pretty sure past me didn't think at all about the actual uAPI,
> only the roles and responsibilities.
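
As a toy illustration of the last two conundrums in the quoted list (a missing
memslot, and a partial failure across multiple guest_memfd instances), the
following Python sketch models a conversion that is routed through the VM.
Everything in it is hypothetical: ToyGmem, ToyMemslot, vm_scoped_convert() and
the notion of "pinned" offsets are invented for this example and are not KVM
code or uAPI. It only shows why resolving memslots first makes the failure
semantics awkward.

```python
from dataclasses import dataclass, field


class ConversionError(Exception):
    """Raised when a toy guest_memfd refuses a conversion."""


@dataclass
class ToyGmem:
    """Hypothetical stand-in for one guest_memfd; tracks private page offsets."""
    name: str
    npages: int
    pinned: set = field(default_factory=set)   # offsets that cannot be converted
    private: set = field(default_factory=set)  # offsets currently private

    def convert(self, start, count, to_private):
        # Validate the whole range before mutating anything, so a conversion
        # on a single file is all-or-nothing in this model.
        if start < 0 or start + count > self.npages:
            raise ConversionError(f"{self.name}: range out of bounds")
        offsets = range(start, start + count)
        blocked = self.pinned.intersection(offsets)
        if blocked:
            raise ConversionError(f"{self.name}: offsets {sorted(blocked)} are busy")
        if to_private:
            self.private.update(offsets)
        else:
            self.private.difference_update(offsets)


@dataclass
class ToyMemslot:
    """Hypothetical binding of a guest frame range to an offset in a ToyGmem."""
    base_gfn: int
    npages: int
    gmem: ToyGmem
    gmem_offset: int = 0


def vm_scoped_convert(memslots, gfn, count, to_private):
    """Resolve memslots for [gfn, gfn + count) and convert each backing file.

    Returns the number of pages that were covered by some memslot.
    """
    covered = 0
    for slot in sorted(memslots, key=lambda s: s.base_gfn):
        lo = max(gfn, slot.base_gfn)
        hi = min(gfn + count, slot.base_gfn + slot.npages)
        if lo >= hi:
            continue  # this memslot does not overlap the requested range
        # If this call raises, files handled on earlier iterations stay converted.
        slot.gmem.convert(slot.gmem_offset + lo - slot.base_gfn, hi - lo, to_private)
        covered += hi - lo
    return covered


# Two files behind adjacent memslots; offset 1 of gmem-b is "busy".
a = ToyGmem("gmem-a", npages=4)
b = ToyGmem("gmem-b", npages=4, pinned={1})
slots = [ToyMemslot(0, 4, a), ToyMemslot(4, 4, b)]

try:
    vm_scoped_convert(slots, gfn=2, count=4, to_private=True)
except ConversionError as err:
    print("conversion failed:", err)

print(sorted(a.private))  # gmem-a was already converted when gmem-b failed
print(sorted(b.private))  # gmem-b is untouched
print(vm_scoped_convert([], gfn=0, count=4, to_private=True))  # no memslots
```

In this model the request spanning both files leaves gmem-a converted and
gmem-b unchanged, and a request issued with no memslots covers zero pages
without raising an error. Calling convert() on a single ToyGmem has neither
problem, which is the toy analogue of issuing the state change on the gmem
file descriptor itself.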