[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFWLqOd7Kln67h1N@google.com>
Date: Fri, 20 Jun 2025 09:26:16 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Vishal Annapurve <vannapurve@...gle.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>, pbonzini@...hat.com, kvm@...r.kernel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
kai.huang@...el.com, reinette.chatre@...el.com, xiaoyao.li@...el.com,
tony.lindgren@...ux.intel.com, binbin.wu@...ux.intel.com,
isaku.yamahata@...el.com, linux-kernel@...r.kernel.org, yan.y.zhao@...el.com,
chao.gao@...el.com
Subject: Re: [PATCH V4 1/1] KVM: TDX: Add sub-ioctl KVM_TDX_TERMINATE_VM
On Fri, Jun 20, 2025, Vishal Annapurve wrote:
> On Fri, Jun 20, 2025 at 7:24 AM Sean Christopherson <seanjc@...gle.com> wrote:
> > > > The problem is that setting kvm->vm_dead will prevent (3) from succeeding. If
> > > > kvm->vm_dead is set, KVM will reject all vCPU, VM, and device (not /dev/kvm the
> > > > device, but rather devices bound to the VM) ioctls.
> > >
> > > (3) is "Close the older guest memfd handles -> results in older VM instance cleanup."
> > >
> > > close() is not an IOCTL, so I do not understand.
> >
> > Sorry, I misread that as "Close the older guest memfd handles by deleting the
> > memslots".
> >
> > > > I intended that behavior, e.g. to guard against userspace blowing up KVM because
> > > > the hkid was released, I just didn't consider the memslots angle.
> > >
> > > The patch was tested with QEMU which AFAICT does not touch memslots when
> > > shutting down. Is there a reason to?
> >
> > In this case, the VMM process is not shutting down. To emulate a reboot, the
> > VMM destroys the VM, but reuses the guest_memfd files for the "new" VM. Because
> > guest_memfd takes a reference to "struct kvm", through memslot bindings, memslots
>
> guest_memfd takes a reference on the "struct kvm" only on
> creation/linking, currently memslot binding doesn't add additional
> references.
Oh yeah, duh.
> Adrian's suggestion makes sense
+1. It should also be faster overall (hopefully notably faster?).
> and it should be functional but I am running into some issues which likely
> need to be resolved on the userspace side. I will keep this thread updated.
>
> Currently testing this reboot flow:
> 1) Issue KVM_TDX_TERMINATE_VM on the old VM.
> 2) Close the VM fd.
> 3) Create a new VM fd.
> 4) Link the old guest_memfd handles to the new VM fd.
> 5) Close the old guest_memfd handles.
> 6) Register memslots on the new VM using the linked guest_memfd handles.
>
> That being said, I still see the value in what Sean suggested.
> " Remove vm_dead and instead reject ioctls based on vm_bugged, and simply rely
> on KVM_REQ_VM_DEAD to prevent running the guest."
>
> This will help with:
> 1) Keeping the cleanup sequence as close as possible to the normal VM
> cleanup sequence.
> 2) Actual VM destruction happens at step 5 from the above mentioned
> flow, if there is any cleanup that happens asynchronously, userspace
> can enforce synchronous cleanup by executing graceful VM shutdown
> stages before step 2 above.
>
> And IIUC the goal here is to achieve exactly what Sean suggested above
> i.e. prevent running the guest after KVM_TDX_TERMINATE_VM is issued.
Ya, I still like the idea. But given that it's not needed for KVM_TDX_TERMINATE_VM,
it can and should be posted/landed separately.
Powered by blists - more mailing lists