Message-ID: <aFVvDh7tTTXhX13f@google.com>
Date: Fri, 20 Jun 2025 07:24:14 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Adrian Hunter <adrian.hunter@...el.com>
Cc: Vishal Annapurve <vannapurve@...gle.com>, pbonzini@...hat.com, kvm@...r.kernel.org, 
	rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com, 
	kai.huang@...el.com, reinette.chatre@...el.com, xiaoyao.li@...el.com, 
	tony.lindgren@...ux.intel.com, binbin.wu@...ux.intel.com, 
	isaku.yamahata@...el.com, linux-kernel@...r.kernel.org, yan.y.zhao@...el.com, 
	chao.gao@...el.com
Subject: Re: [PATCH V4 1/1] KVM: TDX: Add sub-ioctl KVM_TDX_TERMINATE_VM

On Thu, Jun 19, 2025, Adrian Hunter wrote:
> On 19/06/2025 03:33, Sean Christopherson wrote:
> > On Wed, Jun 18, 2025, Adrian Hunter wrote:
> >> On 18/06/2025 09:00, Vishal Annapurve wrote:
> >>> On Tue, Jun 17, 2025 at 10:50 PM Adrian Hunter <adrian.hunter@...el.com> wrote:
> >>>>> Ability to clean up memslots from userspace without closing
> >>>>> VM/guest_memfd handles is useful to keep reusing the same guest_memfds
> >>>>> for the next boot iteration of the VM in case of reboot.
> >>>>
> >>>> TD lifecycle does not include reboot.  In other words, reboot is
> >>>> done by shutting down the TD and then starting again with a new TD.
> >>>>
> >>>> AFAIK it is not currently possible to shut down without closing
> >>>> guest_memfds since the guest_memfd holds a reference (users_count)
> >>>> to struct kvm, and destruction begins when users_count hits zero.
> >>>>
> >>>
> >>> gmem link support[1] allows associating existing guest_memfds with new
> >>> VM instances.
> >>>
> >>> Breakdown of the userspace VMM flow:
> >>> 1) Create a new VM instance before closing guest_memfd files.
> >>> 2) Link existing guest_memfd files with the new VM instance. -> This
> >>> creates new set of files backed by the same inode but associated with
> >>> the new VM instance.
> >>
> >> So what about:
> >>
> >> 2.5) Call KVM_TDX_TERMINATE_VM IOCTL
> >>
> >> Memory reclaimed after KVM_TDX_TERMINATE_VM will be done efficiently,
> >> so avoid causing it to be reclaimed earlier.
> > 
> > The problem is that setting kvm->vm_dead will prevent (3) from succeeding.  If
> > kvm->vm_dead is set, KVM will reject all vCPU, VM, and device (not /dev/kvm the
> > device, but rather devices bound to the VM) ioctls.
> 
> (3) is "Close the older guest memfd handles -> results in older VM instance cleanup."
> 
> close() is not an IOCTL, so I do not understand.

Sorry, I misread that as "Close the older guest memfd handles by deleting the
memslots".

> > I intended that behavior, e.g. to guard against userspace blowing up KVM because
> > the hkid was released, I just didn't consider the memslots angle.
> 
> The patch was tested with QEMU which AFAICT does not touch  memslots when
> shutting down.  Is there a reason to?

In this case, the VMM process is not shutting down.  To emulate a reboot, the
VMM destroys the VM, but reuses the guest_memfd files for the "new" VM.  Because
guest_memfd takes a reference to "struct kvm" through memslot bindings, memslots
need to be manually destroyed so that all references are put and the VM is freed
by the kernel.  Otherwise, e.g. multiple reboots would manifest as memory leaks
and eventually OOM the host.
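
To make the refcounting concrete, here is a toy userspace model of the
scenario above.  It is not KVM code: every toy_* name is made up for
illustration, and the only assumption it encodes is the one stated above,
i.e. that a memslot binding holds a reference to the VM and that deleting
the memslot puts that reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REBOOTS 16

/* Toy stand-in for "struct kvm"; only the reference count is modeled. */
struct toy_vm {
	int users_count;	/* loosely analogous to kvm->users_count */
	bool freed;
};

/* Toy stand-in for a memslot that binds a long-lived guest_memfd to a VM. */
struct toy_memslot {
	struct toy_vm *vm;	/* NULL once the memslot has been deleted */
};

static void toy_vm_get(struct toy_vm *vm)
{
	vm->users_count++;
}

static void toy_vm_put(struct toy_vm *vm)
{
	/* The VM is freed only when the last reference is put. */
	if (--vm->users_count == 0)
		vm->freed = true;
}

static void toy_vm_create(struct toy_vm *vm)
{
	vm->users_count = 1;	/* reference held by the VM file descriptor */
	vm->freed = false;
}

static void toy_bind_memslot(struct toy_memslot *slot, struct toy_vm *vm)
{
	toy_vm_get(vm);		/* the binding pins the VM */
	slot->vm = vm;
}

static void toy_delete_memslot(struct toy_memslot *slot)
{
	if (slot->vm) {
		toy_vm_put(slot->vm);
		slot->vm = NULL;
	}
}

/*
 * Emulate "reboots" reboots in which the guest_memfd file stays open.
 * Returns the number of VMs that were never freed.
 */
static int toy_count_leaked_vms(int reboots, bool delete_memslots)
{
	struct toy_vm vms[TOY_MAX_REBOOTS];
	struct toy_memslot slots[TOY_MAX_REBOOTS];
	int i, leaked = 0;

	if (reboots > TOY_MAX_REBOOTS)
		reboots = TOY_MAX_REBOOTS;

	for (i = 0; i < reboots; i++) {
		toy_vm_create(&vms[i]);
		toy_bind_memslot(&slots[i], &vms[i]);

		/* Tear down the "old" VM, but keep the guest_memfd open. */
		if (delete_memslots)
			toy_delete_memslot(&slots[i]);
		toy_vm_put(&vms[i]);	/* close the VM file descriptor */
	}

	for (i = 0; i < reboots; i++)
		leaked += !vms[i].freed;

	return leaked;
}
```

In this model, toy_count_leaked_vms(3, true) leaves nothing behind, whereas
toy_count_leaked_vms(3, false) leaves all three VMs pinned by their stale
memslot bindings.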
