Message-ID: <CAGtprH_cpbPLvW2rSc2o7BsYWYZKNR6QAEsA4X-X77=2A7s=yg@mail.gmail.com>
Date: Sun, 15 Jun 2025 20:40:18 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: Adrian Hunter <adrian.hunter@...el.com>
Cc: pbonzini@...hat.com, seanjc@...gle.com, kvm@...r.kernel.org, 
	rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com, 
	kai.huang@...el.com, reinette.chatre@...el.com, xiaoyao.li@...el.com, 
	tony.lindgren@...ux.intel.com, binbin.wu@...ux.intel.com, 
	isaku.yamahata@...el.com, linux-kernel@...r.kernel.org, yan.y.zhao@...el.com, 
	chao.gao@...el.com
Subject: Re: [PATCH V4 1/1] KVM: TDX: Add sub-ioctl KVM_TDX_TERMINATE_VM

On Wed, Jun 11, 2025 at 2:52 AM Adrian Hunter <adrian.hunter@...el.com> wrote:
>
> From: Sean Christopherson <seanjc@...gle.com>
>
> Add sub-ioctl KVM_TDX_TERMINATE_VM to release the HKID prior to shutdown,
> which enables more efficient reclaim of private memory.
>
> Private memory is removed from MMU/TDP when guest_memfds are closed. If
> the HKID has not been released, the TDX VM is still in RUNNABLE state,
> so pages must be removed using "Dynamic Page Removal" procedure (refer
> TDX Module Base spec) which involves a number of steps:
>         Block further address translation
>         Exit each VCPU
>         Clear Secure EPT entry
>         Flush/write-back/invalidate relevant caches
>
> However, when the HKID is released, the TDX VM moves to TD_TEARDOWN state
> where all TDX VM pages are effectively unmapped, so pages can be reclaimed
> directly.
>
> Reclaiming TD Pages in TD_TEARDOWN State was seen to decrease the total
> reclaim time.  For example:
>
>         VCPUs   Size (GB)       Before (secs)   After (secs)
>          4       18               72             24
>         32      107              517            134
>         64      400             5539            467
>
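The improvement in the table above can be expressed as a before/after
ratio. A minimal sketch, assuming only the figures quoted in the table;
the struct and function names are hypothetical and nothing here is
measured independently:

```c
#include <assert.h>
#include <stdio.h>

/* One row of the quoted table; values are copied from the commit message. */
struct reclaim_row {
	int vcpus;
	int size_gb;
	int before_secs;
	int after_secs;
};

static const struct reclaim_row rows[] = {
	{  4,  18,   72,  24 },
	{ 32, 107,  517, 134 },
	{ 64, 400, 5539, 467 },
};

/* Ratio of reclaim time without vs. with the HKID released up front. */
static double reclaim_speedup(const struct reclaim_row *r)
{
	assert(r->after_secs > 0);
	return (double)r->before_secs / (double)r->after_secs;
}

/* Print one line per table row with its before/after ratio. */
static void print_speedups(void)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%2d vCPUs, %3d GB: %.1fx\n", rows[i].vcpus,
		       rows[i].size_gb, reclaim_speedup(&rows[i]));
}
```

The ratio grows with VM size in these three samples, which fits the
per-page cost of the "Dynamic Page Removal" steps listed earlier.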
> Link: https://lore.kernel.org/r/Z-V0qyTn2bXdrPF7@google.com
> Link: https://lore.kernel.org/r/aAL4dT1pWG5dDDeo@google.com
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> Co-developed-by: Adrian Hunter <adrian.hunter@...el.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@...el.com>
> ---
>
>
> Changes in V4:
>
>         Drop TDX_FLUSHVP_NOT_DONE change.  It will be done separately.
>         Use KVM_BUG_ON() instead of WARN_ON().
>         Correct kvm_trylock_all_vcpus() return value.
>
> Changes in V3:
>
>         Remove KVM_BUG_ON() from tdx_mmu_release_hkid() because it would
>         trigger on the error path from __tdx_td_init()
>
>         Put cpus_read_lock() handling back into tdx_mmu_release_hkid()
>
>         Handle KVM_TDX_TERMINATE_VM in the switch statement, i.e. let
>         tdx_vm_ioctl() deal with kvm->lock
> ....
>
> +static int tdx_terminate_vm(struct kvm *kvm)
> +{
> +       if (kvm_trylock_all_vcpus(kvm))
> +               return -EBUSY;
> +
> +       kvm_vm_dead(kvm);

With this, no more VM ioctls can be issued on this instance. How would
the userspace VMM clean up the memslots? Is the expectation that the
guest_memfd and VM fds are closed to actually reclaim the memory?

The ability to clean up memslots from userspace without closing the
VM/guest_memfd handles is useful: on reboot, the same guest_memfds can
be reused for the next boot iteration of the VM.
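
To make the concern concrete, here is a toy userspace model, not kernel
code. All toy_* names are hypothetical. The -EIO return for VM ioctls on
a dead VM is an assumption about how the common VM ioctl path behaves,
and the memslot bookkeeping is reduced to a counter:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct kvm; only the fields this sketch needs. */
struct toy_vm {
	bool vm_dead;		/* set once the VM is marked dead */
	bool hkid_released;	/* set once the HKID has been released */
	int nr_memslots;	/* memslots still installed */
};

/* Models a VM ioctl that deletes one memslot. */
static int toy_delete_memslot(struct toy_vm *vm)
{
	assert(vm != NULL);

	/* Assumption: every VM ioctl is rejected once the VM is dead. */
	if (vm->vm_dead)
		return -EIO;
	if (vm->nr_memslots == 0)
		return -ENOENT;

	vm->nr_memslots--;
	return 0;
}

/* Models the ordering in the quoted tdx_terminate_vm(). */
static int toy_terminate_vm(struct toy_vm *vm)
{
	assert(vm != NULL);

	vm->vm_dead = true;		/* kvm_vm_dead() in the patch */
	vm->hkid_released = true;	/* tdx_mmu_release_hkid() in the patch */
	return 0;
}
```

In this model, any memslot still installed when toy_terminate_vm() runs
can no longer be removed through the VM fd; the only way left to drop it
is to close the fds.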

> +
> +       kvm_unlock_all_vcpus(kvm);
> +
> +       tdx_mmu_release_hkid(kvm);
> +
> +       return 0;
> +}
> +
