Date: Mon, 22 Apr 2024 10:55:21 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	isaku.yamahata@...el.com, xiaoyao.li@...el.com,
	binbin.wu@...ux.intel.com, seanjc@...gle.com,
	rick.p.edgecombe@...el.com, isaku.yamahata@...ux.intel.com
Subject: Re: [PATCH 1/6] KVM: Document KVM_PRE_FAULT_MEMORY ioctl

On Fri, Apr 19, 2024 at 04:59:22AM -0400,
Paolo Bonzini <pbonzini@...hat.com> wrote:

> From: Isaku Yamahata <isaku.yamahata@...el.com>
> 
> Add documentation for the KVM_PRE_FAULT_MEMORY ioctl. [1]
> 
> The ioctl populates guest memory, but does not perform any extra,
> technology-specific initialization of the underlying memory [2].  For
> example, CoCo-related operations won't be performed; concretely for TDX,
> this API won't invoke TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().
> Vendor-specific APIs are required for such operations.
> 
> The key point is that this is a vcpu ioctl rather than a VM ioctl.
> First, populating guest memory requires a vcpu; with a VM ioctl, one
> vcpu would have to be picked somehow.  Second, a vcpu ioctl allows each
> vcpu to invoke the ioctl in parallel, which helps scale to large guest
> memory sizes, e.g. hundreds of GB.
> 
> [1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
> [2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/
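To make the scaling point concrete, a rough and untested sketch of how a
VMM could spread the prefault work across its vCPU threads might look
like the code below.  NR_VCPUS, vcpu_fds[] and the flat guest memory
layout are made up for illustration; KVM_PRE_FAULT_MEMORY and
struct kvm_pre_fault_memory are the ones added by this series.

#include <pthread.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>		/* needs headers with this series applied */

#define NR_VCPUS	8
#define GUEST_SIZE	(64ULL << 30)	/* guest RAM at GPA 0, size made up */
#define SLICE_SIZE	(GUEST_SIZE / NR_VCPUS)

extern int vcpu_fds[NR_VCPUS];		/* one fd per vCPU, created elsewhere */

/* Each vCPU thread pre-faults its own slice of the guest address space. */
static void *pre_fault_slice(void *arg)
{
	long idx = (long)arg;
	struct kvm_pre_fault_memory range = {
		.gpa	= idx * SLICE_SIZE,
		.size	= SLICE_SIZE,
		/* flags and padding stay zero */
	};

	ioctl(vcpu_fds[idx], KVM_PRE_FAULT_MEMORY, &range);
	return NULL;
}

static void pre_fault_all(void)
{
	pthread_t threads[NR_VCPUS];
	long i;

	for (i = 0; i < NR_VCPUS; i++)
		pthread_create(&threads[i], NULL, pre_fault_slice, (void *)i);
	for (i = 0; i < NR_VCPUS; i++)
		pthread_join(threads[i], NULL);
}

Since each vCPU thread works on its own disjoint slice, this maps
directly onto the "invoke in parallel" argument above.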
> 
> Suggested-by: Sean Christopherson <seanjc@...gle.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@...el.com>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
>  Documentation/virt/kvm/api.rst | 50 ++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index f0b76ff5030d..bbcaa5d2b54b 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6352,6 +6352,56 @@ a single guest_memfd file, but the bound ranges must not overlap).
>  
>  See KVM_SET_USER_MEMORY_REGION2 for additional details.
>  
> +4.143 KVM_PRE_FAULT_MEMORY
> +--------------------------
> +
> +:Capability: KVM_CAP_PRE_FAULT_MEMORY
> +:Architectures: none
> +:Type: vcpu ioctl
> +:Parameters: struct kvm_pre_fault_memory (in/out)
> +:Returns: 0 on success, < 0 on error
> +
> +Errors:
> +
> +  ========== ===============================================================
> +  EINVAL     The specified `gpa` and `size` were invalid (e.g. not
> +             page aligned).
> +  ENOENT     The specified `gpa` is outside defined memslots.
> +  EINTR      An unmasked signal is pending and no page was processed.
> +  EFAULT     The parameter address was invalid.
> +  EOPNOTSUPP Mapping memory for a GPA is unsupported by the
> +             hypervisor, and/or for the current vCPU state/mode.

Should EIO be documented here too?  Something like:

  EIO        An unexpected error occurred.

> +  ========== ===============================================================
> +
> +::
> +
> +  struct kvm_pre_fault_memory {
> +	/* in/out */
> +	__u64 gpa;
> +	__u64 size;
> +	/* in */
> +	__u64 flags;
> +	__u64 padding[5];
> +  };
> +
> +KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
> +for the current vCPU state.  KVM maps memory as if the vCPU generated a
> +stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
> +CoW.  However, KVM does not mark any newly created stage-2 PTE as Accessed.
> +
> +In some cases, multiple vCPUs share the same page tables; in that case,
> +the ioctl can be called on them in parallel.
> +
> +Shadow page tables cannot support this ioctl because they
> +are indexed by virtual address or nested guest physical address.
> +Calling this ioctl when the guest is using shadow page tables (for
> +example because it is running a nested guest with nested page tables)
> +will fail with `EOPNOTSUPP` even if `KVM_CHECK_EXTENSION` reports
> +the capability to be present.
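
For reference, a minimal and untested userspace call sequence could look
like the sketch below.  The error handling is only illustrative, and I'm
assuming `gpa`/`size` are updated on partial progress since they are
marked in/out:

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>		/* needs headers with this series applied */

/* Pre-fault [gpa, gpa + size) through one vCPU, retrying on signals. */
static int pre_fault_range(int vm_fd, int vcpu_fd, __u64 gpa, __u64 size)
{
	struct kvm_pre_fault_memory range;
	int ret;

	/* The capability only says the ioctl exists, see the caveat above. */
	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PRE_FAULT_MEMORY) <= 0)
		return -ENOTSUP;

	memset(&range, 0, sizeof(range));	/* flags and padding must be zero */
	range.gpa = gpa;
	range.size = size;

	do {
		ret = ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range);
		/*
		 * On EINTR, retry with the (possibly updated) gpa/size;
		 * EOPNOTSUPP can still be returned here, e.g. with shadow
		 * paging, and is passed back to the caller.
		 */
	} while (ret < 0 && errno == EINTR);

	return ret < 0 ? -errno : 0;
}

Checking KVM_CAP_PRE_FAULT_MEMORY first does not remove the need to
handle EOPNOTSUPP from the ioctl itself, per the shadow-paging caveat
above.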
> +
> +`flags` must currently be zero.

This should probably say "`flags` and `padding` must currently be zero".

> +
> +
>  5. The kvm_run structure
>  ========================
>  
> -- 
> 2.43.0
> 
> 
> 

-- 
Isaku Yamahata <isaku.yamahata@...el.com>
