linux-kernel - Re: [PATCH v2 01/10] KVM: Document KVM_MAP

Message-ID: <20240415234705.GV3039520@ls.amr.corp.intel.com>
Date: Mon, 15 Apr 2024 16:47:05 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"Yamahata, Isaku" <isaku.yamahata@...el.com>,
	"seanjc@...gle.com" <seanjc@...gle.com>,
	"Huang, Kai" <kai.huang@...el.com>,
	"federico.parola@...ito.it" <federico.parola@...ito.it>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>,
	"dmatlack@...gle.com" <dmatlack@...gle.com>,
	"michael.roth@....com" <michael.roth@....com>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>,
	isaku.yamahata@...ux.intel.com
Subject: Re: [PATCH v2 01/10] KVM: Document KVM_MAP_MEMORY ioctl

On Mon, Apr 15, 2024 at 11:27:20PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com> wrote:

> Nits only...
> 
> On Wed, 2024-04-10 at 15:07 -0700, isaku.yamahata@...el.com wrote:
> > From: Isaku Yamahata <isaku.yamahata@...el.com>
> > 
> > Adds documentation of KVM_MAP_MEMORY ioctl. [1]
> > 
> > It populates guest memory.  It doesn't do extra operations on the
> > underlying technology-specific initialization [2].  For example,
> > CoCo-related operations won't be performed.  Concretely for TDX, this API
> > won't invoke TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs
> > are required for such operations.
> > 
> > The key point is to adapt of vcpu ioctl instead of VM ioctl.
> 
> Not sure what you are trying to say here.
> 
> >   First,
> > populating guest memory requires vcpu.  If it is VM ioctl, we need to pick
> > one vcpu somehow.  Secondly, vcpu ioctl allows each vcpu to invoke this
> > ioctl in parallel.  It helps to scale regarding guest memory size, e.g.,
> > hundreds of GB.
> 
> I guess you are explaining why this is a vCPU ioctl instead of a KVM ioctl. Is
> this clearer:

Right, I wanted to explain why I chose a vCPU ioctl.  I'll update the commit
message.


> Although the operation is sort of a VM operation, make the ioctl a vCPU ioctl
> instead of KVM ioctl. Do this because a vCPU is needed internally for the fault
> path anyway, and because... (I don't follow the second point).
> 
> > 
> > [1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
> > [2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/
> > 
> > Suggested-by: Sean Christopherson <seanjc@...gle.com>
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> > ---
> > v2:
> > - Make flags reserved for future use. (Sean, Michael)
> > - Clarified the supposed use case. (Kai)
> > - Dropped source member of struct kvm_memory_mapping. (Michael)
> > - Change the unit from pages to bytes. (Michael)
> > ---
> >  Documentation/virt/kvm/api.rst | 52 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 52 insertions(+)
> > 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index f0b76ff5030d..6ee3d2b51a2b 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6352,6 +6352,58 @@ a single guest_memfd file, but the bound ranges must not overlap).
> >  
> >  See KVM_SET_USER_MEMORY_REGION2 for additional details.
> >  
> > +4.143 KVM_MAP_MEMORY
> > +------------------------
> > +
> > +:Capability: KVM_CAP_MAP_MEMORY
> > +:Architectures: none
> > +:Type: vcpu ioctl
> > +:Parameters: struct kvm_memory_mapping (in/out)
> > +:Returns: 0 on success, < 0 on error
> > +
> > +Errors:
> > +
> > +  ========== =============================================================
> > +  EINVAL     invalid parameters
> > +  EAGAIN     The region is only processed partially.  The caller should
> > +             issue the ioctl with the updated parameters when `size` > 0.
> > +  EINTR      An unmasked signal is pending.  The region may be processed
> > +             partially.
> > +  EFAULT     The parameter address was invalid.  The specified region
> > +             `base_address` and `size` was invalid.  The region isn't
> > +             covered by KVM memory slot.
> > +  EOPNOTSUPP The architecture doesn't support this operation. The x86 two
> > +             dimensional paging supports this API.  the x86 kvm shadow mmu
> > +             doesn't support it.  The other arch KVM doesn't support it.
> > +  ========== =============================================================
> > +
> > +::
> > +
> > +  struct kvm_memory_mapping {
> > +       __u64 base_address;
> > +       __u64 size;
> > +       __u64 flags;
> > +  };
> > +
> > +KVM_MAP_MEMORY populates guest memory with the range, `base_address` in (L1)
> > +guest physical address(GPA) and `size` in bytes.  `flags` must be zero.  It's
> > +reserved for future use.  When the ioctl returns, the input values are updated
> > +to point to the remaining range.  If `size` > 0 on return, the caller should
> > +issue the ioctl with the updated parameters.
> > +
> > +Multiple vcpus are allowed to call this ioctl simultaneously.  It's not
> > +mandatory for all vcpus to issue this ioctl.  A single vcpu can suffice.
> > +Multiple vcpus invocations are utilized for scalability to process the
> > +population in parallel.  If multiple vcpus call this ioctl in parallel, it may
> > +result in the error of EAGAIN due to race conditions.
> > +
> > +This population is restricted to the "pure" population without triggering
> > +underlying technology-specific initialization.  For example, CoCo-related
> > +operations won't perform.  In the case of TDX, this API won't invoke
> > +TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific uAPIs are required for
> > +such operations.
> 
> Probably don't want to have TDX bits in here yet. Since it's talking about what
> KVM_MAP_MEMORY is *not* doing, it can just be dropped.

Ok.  Will drop it.
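
For reference, the retry semantics in the quoted documentation (the kernel
updates `base_address` and `size` in place, and the caller reissues the call
while `size` > 0) could be driven from userspace roughly as in the sketch
below.  This is only an illustration under assumptions: map_memory_fn,
map_range, fake_state and fake_map_one_chunk are hypothetical names, the
callback stands in for ioctl(vcpu_fd, KVM_MAP_MEMORY, &m) so that no ioctl
number has to be guessed, and retrying on EINTR is a caller policy choice
rather than something the patch mandates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the quoted documentation. */
struct kvm_memory_mapping {
	uint64_t base_address;
	uint64_t size;
	uint64_t flags;
};

/*
 * Hypothetical stand-in for ioctl(vcpu_fd, KVM_MAP_MEMORY, &m).
 * Returns 0 on success, -1 with errno set on error, and updates *m
 * to describe the range that remains to be populated.
 */
typedef int (*map_memory_fn)(void *ctx, struct kvm_memory_mapping *m);

/*
 * Populate [gpa, gpa + size) by reissuing the call until nothing is left.
 * EAGAIN and EINTR are treated as "partially processed, try again"
 * (an assumption of this sketch); any other errno is returned negated.
 */
int map_range(map_memory_fn do_map, void *ctx, uint64_t gpa, uint64_t size)
{
	struct kvm_memory_mapping m = {
		.base_address = gpa,
		.size = size,
		.flags = 0,	/* must be zero, reserved for future use */
	};

	assert(do_map != NULL);

	while (m.size > 0) {
		if (do_map(ctx, &m) < 0 && errno != EAGAIN && errno != EINTR)
			return -errno;
	}
	return 0;
}

/* Fake backend for trying the loop without a KVM vCPU fd. */
struct fake_state {
	unsigned int calls;
};

/* Processes at most 4 KiB per call and reports EAGAIN while work remains. */
int fake_map_one_chunk(void *ctx, struct kvm_memory_mapping *m)
{
	struct fake_state *st = ctx;
	uint64_t step = m->size < 4096 ? m->size : 4096;

	st->calls++;
	if (m->flags != 0) {
		errno = EINVAL;
		return -1;
	}
	m->base_address += step;
	m->size -= step;
	if (m->size > 0) {
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

In a real VMM the callback would wrap the vCPU ioctl, and several vCPU threads
could each call map_range() on disjoint GPA ranges to populate in parallel.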
-- 
Isaku Yamahata <isaku.yamahata@...el.com>