lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 7 Mar 2024 17:28:20 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: David Matlack <dmatlack@...gle.com>
Cc: Kai Huang <kai.huang@...el.com>, Isaku Yamahata <isaku.yamahata@...ux.intel.com>, 
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, Isaku Yamahata <isaku.yamahata@...el.com>, 
	"federico.parola@...ito.it" <federico.parola@...ito.it>, "pbonzini@...hat.com" <pbonzini@...hat.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, 
	"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>, "michael.roth@....com" <michael.roth@....com>
Subject: Re: [RFC PATCH 1/8] KVM: Document KVM_MAP_MEMORY ioctl

On Thu, Mar 07, 2024, David Matlack wrote:
> On 2024-03-08 01:20 PM, Huang, Kai wrote:
> > > > > +:Parameters: struct kvm_memory_mapping(in/out)
> > > > > +:Returns: 0 on success, <0 on error
> > > > > +
> > > > > +KVM_MAP_MEMORY populates guest memory without running vcpu.
> > > > > +
> > > > > +::
> > > > > +
> > > > > +  struct kvm_memory_mapping {
> > > > > +	__u64 base_gfn;
> > > > > +	__u64 nr_pages;
> > > > > +	__u64 flags;
> > > > > +	__u64 source;
> > > > > +  };
> > > > > +
> > > > > +  /* For kvm_memory_mapping:: flags */
> > > > > +  #define KVM_MEMORY_MAPPING_FLAG_WRITE         _BITULL(0)
> > > > > +  #define KVM_MEMORY_MAPPING_FLAG_EXEC          _BITULL(1)
> > > > > +  #define KVM_MEMORY_MAPPING_FLAG_USER          _BITULL(2)
> > > > 
> > > > I am not sure what's the good of having "FLAG_USER"?
> > > > 
> > > > This ioctl is called from userspace, thus I think we can just treat this always
> > > > as user-fault?
> > > 
> > > The point is how to emulate kvm page fault as if vcpu caused the kvm page
> > > fault.  Not we call the ioctl as user context.
> > 
> > Sorry I don't quite follow.  What's wrong if KVM just append the #PF USER
> > error bit before it calls into the fault handler?
> > 
> > My question is, since this is ABI, you have to tell how userspace is
> > supposed to use this.  Maybe I am missing something, but I don't see how
> > USER should be used here.
> 
> If we restrict this API to the TDP MMU then KVM_MEMORY_MAPPING_FLAG_USER
> is meaningless, PFERR_USER_MASK is only relevant for shadow paging.

+1

> KVM_MEMORY_MAPPING_FLAG_WRITE seems useful to allow memslots to be
> populated with writes (which avoids just faulting in the zero-page for
> anon or tmpfs backed memslots), while also allowing populating read-only
> memslots.
> 
> I don't really see a use-case for KVM_MEMORY_MAPPING_FLAG_EXEC.

It would midly be interesting for something like the NX hugepage mitigation.

For the initial implementation, I don't think the ioctl() should specify
protections, period.

VMA-based mappings, i.e. !guest_memfd, already have a way to specify protections.
And for guest_memfd, finer grained control in general, and long term compatibility
with other features that are in-flight or proposed, I would rather userspace specify
RWX protections via KVM_SET_MEMORY_ATTRIBUTES.  Oh, and dirty logging would be a
pain too.

KVM doesn't currently support execute-only (XO) or !executable (RW), so I think
we can simply define KVM_MAP_MEMORY to behave like a read fault.  E.g. map RX,
and add W if all underlying protections allow it.

That way we can defer dealing with things like XO and RW *if* KVM ever does gain
support for specifying those combinations via KVM_SET_MEMORY_ATTRIBUTES, which
will likely be per-arch/vendor and non-trivial, e.g. AMD's NPT doesn't even allow
for XO memory.

And we shouldn't need to do anything for KVM_MAP_MEMORY in particular if
KVM_SET_MEMORY_ATTRIBUTES gains support for RWX protections the existing RWX and
RX combinations, e.g. if there's a use-case for write-protecting guest_memfd
regions.

We can always expand the uAPI, but taking away functionality is much harder, if
not impossible.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ