linux-kernel - Re: [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZiZ3yDxEX+n6Za9b@yilunxu-OptiPlex-7050>
Date: Mon, 22 Apr 2024 22:44:24 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org, seanjc@...gle.com,
	michael.roth@....com, isaku.yamahata@...el.com
Subject: Re: [PATCH 09/11] KVM: guest_memfd: Add interface for populating
 gmem pages with user data

On Thu, Apr 04, 2024 at 02:50:31PM -0400, Paolo Bonzini wrote:
> During guest run-time, kvm_arch_gmem_prepare() is issued as needed to
> prepare newly-allocated gmem pages prior to mapping them into the guest.
> In the case of SEV-SNP, this mainly involves setting the pages to
> private in the RMP table.
> 
> However, for the GPA ranges comprising the initial guest payload, which
> are encrypted/measured prior to starting the guest, the gmem pages need
> to be accessed prior to setting them to private in the RMP table so they
> can be initialized with the userspace-provided data. Additionally, an
> SNP firmware call is needed afterward to encrypt them in-place and
> measure the contents into the guest's launch digest.
> 
> While it is possible to bypass the kvm_arch_gmem_prepare() hooks so that
> this handling can be done in an open-coded/vendor-specific manner, this
> may expose more gmem-internal state/dependencies to external callers
> than necessary. Try to avoid this by implementing an interface that
> tries to handle as much of the common functionality inside gmem as
> possible, while also making it generic enough to potentially be
> usable/extensible for TDX as well.
> 
> Suggested-by: Sean Christopherson <seanjc@...gle.com>
> Signed-off-by: Michael Roth <michael.roth@....com>
> Co-developed-by: Michael Roth <michael.roth@....com>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
>  include/linux/kvm_host.h | 26 ++++++++++++++
>  virt/kvm/guest_memfd.c   | 78 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 104 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 33ed3b884a6b..97d57ec59789 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2450,4 +2450,30 @@ int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_ord
>  bool kvm_arch_gmem_prepare_needed(struct kvm *kvm);
>  #endif
>  
> +/**
> + * kvm_gmem_populate() - Populate/prepare a GPA range with guest data
> + *
> + * @kvm: KVM instance
> + * @gfn: starting GFN to be populated
> + * @src: userspace-provided buffer containing data to copy into GFN range
> + *       (passed to @post_populate, and incremented on each iteration
> + *       if not NULL)
> + * @npages: number of pages to copy from userspace-buffer
> + * @post_populate: callback to issue for each gmem page that backs the GPA
> + *                 range
> + * @opaque: opaque data to pass to @post_populate callback
> + *
> + * This is primarily intended for cases where a gmem-backed GPA range needs
> + * to be initialized with userspace-provided data prior to being mapped into
> + * the guest as a private page. This should be called with the slots->lock
> + * held so that caller-enforced invariants regarding the expected memory
> + * attributes of the GPA range do not race with KVM_SET_MEMORY_ATTRIBUTES.
> + *
> + * Returns the number of pages that were populated.
> + */
> +long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages,
> +		       int (*post_populate)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
> +					    void __user *src, int order, void *opaque),
> +		       void *opaque);
> +
>  #endif
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 51c99667690a..e7de97382a67 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -602,3 +602,81 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>  	return r;
>  }
>  EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
> +
> +static int kvm_gmem_undo_get_pfn(struct file *file, struct kvm_memory_slot *slot,
> +				 gfn_t gfn, int order)
> +{
> +	pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff;
> +	struct kvm_gmem *gmem = file->private_data;
> +
> +	/*
> +	 * Races with kvm_gmem_unbind() must have been detected by
> +	 * __kvm_gmem_get_gfn(), because the invalidate_lock is
> +	 * taken between __kvm_gmem_get_gfn() and kvm_gmem_undo_get_pfn().
> +	 */
> +	if (WARN_ON_ONCE(xa_load(&gmem->bindings, index) != slot))
> +		return -EIO;
> +
> +	return __kvm_gmem_punch_hole(file_inode(file), index << PAGE_SHIFT, PAGE_SIZE << order);
> +}
> +
> +long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages,
> +		       int (*post_populate)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
> +					    void __user *src, int order, void *opaque),
> +		       void *opaque)
> +{
> +	struct file *file;
> +	struct kvm_memory_slot *slot;
> +
> +	int ret = 0, max_order;
> +	long i;
> +
> +	lockdep_assert_held(&kvm->slots_lock);
> +	if (npages < 0)
> +		return -EINVAL;
> +
> +	slot = gfn_to_memslot(kvm, gfn);
> +	if (!kvm_slot_can_be_private(slot))
> +		return -EINVAL;
> +
> +	file = kvm_gmem_get_file(slot);
> +	if (!file)
> +		return -EFAULT;
> +
> +	filemap_invalidate_lock(file->f_mapping);
> +
> +	npages = min_t(ulong, slot->npages - (gfn - slot->base_gfn), npages);
> +	for (i = 0; i < npages; i += (1 << max_order)) {
> +		gfn_t this_gfn = gfn + i;
> +		kvm_pfn_t pfn;
> +
> +		ret = __kvm_gmem_get_pfn(file, slot, this_gfn, &pfn, &max_order, false);
> +		if (ret)
> +			break;
> +
> +		if (!IS_ALIGNED(this_gfn, (1 << max_order)) ||
> +		    (npages - i) < (1 << max_order))
> +			max_order = 0;
> +
> +		if (post_populate) {
> +			void __user *p = src ? src + i * PAGE_SIZE : NULL;
> +			ret = post_populate(kvm, this_gfn, pfn, p, max_order, opaque);

I don't see the main difference between gmem_prepare() and post_populate()
from gmem's point of view.  They are all vendor callbacks invoked after
__filemap_get_folio(). Is it possible gmem choose to call gmem_prepare()
or post_populate() outside __kvm_gmem_get_pfn()? Or even pass all
parameters to a single gmem_prepare() and let vendor code decide what to
do.

> +		}
> +
> +		put_page(pfn_to_page(pfn));
> +		if (ret) {
> +			/*
> +			 * Punch a hole so that FGP_CREAT_ONLY can succeed
> +			 * again.
> +			 */
> +			kvm_gmem_undo_get_pfn(file, slot, this_gfn, max_order);
> +			break;
> +		}
> +	}
> +
> +	filemap_invalidate_unlock(file->f_mapping);
> +
> +	fput(file);
> +	return ret && !i ? ret : i;
> +}
> +EXPORT_SYMBOL_GPL(kvm_gmem_populate);
> -- 
> 2.43.0
> 
> 
>