[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2daf03ae-6b5a-44ae-806e-76d09fb5273b@linux.intel.com>
Date: Tue, 12 Mar 2024 21:33:31 +0800
From: Binbin Wu <binbin.wu@...ux.intel.com>
To: isaku.yamahata@...el.com
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
isaku.yamahata@...il.com, Paolo Bonzini <pbonzini@...hat.com>,
erdemaktas@...gle.com, Sean Christopherson <seanjc@...gle.com>,
Sagi Shahar <sagis@...gle.com>, Kai Huang <kai.huang@...el.com>,
chen.bo@...el.com, hang.yuan@...el.com, tina.zhang@...el.com
Subject: Re: [PATCH v19 011/130] KVM: Add new members to struct kvm_gfn_range
to operate on
On 2/26/2024 4:25 PM, isaku.yamahata@...el.com wrote:
> From: Isaku Yamahata <isaku.yamahata@...el.com>
>
> Add new members to strut kvm_gfn_range to indicate which mapping
> (private-vs-shared) to operate on. only_private and only_shared. Update
> mmu notifier, set memory attributes ioctl or KVM gmem callback to
> initialize them.
>
> It was premature for set_memory_attributes ioctl to call
> kvm_unmap_gfn_range(). Instead, let kvm_arch_ste_memory_attributes()
"kvm_arch_ste_memory_attributes()" -> "kvm_vm_set_mem_attributes()" ?
> handle it and add a new x86 vendor callback to react to memory attribute
> change. [1]
Which new x86 vendor callback?
>
> - If it's from the mmu notifier, zap shared pages only
> - If it's from the KVM gmem, zap private pages only
> - If setting memory attributes, vendor callback checks new attributes
> and make decisions.
> SNP would do nothing and handle it later with gmem callback
> TDX callback would do as follows.
> When it converts pages to shared, zap private pages only.
> When it converts pages to private, zap shared pages only.
>
> TDX needs to know which mapping to operate on. Shared-EPT vs. Secure-EPT.
> The following sequence to convert the GPA to private doesn't work for TDX
> because the page can already be private.
>
> 1) Update memory attributes to private in memory attributes xarray
> 2) Zap the GPA range irrespective of private-or-shared.
> Even if the page is already private, zap the entry.
> 3) EPT violation on the GPA
> 4) Populate the GPA as private
> The page is zeroed, and the guest has to accept the page again.
>
> In step 2, TDX wants to zap only shared pages and skip private ones.
>
> [1] https://lore.kernel.org/all/ZJX0hk+KpQP0KUyB@google.com/
>
> Suggested-by: Sean Christopherson <seanjc@...gle.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
>
> ---
> Changes v18:
> - rebased to kvm-next
>
> Changes v2 -> v3:
> - Drop the KVM_GFN_RANGE flags
> - Updated struct kvm_gfn_range
> - Change kvm_arch_set_memory_attributes() to return bool for flush
> - Added set_memory_attributes x86 op for vendor backends
> - Refined commit message to describe TDX care concretely
>
> Changes v1 -> v2:
> - consolidate KVM_GFN_RANGE_FLAGS_GMEM_{PUNCH_HOLE, RELEASE} into
> KVM_GFN_RANGE_FLAGS_GMEM.
> - Update the commit message to describe TDX more. Drop SEV_SNP.
>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> ---
> include/linux/kvm_host.h | 2 ++
> virt/kvm/guest_memfd.c | 3 +++
> virt/kvm/kvm_main.c | 17 +++++++++++++++++
> 3 files changed, 22 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 7e7fd25b09b3..0520cd8d03cc 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -264,6 +264,8 @@ struct kvm_gfn_range {
> gfn_t start;
> gfn_t end;
> union kvm_mmu_notifier_arg arg;
> + bool only_private;
> + bool only_shared;
IMO, an enum will be clearer than the two flags.
enum {
PROCESS_PRIVATE_AND_SHARED,
PROCESS_ONLY_PRIVATE,
PROCESS_ONLY_SHARED,
};
> bool may_block;
> };
> bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range);
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 0f4e0cf4f158..3830d50b9b67 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -64,6 +64,9 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
> .end = slot->base_gfn + min(pgoff + slot->npages, end) - pgoff,
> .slot = slot,
> .may_block = true,
> + /* guest memfd is relevant to only private mappings. */
> + .only_private = true,
> + .only_shared = false,
> };
>
> if (!found_memslot) {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 10bfc88a69f7..0349e1f241d1 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -634,6 +634,12 @@ static __always_inline kvm_mn_ret_t __kvm_handle_hva_range(struct kvm *kvm,
> */
> gfn_range.arg = range->arg;
> gfn_range.may_block = range->may_block;
> + /*
> + * HVA-based notifications aren't relevant to private
> + * mappings as they don't have a userspace mapping.
> + */
> + gfn_range.only_private = false;
> + gfn_range.only_shared = true;
>
> /*
> * {gfn(page) | page intersects with [hva_start, hva_end)} =
> @@ -2486,6 +2492,16 @@ static __always_inline void kvm_handle_gfn_range(struct kvm *kvm,
> gfn_range.arg = range->arg;
> gfn_range.may_block = range->may_block;
>
> + /*
> + * If/when KVM supports more attributes beyond private .vs shared, this
> + * _could_ set only_{private,shared} appropriately if the entire target
> + * range already has the desired private vs. shared state (it's unclear
> + * if that is a net win). For now, KVM reaches this point if and only
> + * if the private flag is being toggled, i.e. all mappings are in play.
> + */
> + gfn_range.only_private = false;
> + gfn_range.only_shared = false;
> +
> for (i = 0; i < kvm_arch_nr_memslot_as_ids(kvm); i++) {
> slots = __kvm_memslots(kvm, i);
>
> @@ -2542,6 +2558,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> struct kvm_mmu_notifier_range pre_set_range = {
> .start = start,
> .end = end,
> + .arg.attributes = attributes,
> .handler = kvm_pre_set_memory_attributes,
> .on_lock = kvm_mmu_invalidate_begin,
> .flush_on_ret = true,
Powered by blists - more mailing lists