[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZyiPUdNVOUvasvn2@yzhao56-desk.sh.intel.com>
Date: Mon, 4 Nov 2024 17:09:37 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Binbin Wu <binbin.wu@...ux.intel.com>
CC: Rick Edgecombe <rick.p.edgecombe@...el.com>, <seanjc@...gle.com>,
<pbonzini@...hat.com>, <kvm@...r.kernel.org>, <kai.huang@...el.com>,
<dmatlack@...gle.com>, <isaku.yamahata@...il.com>, <nik.borisov@...e.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 14/21] KVM: TDX: Implement hooks to propagate changes of
TDP MMU mirror page table
On Wed, Oct 30, 2024 at 11:03:39AM +0800, Binbin Wu wrote:
> On 9/4/2024 11:07 AM, Rick Edgecombe wrote:
> > From: Isaku Yamahata <isaku.yamahata@...el.com>
> >
> [...]
> > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> > index 6feb3ab96926..b8cd5a629a80 100644
> > --- a/arch/x86/kvm/vmx/tdx.c
> > +++ b/arch/x86/kvm/vmx/tdx.c
> > @@ -447,6 +447,177 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
> > td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa);
> > }
> > +static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn)
> > +{
> > + struct page *page = pfn_to_page(pfn);
> > +
> > + put_page(page);
> Nit: It can be
> put_page(pfn_to_page(pfn));
>
Yes, thanks.
>
> > +}
> > +
> > +static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
> > + enum pg_level level, kvm_pfn_t pfn)
> > +{
> > + int tdx_level = pg_level_to_tdx_sept_level(level);
> > + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > + hpa_t hpa = pfn_to_hpa(pfn);
> > + gpa_t gpa = gfn_to_gpa(gfn);
> > + u64 entry, level_state;
> > + u64 err;
> > +
> > + err = tdh_mem_page_aug(kvm_tdx, gpa, hpa, &entry, &level_state);
> Nit:
> Usually, kernel prefers to handle and return for error conditions first.
>
> But for this case, for all error conditions, it needs to unpin the page.
> Is it better to return the successful case first, so that it only needs
> to call tdx_unpin() once?
>
> > + if (unlikely(err == TDX_ERROR_SEPT_BUSY)) {
> > + tdx_unpin(kvm, pfn);
> > + return -EAGAIN;
> > + }
As how to handle the busy is not decided up to now, I prefer to leave this hunk
as is.
> > + if (unlikely(err == (TDX_EPT_ENTRY_STATE_INCORRECT | TDX_OPERAND_ID_RCX))) {
> > + if (tdx_get_sept_level(level_state) == tdx_level &&
> > + tdx_get_sept_state(level_state) == TDX_SEPT_PENDING &&
> > + is_last_spte(entry, level) &&
> > + spte_to_pfn(entry) == pfn &&
> > + entry & VMX_EPT_SUPPRESS_VE_BIT) {
> Can this condition be triggered?
> For contention from multiple vCPUs, the winner has frozen the SPTE,
> it shouldn't trigger this.
> Could KVM do page aug for a same page multiple times somehow?
This condition should not be triggered due to the BUG_ON in
set_external_spte_present().
With Isaku's series [1], this condition will not happen either.
Will remove it. Thanks!
[1] https://lore.kernel.org/all/cover.1728718232.git.isaku.yamahata@intel.com/
>
> > + tdx_unpin(kvm, pfn);
> > + return -EAGAIN;
> > + }
> > + }
> > + if (KVM_BUG_ON(err, kvm)) {
> > + pr_tdx_error_2(TDH_MEM_PAGE_AUG, err, entry, level_state);
> > + tdx_unpin(kvm, pfn);
> > + return -EIO;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
> > + enum pg_level level, kvm_pfn_t pfn)
> > +{
> > + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
> > +
> > + /* TODO: handle large pages. */
> > + if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
> > + return -EINVAL;
> > +
> > + /*
> > + * Because guest_memfd doesn't support page migration with
> > + * a_ops->migrate_folio (yet), no callback is triggered for KVM on page
> > + * migration. Until guest_memfd supports page migration, prevent page
> > + * migration.
> > + * TODO: Once guest_memfd introduces callback on page migration,
> > + * implement it and remove get_page/put_page().
> > + */
> > + get_page(pfn_to_page(pfn));
> > +
> > + if (likely(is_td_finalized(kvm_tdx)))
> > + return tdx_mem_page_aug(kvm, gfn, level, pfn);
> > +
> > + /*
> > + * TODO: KVM_MAP_MEMORY support to populate before finalize comes
> > + * here for the initial memory.
> > + */
> > + return 0;
> Is it better to return error before adding the support?
Hmm, returning error is better, though returning 0 is not wrong.
The future tdx_mem_page_record_premap_cnt() just increases
kvm_tdx->nr_premapped, while tdx_vcpu_init_mem_region() can be implemented
without checking kvm_tdx->nr_premapped.
> > +}
> > +
> [...]
>
>
Powered by blists - more mailing lists