Message-ID: <aCqsDW6bDlM6yOtP@yzhao56-desk.sh.intel.com>
Date: Mon, 19 May 2025 11:57:01 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "Li, Xiaoyao"
<xiaoyao.li@...el.com>, "quic_eberman@...cinc.com"
<quic_eberman@...cinc.com>, "Hansen, Dave" <dave.hansen@...el.com>,
"david@...hat.com" <david@...hat.com>, "Li, Zhiquan1"
<zhiquan1.li@...el.com>, "tabba@...gle.com" <tabba@...gle.com>,
"vbabka@...e.cz" <vbabka@...e.cz>, "thomas.lendacky@....com"
<thomas.lendacky@....com>, "michael.roth@....com" <michael.roth@....com>,
"seanjc@...gle.com" <seanjc@...gle.com>, "Weiny, Ira" <ira.weiny@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "ackerleytng@...gle.com"
<ackerleytng@...gle.com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "Peng, Chao P"
<chao.p.peng@...el.com>, "Du, Fan" <fan.du@...el.com>, "Annapurve, Vishal"
<vannapurve@...gle.com>, "jroedel@...e.de" <jroedel@...e.de>, "Miao, Jun"
<jun.miao@...el.com>, "Shutemov, Kirill" <kirill.shutemov@...el.com>,
"pgonda@...gle.com" <pgonda@...gle.com>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH 10/21] KVM: x86/mmu: Disallow page merging (huge page
adjustment) for mirror root
On Sat, May 17, 2025 at 01:50:53AM +0800, Edgecombe, Rick P wrote:
> On Fri, 2025-05-16 at 12:01 +0800, Yan Zhao wrote:
> > > Maybe we should rename nx_huge_page_workaround_enabled to something more
> > > generic and do the is_mirror logic in kvm_mmu_do_page_fault() when setting
> > > it. It should shrink the diff and centralize the logic.
> > Hmm, I'm reluctant to rename nx_huge_page_workaround_enabled, because
> >
> > (1) Invoking disallowed_hugepage_adjust() for mirror root is to disable page
> > promotion for TDX private memory, so is only applied to TDP MMU.
> > (2) nx_huge_page_workaround_enabled is used specifically for nx huge pages.
> > fault->huge_page_disallowed = fault->exec &&
> >                               fault->nx_huge_page_workaround_enabled;
>
> Oh, good point.
>
> >
> > if (fault->huge_page_disallowed)
> > account_nx_huge_page(vcpu->kvm, sp,
> > fault->req_level >= it.level);
> >
> > sp->nx_huge_page_disallowed = fault->huge_page_disallowed.
> >
> > Affecting fault->huge_page_disallowed would impact
> > sp->nx_huge_page_disallowed as well and would disable huge pages entirely.
> >
> > So, we still need to keep nx_huge_page_workaround_enabled.
> >
> > If we introduce a new flag fault->disable_hugepage_adjust, and set it in
> > kvm_mmu_do_page_fault(), we would also need to invoke
> > tdp_mmu_get_root_for_fault() there as well.
> >
> > Checking for mirror root for non-TDX VMs is not necessary, and the invocation
> > of tdp_mmu_get_root_for_fault() seems redundant with the one in
> > kvm_tdp_mmu_map().
>
> Also true. What about a wrapper for MMU code to check instead of
> fault->nx_huge_page_workaround_enabled then?
Like below?
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1b2bacde009f..0e4a03f44036 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1275,6 +1275,11 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 	return 0;
 }

+static inline bool is_fault_disallow_huge_page_adjust(struct kvm_page_fault *fault, bool is_mirror)
+{
+	return fault->nx_huge_page_workaround_enabled || is_mirror;
+}
+
 /*
  * Handle a TDP page fault (NPT/EPT violation/misconfiguration) by installing
  * page tables and SPTEs to translate the faulting guest physical address.
@@ -1297,7 +1302,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	for_each_tdp_pte(iter, kvm, root, fault->gfn, fault->gfn + 1) {
 		int r;

-		if (fault->nx_huge_page_workaround_enabled || is_mirror)
+		if (is_fault_disallow_huge_page_adjust(fault, is_mirror))
 			disallowed_hugepage_adjust(fault, iter.old_spte, iter.level, is_mirror);

 		/*
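For illustration only, here is a stand-alone sketch of why the wrapper in the
diff above keeps the mirror check away from the NX accounting discussed earlier.
It is not kernel code: struct toy_fault and both toy_* helpers are hypothetical
stand-ins that model only the three booleans involved, and the assignment is
copied from the one quoted earlier in this thread.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down model of struct kvm_page_fault. */
struct toy_fault {
	bool exec;
	bool nx_huge_page_workaround_enabled;
	bool huge_page_disallowed;
};

/* Same expression as the assignment quoted in the thread: NX-only. */
static void toy_set_huge_page_disallowed(struct toy_fault *fault)
{
	fault->huge_page_disallowed = fault->exec &&
				      fault->nx_huge_page_workaround_enabled;
}

/* Same predicate as the proposed wrapper: gates the adjust call only. */
static bool toy_disallow_adjust(const struct toy_fault *fault, bool is_mirror)
{
	return fault->nx_huge_page_workaround_enabled || is_mirror;
}
```

In this model a mirror-root fault with the NX workaround off still takes the
adjust path, while huge_page_disallowed (and hence anything derived from it,
such as sp->nx_huge_page_disallowed) stays false.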
> Also, why not check is_mirror_sp() in disallowed_hugepage_adjust() instead of
> passing in an is_mirror arg?
It's an optimization.
Since is_mirror_sptep(iter->sptep) == is_mirror_sp(root) for every SPTE under a
given root, passing in is_mirror avoids re-checking the mirror role for each sp;
the result cannot change during the walk of a single root.
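A toy model of the two options, in case it helps. This is a sketch under
assumptions, not the kernel implementation: struct toy_sp, toy_is_mirror_sp()
and the mirror_checks counter are made up for the example, and the only
property modeled is that every sp under one root has the same mirror role.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shadow page; only the mirror role is modeled. */
struct toy_sp {
	bool is_mirror;		/* identical for every sp under one root */
};

static int mirror_checks;	/* counts role lookups, for illustration */

static bool toy_is_mirror_sp(const struct toy_sp *sp)
{
	mirror_checks++;
	return sp->is_mirror;
}

/* Option A: look up the role for every sp visited during the walk. */
static int walk_check_each(const struct toy_sp *sps, size_t n, bool nx_workaround)
{
	int adjusted = 0;

	for (size_t i = 0; i < n; i++)
		if (nx_workaround || toy_is_mirror_sp(&sps[i]))
			adjusted++;
	return adjusted;
}

/* Option B: look up the role once on the root and reuse it for n levels. */
static int walk_check_root(const struct toy_sp *root, size_t n, bool nx_workaround)
{
	bool is_mirror = toy_is_mirror_sp(root);
	int adjusted = 0;

	for (size_t i = 0; i < n; i++)
		if (nx_workaround || is_mirror)
			adjusted++;
	return adjusted;
}
```

Both variants make the same decisions; option B just does a single role lookup
per walk instead of one per level.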
> There must be a way to have it fit in better with disallowed_hugepage_adjust()
> without adding so much open coded boolean logic.