Message-ID: <08cb04f1408461d436eb48370888a066c4e03001.camel@intel.com>
Date: Wed, 17 Jul 2024 00:38:25 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "isaku.yamahata@...il.com"
<isaku.yamahata@...il.com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>
Subject: Re: [PATCH] KVM: x86/mmu: Allow per VM kvm_mmu_max_gfn()
On Tue, 2024-07-16 at 16:08 -0700, Sean Christopherson wrote:
>
> IMO, you're looking at it with too much of a TDX lens and not thinking about
> all the people that don't care about TDX, which is the majority of KVM
> developers.
Well, I don't mean to. Actually trying to do the opposite.

I don't see how a per-VM max gfn makes it easier or harder to look at things
from the normal VM perspective. I guess you'd have to figure out what set
kvm->mmu_max_gfn.
>
> The unaliased GFN is definitely not the max GFN of all the VM's MMUs, since
> the shared EPT must be able to process GPAs with bits set above the "max"
> GFN. And to me, _that's_ far more weird than saying that "S-EPT MMUs never
> set the shared bit, and shared EPT MMUs never clear the shared bit". I'm
> guessing the S-EPT support ORs in the shared bit, but it's still a GFN.
In the current solution GFNs always have the shared bit stripped. It gets added
back within the TDP MMU iterator. So from a regular VM reader's perspective,
the TDP MMU behaves pretty much like normal for TDX direct (shared) roots, that
is, memslot GFNs get mapped at the same TDP GFNs. The iterator handles writing
the PTE for the memslot GFN at the special position in the EPT tables (gfn |
shared_bit).
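The strip/re-apply scheme described above can be sketched in a few lines. To be clear, everything below (the bit position, the helper names) is made up for the sake of illustration, not the actual series' code:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Illustrative only: the real shared-bit position comes from the TDX
 * module at runtime, not a fixed constant. */
#define TDX_SHARED_BIT	((gfn_t)1 << 51)

/* GFNs tracked by KVM (memslots, TDP MMU callers) always have the
 * shared bit stripped... */
static gfn_t gfn_strip_shared(gfn_t gpa_gfn)
{
	return gpa_gfn & ~TDX_SHARED_BIT;
}

/* ...and the iterator re-applies it when computing the position in the
 * EPT tables for a direct (shared) root. */
static gfn_t gfn_to_ept_pos(gfn_t gfn, int direct_root)
{
	return direct_root ? (gfn | TDX_SHARED_BIT) : gfn;
}
```

So a memslot GFN maps at the same TDP GFN either way; only the final EPT position differs for a direct root.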
>
> If you were adding a per-MMU max GFN, then I'd buy that it legitimately is
> the max GFN, but why not have a full GFN range for the MMU? E.g.
>
> static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
>                                bool shared, int zap_level)
> {
>         struct tdp_iter iter;
>
>         gfn_t end = tdp_mmu_max_gfn_exclusive(root);
>         gfn_t start = tdp_mmu_min_gfn_inclusive(root);
>
> and then have the helpers incorporate the S-EPT vs. EPT information. That
> gets us optimized, precise zapping without needing to muddy the waters by
> tracking a per-VM "max" GFN that is only kinda sorta the max if you close
> your eyes and don't think too hard about the shared MMU usage.
This is similar to what we had before with the kvm_gfn_for_root(). Have you
looked at a recent version? Here, this recent one has some discussion on it:
https://lore.kernel.org/kvm/20240530210714.364118-7-rick.p.edgecombe@intel.com/#t
Pushing the shared bit re-application into the TDP MMU iterator pretty nicely
hides a lot of the TDX-specific behavior. It wraps up the TDX bits so that
other developers working on the various operations don't have to think about
it.
>
> > My inclination was to try to reduce the places where TDX MMU needs paths
> > happen to work for subtle reasons for the cost of the VM field.
>
> But it doesn't happen to work for subtle reasons. It works because it has to
> work. Processing !PRESENT SPTEs should always work, regardless of why KVM can
> guarantee there are no SPTEs in a given GFN range.
The current code (without this patch) doesn't zap the whole range that is
mappable by the EPT for the shared root case. It covers the whole range in the
private case, and only the range that is expected to be mapped in the shared
case. So is this good or bad? I think you are saying bad.

With this patch, it also doesn't zap the whole range mappable by the EPT, but
it does so in a consistent way.
I think you are saying it's important to zap the whole range mappable by the
EPT. With TDX it is fuzzy, because the direct root range without the shared
bit, or beyond it, shouldn't be accessible via that root, so it is not really
mapped. We would still be zapping the whole accessible paging structure, even
if not the whole range documented for normal EPT.
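If full-structure zapping is the goal, the per-root range helpers from the quoted suggestion could be sketched roughly like this. The `is_mirror` flag, bit position, and max value here are assumptions made up for the illustration, not the series' actual definitions:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Illustrative constants, not the real values. */
#define TDX_SHARED_BIT	((gfn_t)1 << 51)	/* shared-GPA bit, as a GFN bit */
#define MAX_GFN		((gfn_t)1 << 52)	/* EPT-mappable limit */

struct kvm_mmu_page {
	int is_mirror;	/* S-EPT (private) root vs. direct (shared) root */
};

/* A mirror (S-EPT) root only maps GFNs below the shared bit; a direct
 * (shared) root only maps GFNs with the shared bit set. */
static gfn_t tdp_mmu_min_gfn_inclusive(struct kvm_mmu_page *root)
{
	return root->is_mirror ? 0 : TDX_SHARED_BIT;
}

static gfn_t tdp_mmu_max_gfn_exclusive(struct kvm_mmu_page *root)
{
	return root->is_mirror ? TDX_SHARED_BIT : MAX_GFN;
}

/* Example roots, for illustration. */
static struct kvm_mmu_page mirror_root = { .is_mirror = 1 };
static struct kvm_mmu_page direct_root = { .is_mirror = 0 };
```

That would give each root a precise [start, end) range without a per-VM max.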
But if we want to zap the whole structure I see the positives. I'd still rather
not go back to a kvm_gfn_for_root()-like solution for the TDP MMU support if
possible. Is that alright with you? What about something like this in
conjunction with your earlier diff?
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index 8bc19e65371c..bf19d5a45f87 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -129,6 +129,11 @@ struct tdp_iter {
 	     iter.valid && iter.gfn < end;				\
 	     tdp_iter_next(&iter))
 
+#define for_each_tdp_pte_min_level_all(iter, root, min_level)		\
+	for (tdp_iter_start(&iter, root, min_level, 0, 0);		\
+	     iter.valid && iter.gfn < tdp_mmu_max_gfn_exclusive();	\
+	     tdp_iter_next(&iter))
+
 #define for_each_tdp_pte(iter, kvm, root, start, end)			\
 	for_each_tdp_pte_min_level(iter, kvm, root, PG_LEVEL_4K, start, end)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index c3ce43ce7b3f..6f425652e396 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -889,10 +889,7 @@ static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
 {
 	struct tdp_iter iter;
-	gfn_t end = tdp_mmu_max_gfn_exclusive();
-	gfn_t start = 0;
-
-	for_each_tdp_pte_min_level(iter, kvm, root, zap_level, start, end) {
+	for_each_tdp_pte_min_level_all(iter, root, zap_level) {
 retry:
 		if (tdp_mmu_iter_cond_resched(kvm, &iter, false, shared))
 			continue;