Message-ID: <20240321223928.GT1994522@ls.amr.corp.intel.com>
Date: Thu, 21 Mar 2024 15:39:28 -0700
From: Isaku Yamahata <isaku.yamahata@...el.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "Yamahata, Isaku" <isaku.yamahata@...el.com>,
"Zhang, Tina" <tina.zhang@...el.com>,
"isaku.yamahata@...ux.intel.com" <isaku.yamahata@...ux.intel.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"Huang, Kai" <kai.huang@...el.com>,
"sean.j.christopherson@...el.com" <sean.j.christopherson@...el.com>,
"Chen, Bo2" <chen.bo@...el.com>,
"sagis@...gle.com" <sagis@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Yuan, Hang" <hang.yuan@...el.com>,
"Aktas, Erdem" <erdemaktas@...gle.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>
Subject: Re: [PATCH v19 059/130] KVM: x86/tdp_mmu: Don't zap private pages
for unsupported cases
On Wed, Mar 20, 2024 at 12:56:38AM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com> wrote:
> On Tue, 2024-03-19 at 16:56 -0700, Isaku Yamahata wrote:
> > When we zap a page from the guest, and add it again on TDX even with
> > the same
> > GPA, the page is zeroed. We'd like to keep memory contents for those
> > cases.
> >
> > Ok, let me add those whys and drop migration part. Here is the
> > updated one.
> >
> > TDX supports only write-back(WB) memory type for private memory
> > architecturally so that (virtualized) memory type change doesn't make
> > sense for private memory. When we remove the private page from the
> > guest
> > and re-add it with the same GPA, the page is zeroed.
> >
> > Regarding memory type change (mtrr virtualization and lapic page
> > mapping change), the current implementation zaps pages, and populate
> s^
> > the page with new memory type on the next KVM page fault.
> ^s
>
> > It doesn't work for TDX to have zeroed pages.
> What does this mean? Above you mention how all the pages are zeroed. Do
> you mean it doesn't work for TDX to zero a running guest's pages. Which
> would happen for the operations that would expect the pages could get
> faulted in again just fine.
The non-TDX part of KVM assumes that page contents are preserved across zapping
and re-population. This isn't true for TDX: the guest would suddenly see zeroed
pages instead of the old memory contents and would be upset.
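To make the difference concrete, here is a minimal user-space sketch of that
assumption. It is a toy model, not KVM code: struct toy_page and the toy_*
helpers are hypothetical names, and zeroing the page at zap time is a
simplification of "the page is zeroed when it is removed and re-added".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for one guest page and its mapping state. */
struct toy_page {
	unsigned char backing[TOY_PAGE_SIZE];	/* memory backing the page */
	bool mapped;				/* currently mapped into the guest? */
	bool is_private;			/* models a TDX private page */
};

/*
 * Zap: drop the mapping. In this model the contents of a private page do
 * not survive removal, while a conventional page keeps its backing data.
 */
static void toy_zap(struct toy_page *p)
{
	p->mapped = false;
	if (p->is_private)
		memset(p->backing, 0, sizeof(p->backing));
}

/* Re-populate the mapping, as the next page fault would. */
static void toy_fault_in(struct toy_page *p)
{
	p->mapped = true;
}

/* What the guest reads at @off; faults the page back in on demand. */
static unsigned char toy_guest_read(struct toy_page *p, size_t off)
{
	assert(off < TOY_PAGE_SIZE);
	if (!p->mapped)
		toy_fault_in(p);
	return p->backing[off];
}
```

With a conventional page, a write followed by toy_zap() and a read returns the
written byte. With is_private set, the same sequence returns zero, which is
what the guest would observe.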
> > Because TDX supports only WB, we
> > ignore the request for MTRR and lapic page change to not zap private
> > pages on unmapping for those two cases
>
> Hmm. I need to go back and look at this again. It's not clear from the
> description why it is safe for the host to not zap pages if requested
> to. I see why the guest wouldn't want them to be zapped.
KVM silently ignores the request to change memory types.
> > TDX Secure-EPT requires removing the guest pages first and leaf
> > Secure-EPT pages in order. It doesn't allow zap a Secure-EPT entry
> > that has child pages. It doesn't work with the current TDP MMU
> > zapping logic that zaps the root page table without touching child
> > pages. Instead, zap only leaf SPTEs for KVM mmu that has a shared
> > bit
> > mask.
>
> Could this be better as two patches that each address a separate thing?
> 1. Leaf only zapping
> 2. Don't zap for MTRR, etc.
Makes sense. Let's split it.
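For the leaf-only zapping part, the ordering constraint can be sketched in
user space as below. This is only an illustration of "children must go before
the parent": struct toy_pt and the toy_* functions are hypothetical and are
not the TDP MMU or TDX module interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_FANOUT 2

/* Hypothetical page-table node; leaf nodes have no children. */
struct toy_pt {
	struct toy_pt *child[TOY_FANOUT];
	bool present;
};

/*
 * Models the Secure-EPT restriction: removing an entry that still has
 * present children is refused.
 */
static bool toy_sept_remove(struct toy_pt *n)
{
	assert(n != NULL);
	for (size_t i = 0; i < TOY_FANOUT; i++)
		if (n->child[i] && n->child[i]->present)
			return false;
	n->present = false;
	return true;
}

/*
 * Post-order walk: remove the children first, then the node itself.
 * Returns true once the whole subtree is gone.
 */
static bool toy_zap_leaf_first(struct toy_pt *n)
{
	if (!n || !n->present)
		return true;
	for (size_t i = 0; i < TOY_FANOUT; i++)
		if (!toy_zap_leaf_first(n->child[i]))
			return false;
	return toy_sept_remove(n);
}
```

Calling toy_sept_remove() directly on a root with present children fails,
which corresponds to the root-first zapping that VMX relies on as an
optimization. The post-order walk succeeds because every node is a leaf by the
time it is removed.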
> > > There seems to be an attempt to abstract away the existence of
> > > Secure-
> > > EPT in mmu.c, that is not fully successful. In this case the code
> > > checks kvm_gfn_shared_mask() to see if it needs to handle the
> > > zapping
> > > in a way specific needed by S-EPT. It ends up being a little
> > > confusing
> > > because the actual check is about whether there is a shared bit. It
> > > only works because only S-EPT is the only thing that has a
> > > kvm_gfn_shared_mask().
> > >
> > > Doing something like (kvm->arch.vm_type == KVM_X86_TDX_VM) looks
> > > wrong,
> > > but is more honest about what we are getting up to here. I'm not
> > > sure
> > > though, what do you think?
> >
> > Right, I attempted and failed in zapping case. This is due to the
> > restriction
> > that the Secure-EPT pages must be removed from the leaves. the VMX
> > case (also
> > NPT, even SNP) heavily depends on zapping root entry as optimization.
> >
> > I can think of
> > - add TDX check. Looks wrong
> > - Use kvm_gfn_shared_mask(kvm). confusing
> > - Give other name for this check like zap_from_leafs (or better
> > name?)
> > The implementation is same to kvm_gfn_shared_mask() with comment.
> > - Or we can add a boolean variable to struct kvm
>
> Hmm, maybe wrap it in a function like:
> static inline bool kvm_can_only_zap_leafs(const struct kvm *kvm)
> {
> /* A comment explaining what is going on */
> return kvm->arch.vm_type == KVM_X86_TDX_VM;
> }
>
> But KVM seems to be a bit more on the open coded side when it comes to
> things like this, so not sure what maintainers would prefer. My opinion
> is the kvm_gfn_shared_mask() check is too strange and it's worth a new
> helper. If that is bad, then just open coded kvm->arch.vm_type ==
> KVM_X86_TDX_VM is the second best I think.
>
> I feel both strongly that it should be changed, and unsure what
> maintainers would prefer. Hopefully one will chime in.
Now that the compile-time config is dropped, open coding is an option.
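As a rough sketch of what the open-coded variant could look like at a call
site, assuming mock definitions: the structs below are user-space stand-ins
for struct kvm, the KVM_X86_TDX_VM value is a placeholder, and
toy_pick_zap_mode() and enum toy_zap_mode are hypothetical names that are not
in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value for this sketch only; the real one comes from the uapi header. */
#define KVM_X86_TDX_VM 2

/* User-space mocks of the fields the check touches. */
struct kvm_arch {
	unsigned long vm_type;
};

struct kvm {
	struct kvm_arch arch;
};

/* Hypothetical: how the caller would zap. */
enum toy_zap_mode {
	TOY_ZAP_ROOT,		/* zap from the root, leaving children untouched */
	TOY_ZAP_LEAFS_ONLY,	/* zap only leaf SPTEs */
};

static enum toy_zap_mode toy_pick_zap_mode(const struct kvm *kvm)
{
	assert(kvm != NULL);

	/*
	 * Open-coded check: Secure-EPT entries with child pages cannot be
	 * zapped, so a TDX VM is restricted to leaf-only zapping.
	 */
	if (kvm->arch.vm_type == KVM_X86_TDX_VM)
		return TOY_ZAP_LEAFS_ONLY;

	return TOY_ZAP_ROOT;
}
```

The comment at the check carries the explanation that a helper name such as
kvm_can_only_zap_leafs() would otherwise provide.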
--
Isaku Yamahata <isaku.yamahata@...el.com>