Message-ID: <1257b7b43472fad6287b648ec96fc27a89766eb9.camel@intel.com>
Date: Wed, 15 May 2024 19:23:28 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai"
<kai.huang@...el.com>
CC: "isaku.yamahata@...il.com" <isaku.yamahata@...il.com>, "sagis@...gle.com"
<sagis@...gle.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "Zhao, Yan Y" <yan.y.zhao@...el.com>, "Aktas,
Erdem" <erdemaktas@...gle.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>, "dmatlack@...gle.com"
<dmatlack@...gle.com>
Subject: Re: [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only
slot leafs on slot deletion
On Wed, 2024-05-15 at 12:09 -0700, Sean Christopherson wrote:
> > It's weird that userspace needs to control how does KVM zap page table for
> > memslot delete/move.
>
> Yeah, this isn't quite what I had in mind. Granted, what I had in mind may
> not be much any better, but I definitely don't want to let userspace dictate
> exactly how KVM manages SPTEs.

To me it doesn't seem completely unprecedented at least. Linux has a ton of
madvise() flags and other knobs to control this kind of PTE management for
userspace memory.
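
To make the analogy concrete, here is a minimal sketch of one such knob,
madvise(MADV_DONTNEED), which lets userspace tell the kernel to drop the pages
(and PTEs) backing a private anonymous range. The helper name
drop_and_check() is made up for illustration, and this is only a userspace
demo, not anything from the patch:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one anonymous page, ask the kernel to drop it
 * with MADV_DONTNEED, then check that it reads back as zero.
 * Returns 0 if the page was dropped, 1 if the old data survived, -1 on error.
 */
static int drop_and_check(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int ret;

	if (page <= 0)
		return -1;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	p[0] = 0xaa;	/* fault the page in and dirty it */

	/* Userspace decides when the mapping for this range gets torn down. */
	if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	/* On Linux the next access to a private anonymous page is zero-filled. */
	ret = (p[0] == 0) ? 0 : 1;
	munmap(p, (size_t)page);
	return ret;
}
```

The point is only that the kernel already lets userspace steer when PTEs for
its own memory get torn down; whether that carries over to SPTEs is the open
question.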
>
> My thinking for a memslot flag was more of a "deleting this memslot doesn't
> have side effects", i.e. a way for userspace to give KVM the green light to
> deviate from KVM's historical behavior of rebuilding the entire page tables.
> Under the hood, KVM would be allowed to do whatever it wants, e.g. for the
> initial implementation, KVM would zap only leafs. But critically, KVM
> wouldn't be _required_ to zap only leafs.
>
> > So to me looks it's overkill to expose this "zap-leaf-only" to userspace.
> > We can just set this flag for a TDX guest when memslot is created in KVM.
>
> 100% agreed from a functionality perspective. My thoughts/concerns are more
> about KVM's ABI.
>
> Hmm, actually, we already have new uAPI/ABI in the form of VM types. What if
> we squeeze a documentation update into 6.10 (which adds the SEV VM flavors)
> to state that KVM's historical behavior of blasting all SPTEs is only
> _guaranteed_ for KVM_X86_DEFAULT_VM?
>
> Anyone know if QEMU deletes shared-only, i.e. non-guest_memfd, memslots
> during SEV-* boot? If so, and assuming any such memslots are smallish, we
> could even start enforcing the new ABI by doing a precise zap for small
> (arbitrary limit TBD) shared-only memslots for !KVM_X86_DEFAULT_VM VMs.

Again thinking of the userspace memory analogy... Aren't there some VMs where
the fast zap is faster? Like if you have a guest with a small memslot that gets
deleted all the time, you could want it to be zapped specifically. But for the
giant memslot next to it, you might want to do the fast zap-all thing.

So rather than try to optimize zapping more someday and hit similar issues, let
userspace decide how it wants it to be done. I'm not sure of the actual
performance tradeoffs here, to be clear.
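
Roughly, the choice being discussed could be sketched like this. Every name
here (zap_mode, slot_info, userspace_wants_precise, pick_zap_mode, the
small_limit parameter) is hypothetical and does not exist in KVM; it only
combines the "precise zap for small slots on non-default VM types" idea with a
per-slot userspace hint:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these exist in KVM. */
enum zap_mode {
	ZAP_ALL_FAST,	/* historical behavior: blast all SPTEs and rebuild */
	ZAP_SLOT_LEAFS,	/* zap only the leaf SPTEs covering the deleted slot */
};

struct slot_info {
	size_t npages;			/* size of the memslot being deleted */
	bool userspace_wants_precise;	/* assumed per-slot hint from userspace */
};

static enum zap_mode pick_zap_mode(const struct slot_info *slot,
				   bool default_vm, size_t small_limit)
{
	/* The default VM type keeps the historical "zap everything" guarantee. */
	if (default_vm)
		return ZAP_ALL_FAST;

	/* Userspace explicitly asked for a precise zap of this slot. */
	if (slot->userspace_wants_precise)
		return ZAP_SLOT_LEAFS;

	/* Otherwise fall back to a size heuristic (limit is arbitrary, TBD). */
	return slot->npages <= small_limit ? ZAP_SLOT_LEAFS : ZAP_ALL_FAST;
}
```

With only the size heuristic, the small slot gets the precise zap and the
giant one gets the fast zap all. The hint is what would let userspace override
that per slot.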

That said, a per-VM knob is easier for TDX purposes.