linux-kernel - Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Zee2ogAOl8cR4vNZ@google.com>
Date: Tue, 5 Mar 2024 16:19:46 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Tom Lendacky <thomas.lendacky@....com>
Cc: Zheyun Shen <szy0127@...u.edu.cn>, pbonzini@...hat.com, tglx@...utronix.de, 
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest

On Tue, Mar 05, 2024, Tom Lendacky wrote:
> On 3/4/24 11:55, Sean Christopherson wrote:
> > +Tom
> > 
> > "KVM: SVM:" for the shortlog scope.
> > 
> > On Fri, Mar 01, 2024, Zheyun Shen wrote:
> > > On AMD CPUs without ensuring cache consistency, each memory page reclamation in
> > > an SEV guest triggers a call to wbinvd_on_all_cpus, thereby affecting the
> > > performance of other programs on the host.
> > > 
> > > Typically, an AMD server may have 128 cores or more, while the SEV guest might only
> > > utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs
> > > to specific physical CPUs.
> > > 
> > > Therefore, keeping a record of the physical core numbers each time a vCPU runs
> > > can help avoid flushing the cache for all CPUs every time.
> > 
> > This needs an unequivocal statement from AMD that flushing caches only on CPUs
> > that do VMRUN is sufficient.  That sounds like it should be obviously correct,
> > as I don't see how else a cache line can be dirtied for the encrypted PA, but
> > this entire non-coherent caches mess makes me more than a bit paranoid.
> 
> As long as the wbinvd_on_all_cpus() related to the ASID flushing isn't
> changed, this should be ok. And the code currently flushes the source pages
> when doing LAUNCH_UPDATE commands and adding encrypted regions, so should be
> good there.

Nice, thanks!

> Would it make sense to make this configurable, with the current behavior the
> default, until testing looks good for a while?

I don't hate the idea, but I'm inclined to hit the "I'm feeling lucky" button.
I would rather we put in effort to all but guarantee we can do a clean revert in
the future, at which point a kill switch doesn't add all that much value.  E.g.
it would allow for a non-disruptive fix, and maybe a slightly faster confirmation
of a bug, but that's about it.

And since the fallout from this would be host data corruption, _not_ rebooting
hosts that may have been corrupted is probably a bad idea, i.e. the whole
non-disruptive fix benefit is quite dubious.

The other issue is that it'd be extremely difficult to know when we could/should
remove the kill switch.  It might be months or even years before anyone starts
running high volume of SEV/SEV-ES VMs with this optimization.