linux-kernel - Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <76744e4b-361d-43ae-9a52-6a410ed57303@amd.com>
Date: Wed, 6 Mar 2024 10:26:42 -0600
From: Tom Lendacky <thomas.lendacky@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Zheyun Shen <szy0127@...u.edu.cn>, pbonzini@...hat.com,
 tglx@...utronix.de, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest

On 3/5/24 18:19, Sean Christopherson wrote:
> On Tue, Mar 05, 2024, Tom Lendacky wrote:
>> On 3/4/24 11:55, Sean Christopherson wrote:
>>> +Tom
>>>
>>> "KVM: SVM:" for the shortlog scope.
>>>
>>> On Fri, Mar 01, 2024, Zheyun Shen wrote:
>>>> On AMD CPUs without ensuring cache consistency, each memory page reclamation in
>>>> an SEV guest triggers a call to wbinvd_on_all_cpus, thereby affecting the
>>>> performance of other programs on the host.
>>>>
>>>> Typically, an AMD server may have 128 cores or more, while the SEV guest might only
>>>> utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs
>>>> to specific physical CPUs.
>>>>
>>>> Therefore, keeping a record of the physical core numbers each time a vCPU runs
>>>> can help avoid flushing the cache for all CPUs every time.
>>>
>>> This needs an unequivocal statement from AMD that flushing caches only on CPUs
>>> that do VMRUN is sufficient.  That sounds like it should be obviously correct,
>>> as I don't see how else a cache line can be dirtied for the encrypted PA, but
>>> this entire non-coherent caches mess makes me more than a bit paranoid.
>>
>> As long as the wbinvd_on_all_cpus() related to the ASID flushing isn't
>> changed, this should be ok. And the code currently flushes the source pages
>> when doing LAUNCH_UPDATE commands and adding encrypted regions, so should be
>> good there.
> 
> Nice, thanks!
> 
>> Would it make sense to make this configurable, with the current behavior the
>> default, until testing looks good for a while?
> 
> I don't hate the idea, but I'm inclined to hit the "I'm feeling lucky" button.
> I would rather we put in effort to all but guarantee we can do a clean revert in
> the future, at which point a kill switch doesn't add all that much value.  E.g.
> it would allow for a non-disruptive fix, and maybe a slightly faster confirmation
> of a bug, but that's about it.
> 
> And since the fallout from this would be host data corruption, _not_ rebooting
> hosts that may have been corrupted is probably a bad idea, i.e. the whole
> non-disruptive fix benefit is quite dubious.
> 
> The other issue is that it'd be extremely difficult to know when we could/should
> remove the kill switch.  It might be months or even years before anyone starts
> running high volume of SEV/SEV-ES VMs with this optimization.

I can run the next version of the patch through our CI and see if it 
uncovers anything. I just worry about corner cases... but then that's just me.

Thanks,
Tom