Message-Id: <D32EF18F-7C4D-466B-9058-1EBD4C378EFC@sjtu.edu.cn>
Date: Wed, 26 Feb 2025 11:26:28 +0800
From: Zheyun Shen <szy0127@...u.edu.cn>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Tom Lendacky <thomas.lendacky@....com>,
pbonzini@...hat.com,
tglx@...utronix.de,
Kevin Loughlin <kevinloughlin@...gle.com>,
mingo@...hat.com,
bp@...en8.de,
kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 3/3] KVM: SVM: Flush cache only on CPUs running SEV
guest
I'm sorry the formatting of my previous email was mangled by my email client; I'm resending the same content in this new email.
> Sean Christopherson <seanjc@...gle.com> wrote:
>
> On Tue, Jan 28, 2025, Zheyun Shen wrote:
>> On AMD CPUs that do not guarantee cache consistency, each memory page
>> reclamation in an SEV guest triggers a call to wbinvd_on_all_cpus(),
>> thereby affecting the performance of other programs on the host.
>>
>> Typically, an AMD server may have 128 cores or more, while the SEV guest
>> might only use 8 of them. Meanwhile, the host can use qemu-affinity
>> to bind these 8 vCPUs to specific physical CPUs.
>>
>> Therefore, keeping a record of the physical cores each vCPU has run on
>> can help avoid flushing the cache on all CPUs every time.
>>
>> Signed-off-by: Zheyun Shen <szy0127@...u.edu.cn>
>> ---
>> arch/x86/kvm/svm/sev.c | 30 +++++++++++++++++++++++++++---
>> arch/x86/kvm/svm/svm.c | 2 ++
>> arch/x86/kvm/svm/svm.h | 5 ++++-
>> 3 files changed, 33 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index 1ce67de9d..4b80ecbe7 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -252,6 +252,27 @@ static void sev_asid_free(struct kvm_sev_info *sev)
>> sev->misc_cg = NULL;
>> }
>>
>> +void sev_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>
> And now I'm very confused.
>
> v1 and v2 marked the CPU dirty in pre_sev_run(), which AFAICT is exactly when a
> CPU should be recorded as having dirtied memory. v3 fixed a bug with using
> get_cpu(), but otherwise was unchanged. Tom even gave a Tested-by for v3.
>
> Then v4 comes along, and without explanation, moved the code to vcpu_load().
>
I apologize for not including sufficient information in the changelog, which led to your confusion.
> Changed the time of recording the CPUs from pre_sev_run() to vcpu_load().
>
> Why? If there's a good reason, then that absolutely, positively belongs in the
> changelog and in the code as a comment. If there's no good reason, then...
>
The reason I moved the CPU recording from pre_sev_run() to vcpu_load() is that vcpu_load() is always on the call path of kvm_arch_vcpu_ioctl_run(), and whenever a vCPU migrates to another physical CPU the control flow passes through vcpu_load() again, so the recorded CPUs stay correct.

Recording in pre_sev_run(), on the other hand, would record the CPU number before every guest entry. Without a vCPU migration, only the first record is effective; the subsequent ones are redundant and just waste time, so each VM exit would take slightly longer (although the extra time may be very short).
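
To make that concrete, the recording side is roughly the following (a simplified sketch, not the exact diff; the have_run_cpus cpumask in kvm_sev_info is named here just for illustration):

void sev_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
	/* have_run_cpus: assumed cpumask field in kvm_sev_info, for illustration only. */
	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;

	/*
	 * vcpu_load() is reached on every vCPU migration, so marking the new
	 * physical CPU here keeps the mask up to date without doing any work
	 * on every VM entry in pre_sev_run().
	 */
	cpumask_set_cpu(cpu, &sev->have_run_cpus);
}
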
> Unless I hear otherwise, my plan is to move this back to pre_sev_run().
>
Another issue in v3 is that I incorrectly cleared the recorded mask after each cache flush. The mask bits should instead be cleared and updated when a vCPU migrates, not after a cache flush.
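
For reference, the flush side would then look roughly like this (again only a sketch with illustrative names; sev_flush_guest_cache() and sev_do_wbinvd() are hypothetical helpers in sev.c, and the point is that the mask is left intact here):

static void sev_do_wbinvd(void *unused)
{
	wbinvd();
}

static void sev_flush_guest_cache(struct kvm *kvm)
{
	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;

	/*
	 * Flush only the CPUs that have ever run a vCPU of this guest instead
	 * of calling wbinvd_on_all_cpus().  The mask is deliberately not
	 * cleared here; its bits are only changed when a vCPU migrates.
	 */
	on_each_cpu_mask(&sev->have_run_cpus, sev_do_wbinvd, NULL, true);
}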