Message-ID: <510e0391-410b-40be-b556-f91554b8d3a1@redhat.com>
Date: Wed, 12 Mar 2025 13:24:32 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Isaku Yamahata <isaku.yamahata@...el.com>, kvm@...r.kernel.org
Cc: Sean Christopherson <seanjc@...gle.com>, chao.gao@...el.com,
rick.p.edgecombe@...el.com, yan.y.zhao@...el.com,
linux-kernel@...r.kernel.org, isaku.yamahata@...il.com,
Nikunj A Dadhania <nikunj@....com>, Marcelo Tosatti <mtosatti@...hat.com>
Subject: Re: [PATCH 2/2] KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio
to change
On 10/12/24 09:55, Isaku Yamahata wrote:
> Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
> changing TSC offset/multiplier when guest_tsc_protected is true.
Thanks Isaku! To match the behavior of the SEV-SNP patches, this is
also needed, which I have added to kvm-coco-queue:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dc2f14a6d8a1..ccde7c2b2248 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3919,7 +3925,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_TSC:
if (msr_info->host_initiated) {
kvm_synchronize_tsc(vcpu, &data);
- } else {
+ } else if (!vcpu->arch.guest_tsc_protected) {
u64 adj = kvm_compute_l1_tsc_offset(vcpu, data) - vcpu->arch.l1_tsc_offset;
adjust_tsc_offset_guest(vcpu, adj);
vcpu->arch.ia32_tsc_adjust_msr += adj;
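For reference, the vendor side is then expected to opt in by setting the new
flag once the TSC offset/multiplier is fixed. A minimal sketch, assuming a
hypothetical vendor init hook (the actual TDX/SEV-SNP hook names in
kvm-coco-queue may differ):

/*
 * Sketch only: the function name is made up.  The point is that the
 * vendor code sets guest_tsc_protected once the TSC offset/multiplier
 * is owned by the TDX module / SNP firmware, after which the common
 * code above refuses to change it.
 */
static void vendor_vcpu_init_protected_tsc(struct kvm_vcpu *vcpu)
{
	vcpu->arch.guest_tsc_protected = true;
}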
Also, I rewrote the commit message as follows:
Add guest_tsc_protected member to struct kvm_arch_vcpu and prohibit
changing TSC offset/multiplier when guest_tsc_protected is true.
X86 confidential computing technologies define a protected guest TSC so that
the VMM can't change the TSC offset/multiplier once the vCPU is initialized.
SEV-SNP defines Secure TSC as optional, whereas TDX mandates it.
KVM's common x86 code tries to guess or adjust the TSC offset/multiplier for
a more accurate guest TSC and lower TSC interrupt latency in several places:
at vCPU creation (kvm_arch_vcpu_postcreate()), on vCPU migration across
pCPUs (kvm_arch_vcpu_load()), via the vCPU TSC device attributes
(kvm_arch_tsc_set_attr()), and on guest/host writes to the TSC or TSC adjust
MSR (kvm_set_msr_common()).
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier.
Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupt is injected into the guest at the
wrong time if the KVM TSC offset differs from what the TDX module
determined.
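(Roughly, KVM derives the guest-visible L1 TSC as in the sketch below,
modeled on kvm_read_l1_tsc()/__scale_tsc() in x86.c; an offset different
from the one the TDX module programmed therefore shifts every emulated TSC
deadline by that same delta. The helper name and parameters here are made
up for illustration.)

/*
 * Rough model of the guest L1 TSC computation; the real code lives in
 * kvm_read_l1_tsc()/__scale_tsc().  A wrong 'offset' skews every value
 * returned here, and with it every emulated TSC (deadline) timer.
 */
static inline u64 l1_tsc(u64 host_tsc, u64 ratio, unsigned int frac_bits,
			 u64 offset)
{
	/* In KVM proper, frac_bits is kvm_caps.tsc_scaling_ratio_frac_bits. */
	return mul_u64_u64_shr(host_tsc, ratio, frac_bits) + offset;
}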
Originally this issue was found by the cyclictest of rt-tests [1], as the
latency in the TDX case was worse than the VMX value plus the TDX SEAMCALL
overhead. It turned out that the KVM TSC offset was different from what the
TDX module had determined.
Disable or ignore the KVM logic that changes/adjusts the TSC
offset/multiplier, thus keeping the KVM TSC offset/multiplier the same as
the value determined by the TDX module. Writes to MSR_IA32_TSC are also
blocked as they amount to a change in the TSC offset.
[1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
Paolo
> Reported-by: Marcelo Tosatti <mtosatti@...hat.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@...el.com>
> ---
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/x86.c | 9 ++++++++-
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 61b7e9fe5e57..112b8a4f1860 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1036,6 +1036,7 @@ struct kvm_vcpu_arch {
>
> /* Protected Guests */
> bool guest_state_protected;
> + bool guest_tsc_protected;
>
> /*
> * Set when PDPTS were loaded directly by the userspace without
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 65d871bb5b35..a6cf4422df28 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2587,6 +2587,9 @@ EXPORT_SYMBOL_GPL(kvm_calc_nested_tsc_multiplier);
>
> static void kvm_vcpu_write_tsc_offset(struct kvm_vcpu *vcpu, u64 l1_offset)
> {
> + if (vcpu->arch.guest_tsc_protected)
> + return;
> +
> trace_kvm_write_tsc_offset(vcpu->vcpu_id,
> vcpu->arch.l1_tsc_offset,
> l1_offset);
> @@ -2650,6 +2653,9 @@ static void __kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 offset, u64 tsc,
>
> lockdep_assert_held(&kvm->arch.tsc_write_lock);
>
> + if (vcpu->arch.guest_tsc_protected)
> + return;
> +
> if (user_set_tsc)
> vcpu->kvm->arch.user_set_tsc = true;
>
> @@ -5028,7 +5034,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> u64 offset = kvm_compute_l1_tsc_offset(vcpu,
> vcpu->arch.last_guest_tsc);
> kvm_vcpu_write_tsc_offset(vcpu, offset);
> - vcpu->arch.tsc_catchup = 1;
> + if (!vcpu->arch.guest_tsc_protected)
> + vcpu->arch.tsc_catchup = 1;
> }
>
> if (kvm_lapic_hv_timer_in_use(vcpu))