[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210331124130.337992-2-vkuznets@redhat.com>
Date: Wed, 31 Mar 2021 14:41:29 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: kvm@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>
Cc: Sean Christopherson <seanjc@...gle.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH v3 1/2] KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()
When guest time is reset with KVM_SET_CLOCK(0), it is possible for
'hv_clock->system_time' to become a small negative number. This happens
because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
kvm_guest_time_update() does (masterclock in use case):
hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;
And 'master_kernel_ns' represents the last time when masterclock
got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
problem, the difference is very small, e.g. I'm observing
hv_clock.system_time = -70 ns. The issue comes from the fact that
'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
compute_tsc_page_parameters() becomes a very big number.
Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
going negative.
Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
---
arch/x86/kvm/x86.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2bfd00da465f..2f54beed0105 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5728,6 +5728,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
#endif
case KVM_SET_CLOCK: {
+ struct kvm_arch *ka = &kvm->arch;
struct kvm_clock_data user_ns;
u64 now_ns;
@@ -5746,8 +5747,22 @@ long kvm_arch_vm_ioctl(struct file *filp,
* pvclock_update_vm_gtod_copy().
*/
kvm_gen_update_masterclock(kvm);
- now_ns = get_kvmclock_ns(kvm);
- kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+
+ /*
+ * This pairs with kvm_guest_time_update(): when masterclock is
+ * in use, we use master_kernel_ns + kvmclock_offset to set
+ * unsigned 'system_time' so if we use get_kvmclock_ns() (which
+ * is slightly ahead) here we risk going negative on unsigned
+ * 'system_time' when 'user_ns.clock' is very small.
+ */
+ spin_lock_irq(&ka->pvclock_gtod_sync_lock);
+ if (kvm->arch.use_master_clock)
+ now_ns = ka->master_kernel_ns;
+ else
+ now_ns = get_kvmclock_base_ns();
+ ka->kvmclock_offset = user_ns.clock - now_ns;
+ spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
+
kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
break;
}
--
2.30.2
Powered by blists - more mailing lists