Message-ID: <2a542f07-2158-16aa-e3cb-5431081ee1f6@gmail.com>
Date: Tue, 1 Aug 2023 10:26:22 +0800
From: Like Xu <like.xu.linux@...il.com>
To: Oliver Upton <oliver.upton@...ux.dev>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] KVM: x86/tsc: Don't sync user changes to TSC with
KVM-initiated change
On 1/8/2023 2:29 am, Oliver Upton wrote:
> On Mon, Jul 31, 2023 at 04:07:58PM +0800, Like Xu wrote:
>> From: Like Xu <likexu@...cent.com>
>>
>> Add kvm->arch.user_changed_tsc to avoid synchronizing user changes to
>> the TSC with the KVM-initiated change in kvm_arch_vcpu_postcreate() by
>> conditioning this mess on userspace having written the TSC at least
>> once already.
>>
>> Here lies UAPI baggage: user-initiated TSC write with a small delta
>> (1 second) of virtual cycle time against real time is interpreted as an
>> attempt to synchronize the CPU. In such a scenario, the vcpu's tsc_offset
>> is not configured as expected, resulting in significant guest service
>> response latency, which is observed in our production environment.
>
> The changelog reads really weird, because it is taken out of context
> when it isn't a comment over the affected code. Furthermore, 'our
> production environment' is a complete black box to the rest of the
> community, it would be helpful spelling out exactly what the use case
> is.
>
> Suggested changelog:
>
> KVM interprets writes to the TSC with values within 1 second of each
> other as an attempt to synchronize the TSC for all vCPUs in the VM,
> and uses a common offset for all vCPUs in a VM. For brevity's sake
> let's just ignore what happens on systems with an unstable TSC.
>
> While this may seem odd, it is imperative for VM save/restore, as VMMs
> such as QEMU have long resorted to saving the TSCs (by value) from all
> vCPUs in the VM at approximately the same time. Of course, it is
> impossible to synchronize all the vCPU ioctls to capture the exact
> instant in time, hence KVM fudges it a bit on the restore side.
>
> This has been useful for the 'typical' VM lifecycle, where in all
> likelihood the VM goes through save/restore a considerable amount of
> time after VM creation. Nonetheless, there are some use cases that
> need to restore a VM snapshot that was created very shortly after boot
> (<1 second). Unfortunately the TSC sync code makes no distinction
> between kernel and user-initiated writes, which leads to the target VM
> synchronizing on the TSC offset from creation instead of the
> user-intended value.
Great clarification. Thanks, we're on the same page.
>
> Avoid synchronizing user-initiated changes to the guest TSC with the
> KVM-initiated change in kvm_arch_vcpu_postcreate() by conditioning the
> logic on userspace having written the TSC at least once.
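To make the intended behavior concrete, here is a minimal userspace sketch of
the decision described above. It is a simplified model, not the kernel code:
struct vm_tsc_state, tsc_write_syncs() and the from_user parameter are
hypothetical names for illustration, and only user_changed_tsc comes from the
patch. It assumes a stable TSC and treats any KVM-initiated write as a sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VM state for illustration; not the kernel's layout. */
struct vm_tsc_state {
	uint64_t last_tsc_write;   /* guest TSC value of the previous write */
	uint64_t last_tsc_nsec;    /* host time of the previous write, in ns */
	uint32_t tsc_khz;          /* guest TSC frequency, in kHz */
	bool user_changed_tsc;     /* userspace has written the TSC before */
};

/*
 * Return true if this write would be treated as a synchronization attempt.
 * KVM-initiated writes (from_user == false) always sync in this model.
 * A user write syncs only if userspace has written the TSC before AND the
 * value lies within one second of the expected guest TSC.
 */
static bool tsc_write_syncs(struct vm_tsc_state *s, uint64_t data,
			    uint64_t now_nsec, bool from_user)
{
	assert(s->tsc_khz > 0);

	/* ns -> cycles at tsc_khz, split to avoid 64-bit overflow. */
	uint64_t elapsed_ns = now_nsec - s->last_tsc_nsec;
	uint64_t elapsed_cycles =
		(elapsed_ns / 1000000) * s->tsc_khz +
		(elapsed_ns % 1000000) * s->tsc_khz / 1000000;

	uint64_t expected = s->last_tsc_write + elapsed_cycles;
	uint64_t one_sec = (uint64_t)s->tsc_khz * 1000;  /* cycles per second */

	bool in_window = data < expected + one_sec && data + one_sec > expected;
	bool sync = !from_user || (s->user_changed_tsc && in_window);

	/* Record this write as the reference point for the next one. */
	s->last_tsc_write = data;
	s->last_tsc_nsec = now_nsec;
	if (from_user)
		s->user_changed_tsc = true;

	return sync;
}
```

With a 2 GHz guest TSC, a restore 100 ms after vCPU creation falls inside the
1 second window, yet the first user write is no longer folded into the
creation-time offset. Later user writes from the other vCPUs still
synchronize with each other as before.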
>
> I'll also note that the whole value-based TSC sync scheme is in
> desperate need of testing.
>