Date:   Tue, 15 Dec 2020 07:59:27 -0300
From:   Marcelo Tosatti <>
To:     Paolo Bonzini <>
Cc:     Thomas Gleixner <>,
        Maxim Levitsky <>,
        "H. Peter Anvin" <>, Jonathan Corbet <>,
        Jim Mattson <>,
        Wanpeng Li <>,
        "open list:KERNEL SELFTEST FRAMEWORK" <>,
        Vitaly Kuznetsov <>,
        Sean Christopherson <>,
        open list <>,
        Ingo Molnar <>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <>,
        Joerg Roedel <>, Borislav Petkov <>,
        Shuah Khan <>,
        Andrew Jones <>,
        Oliver Upton <>,
        "open list:DOCUMENTATION" <>
Subject: Re: [PATCH v2 1/3] KVM: x86: implement KVM_{GET|SET}_TSC_STATE

On Fri, Dec 11, 2020 at 10:59:59PM +0100, Paolo Bonzini wrote:
> On 11/12/20 22:04, Thomas Gleixner wrote:
> > > It's 100ms off with migration, and can be reduced further (customers
> > > complained about 5 seconds but seem happy with 0.1ms).
> > What is 100ms? Guaranteed maximum migration time?
> I suppose it's the interval between KVM_GET_CLOCK on the source and the
> restart on the destination.  While the VM is paused for much longer, the
> sequence for the non-live part of the migration (aka brownout) is as
> follows:
>     source                        destination
>     ------                        -----------
>     pause
>     finish sending RAM            receive RAM               ~1 sec
>     send paused-VM state          finish receiving RAM     \
>                                   receive paused-VM state   ) 0.1 sec
>                                   restart                  /
> The nanosecond and TSC times are sent as part of the paused-VM state at the
> very end of the live migration process.
> So it's still true that the time advances during live migration brownout;
> 0.1 seconds is just the final part of the live migration process.  But for
> _live_ migration there is no need to design things according to "people are
> happy if their clock is off by 0.1 seconds only".  

Agree. What would be a good way to fix this? 

It seems to me that using CLOCK_REALTIME, as the interface Maxim is
proposing does, is prone to differences in CLOCK_REALTIME between the
source and destination hosts.

Perhaps there is another way to measure that 0.1 sec which is
independent of the clock values of the source and destination hosts
(say by sending a packet from the source once the clock stops counting).

Then, on the destination, measure

	delta = clock_restart_time - packet_arrival_time

and advance the guest clock by that amount.
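That idea could be sketched as follows (a rough illustration with
hypothetical names, not KVM code; it ignores the packet's network
latency, and it uses the destination's monotonic clock so that the
measurement does not depend on either host's CLOCK_REALTIME):

```python
import time

class PauseTimer:
    """Estimates how long the guest clock was stopped, using only the
    destination host's monotonic clock (hypothetical sketch)."""

    def on_packet_arrival(self):
        # The source sends this packet the moment it stops the guest clock.
        self.arrival = time.monotonic()

    def on_guest_restart(self):
        # Called just before the destination restarts the guest clock.
        # The difference approximates the brownout: the time between the
        # source pausing the clock and the destination restarting it.
        return time.monotonic() - self.arrival

timer = PauseTimer()
timer.on_packet_arrival()
time.sleep(0.1)                  # stand-in for the ~0.1 sec brownout
delta = timer.on_guest_restart()
# The guest clock would then be advanced by `delta`.
```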

> Again, save-to-disk,
> reverse debugging and the like are a different story, which is why KVM
> should delegate policy to userspace (while documenting how to do it right).
> Paolo
> > CLOCK_REALTIME and CLOCK_TAI are off by the time the VM is paused and
> > this state persists up to the point where NTP corrects it with a time
> > jump.
> > 
> > So if migration takes 5 seconds then CLOCK_REALTIME is not off by 100ms
> > it's off by 5 seconds.
> > 
> > CLOCK_MONOTONIC/BOOTTIME might be off by 100ms between pause and resume.
> > 
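To make the distinction above concrete: the difference CLOCK_REALTIME
minus CLOCK_MONOTONIC is normally a stable offset, and it shifts only
when CLOCK_REALTIME is stepped (as NTP eventually does after a migration
pause), so a guest could watch that offset to see how far its wall clock
was corrected. A minimal sketch (hypothetical, not from the thread; on
an undisturbed host the observed jump is ~0):

```python
import time

def rt_minus_mono():
    """Offset between CLOCK_REALTIME and CLOCK_MONOTONIC.  Stable on an
    undisturbed host; it shifts when CLOCK_REALTIME is stepped, e.g. by
    NTP after a migration pause."""
    return (time.clock_gettime(time.CLOCK_REALTIME)
            - time.clock_gettime(time.CLOCK_MONOTONIC))

before = rt_minus_mono()
time.sleep(0.05)       # nothing steps the clock on this undisturbed host
after = rt_minus_mono()
jump = after - before  # ~0 here; roughly the pause length after NTP steps
```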
