Date:   Tue, 22 Sep 2020 18:39:34 +0300
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        Jim Mattson <jmattson@...gle.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Joerg Roedel <joro@...tes.org>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2 1/1] KVM: x86: fix MSR_IA32_TSC read for nested
 migration

On Tue, 2020-09-22 at 17:50 +0300, Maxim Levitsky wrote:
> On Tue, 2020-09-22 at 14:50 +0200, Paolo Bonzini wrote:
> > On 21/09/20 18:23, Sean Christopherson wrote:
> > > Avoid "should" in code comments and describe what the code is doing, not what
> > > it should be doing.  The only exception for this is when the code has a known
> > > flaw/gap, e.g. "KVM should do X, but because of Y, KVM actually does Z".
> > > 
> > > > +		 * return it's real L1 value so that its restore will be correct.
> > > s/it's/its
> > > 
> > > Perhaps add "unconditionally" somewhere, since arch.tsc_offset can also contain
> > > the L1 value.  E.g. 
> > > 
> > > 		 * Unconditionally return L1's TSC offset on userspace reads
> > > 		 * so that userspace reads and writes always operate on L1's
> > > 		 * offset, e.g. to ensure deterministic behavior for migration.
> > > 		 */
> > > 
> > 
> > Technically the host need not restore MSR_IA32_TSC at all.  This follows
> > the idea of the discussion with Oliver Upton about transmitting the
> > state of the kvmclock heuristics to userspace, which include a (TSC,
> > CLOCK_MONOTONIC) pair to transmit the offset to the destination.  All
> > that needs to be an L1 value is then the TSC value in that pair.
> > 
> > I'm a bit torn over this patch.  On one hand it's an easy solution, on
> > the other hand it's... just wrong if KVM_GET_MSR is used for e.g.
> > debugging the guest.
> 
> Could you explain why though? After my patch, the KVM_GET_MSR will consistently
> read the L1 TSC, just like all other MSRs as I explained. I guess for debugging,
> this should work?
> 
> The fact that the TSC is read with the guest offset is a nice exception made for guests
> that insist on reading this MSR without interception instead of using rdtsc.
> 
> Best regards,
> 	Maxim Levitsky
> 
> > I'll talk to Maxim and see if he can work on the kvmclock migration stuff.

We talked about this on IRC, and I am now also convinced that we should implement
proper TSC migration instead, so I'll drop this patch and implement that.

For the last few weeks I have been digging through all the timing code, and I mostly
understand it, so it shouldn't take me long to implement.

There is hope that this will make nested migration fully stable, since it still sometimes
hangs even with this patch. On my AMD machine it takes about half a day of migration
cycles to reproduce, but on my Intel laptop I can hang the nested guest after 10-20 cycles,
even with this patch applied. The symptoms look very similar to the issue that this patch
tried to fix.
 
Maybe we should keep the *comment* I added to document this funny TSC read behavior.
When I implement the whole thing, I may add a comment-only version of this patch
for that.
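
For illustration only, the read behavior discussed in this thread can be modeled
roughly as below. This is a simplified sketch, not the KVM code: the struct, its
field names, and model_read_msr_tsc() are hypothetical stand-ins, and TSC scaling
is ignored. It assumes the vCPU tracks both L1's offset and the currently active
offset (which differs from L1's while L2 runs), and that userspace reads always
use L1's offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-vCPU TSC state. */
struct vcpu_tsc_model {
	uint64_t l1_tsc_offset;	/* offset applied on behalf of L1 */
	uint64_t tsc_offset;	/* offset currently active; differs from
				 * l1_tsc_offset while L2 is running */
};

/*
 * Model of an MSR_IA32_TSC read.  A userspace (host-initiated) read
 * unconditionally uses L1's offset, so reads and writes from userspace
 * operate on the same value regardless of whether L2 is running.  A
 * guest-initiated read uses the currently active offset.
 */
static uint64_t model_read_msr_tsc(const struct vcpu_tsc_model *v,
				   uint64_t host_tsc, bool host_initiated)
{
	uint64_t offset = host_initiated ? v->l1_tsc_offset : v->tsc_offset;

	/* Offsets are two's complement; unsigned wraparound is intended. */
	return host_tsc + offset;
}
```

With l1_tsc_offset = 100 and tsc_offset = 150 (L2 running), a userspace read at
host TSC 1000 yields 1100 in this model, while a guest read yields 1150.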

Best regards,
	Maxim Levitsky 

> > 
> > Paolo
> > 

