Message-ID: <894362115.582988.1503435653874.JavaMail.zimbra@redhat.com>
Date:   Tue, 22 Aug 2017 17:00:53 -0400 (EDT)
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     John Stultz <john.stultz@...aro.org>
Cc:     Denis Plotnikov <dplotnikov@...tuozzo.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        kvm list <kvm@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        lkml <linux-kernel@...r.kernel.org>, x86@...nel.org,
        rkagan@...tuozzo.com, den@...tuozzo.com,
        Marcelo Tosatti <mtosatti@...hat.com>
Subject: Re: [PATCH v4 00/10] make L2's kvm-clock stable, get rid of
 pvclock_gtod_copy in KVM


> I still don't feel my questions have been well answered. It's really
> not clear to me why, in order to allow the level-2 guest to use a vdso,
> the answer is to export more data through the entire stack rather
> than to make the kvmclock usable from the vsyscall.

Thanks, this helps.

A stable kvmclock is already usable from the vsyscall.  It is however not
yet usable _in the hypervisor_ as a way to provide another stable kvmclock
to the nested guest; right now the only clocksource that a hypervisor can
use to provide a stable kvmclock is the TSC.

So, regarding the "why is it necessary" part.  Even on a modern host with
invariant TSC, kvmclock mediates between the TSC and the guest and provides,
for example, support for live migration, where the TSC frequency may be
different between source and destination.  If the L1 hypervisor could
use the TSC to provide a stable kvmclock, there would be no need for kvmclock
in the first place.  The paravirtualized clock may well disappear in a few
years since Skylake provides TSC scaling.  However, I'm not that optimistic
because people are complaining that I removed support for 2007 processors
and it seems that I'll have to put it back.  So, as more people use nested
virtualization (and we have nested virt migration in the works, too), nested
kvmclock becomes more important too.
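
To make the "mediates between the TSC and the guest" part concrete, here is
a simplified userspace sketch of how a guest turns a TSC reading into
nanoseconds.  It is modeled on the pvclock fields (tsc_timestamp,
system_time, tsc_to_system_mul, tsc_shift), but the struct and function
names below are illustrative assumptions, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative subset of the per-vCPU time info; not the kernel's struct. */
struct pv_time_sketch {
    uint64_t tsc_timestamp;      /* TSC value when system_time was sampled */
    uint64_t system_time;        /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 0.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* shift applied to the delta before scaling */
};

/* Scale a TSC delta to nanoseconds: (delta shifted by shift) * mul / 2^32. */
uint64_t pv_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    assert(shift > -64 && shift < 64);

    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Split the 64x32 multiply so no 128-bit type is needed. */
    uint64_t hi = (delta >> 32) * mul;
    uint64_t lo = ((delta & 0xffffffffu) * mul) >> 32;
    return hi + lo;
}

/* Guest-side read: a linear function of the TSC. */
uint64_t pv_read_ns(const struct pv_time_sketch *t, uint64_t tsc)
{
    return t->system_time +
           pv_scale_delta(tsc - t->tsc_timestamp,
                          t->tsc_to_system_mul, t->tsc_shift);
}
```

After a migration to a host with a different TSC frequency, the hypervisor
only has to publish a new multiplier, shift and base; the guest keeps using
the same read function.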

Regarding the "why is it best" part.  Right now, the hypervisor makes a
copy of the timekeeper information in order to prepare the stable kvmclock.
This code is very much tied to the TSC.  However, a snapshot of the timekeeper
information is almost entirely the same thing that ktime_get_snapshot returns,
so my suggestion to "untie" the hypervisor code from the TSC was to use
ktime_get_snapshot instead.  This way, the clocksource itself tells KVM
whether it can be the base for a vsyscall-happy kvmclock (which means, it
must be the TSC or a linear transformation of it).
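
As a rough sketch of the idea, and not the code from the series: every name
below (clocksource_sketch, is_tsc_linear, snapshot_sketch and the toy
clocksources) is hypothetical and only shows the shape of "the clocksource
tells KVM whether it qualifies".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clocksource with an optional "am I TSC-derived?" callback. */
struct clocksource_sketch {
    const char *name;
    uint64_t (*read)(void);
    bool (*is_tsc_linear)(void);   /* NULL means "no" */
};

/* Hypothetical snapshot: the counter value plus the clocksource's answer. */
struct snapshot_sketch {
    uint64_t cycles;
    bool usable_for_stable_kvmclock;
};

/* The consumer asks the clocksource instead of hard-coding "is it the TSC?". */
void snapshot_sketch_take(const struct clocksource_sketch *cs,
                          struct snapshot_sketch *snap)
{
    assert(cs != NULL && cs->read != NULL && snap != NULL);

    snap->cycles = cs->read();
    snap->usable_for_stable_kvmclock =
        cs->is_tsc_linear != NULL && cs->is_tsc_linear();
}

/* Two toy clocksources, for illustration only. */
uint64_t toy_read(void) { return 42; }
bool toy_yes(void) { return true; }

const struct clocksource_sketch toy_kvmclock = { "toy-kvmclock", toy_read, toy_yes };
const struct clocksource_sketch toy_hpet     = { "toy-hpet", toy_read, NULL };
```

With something of this shape, the hypervisor would only advertise a stable
kvmclock to its guest when the flag in the snapshot is set.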

While I am very happy with how the KVM code comes out, it may well not be
the best solution.  I definitely need help from the clocksource
maintainers here, not just approval!  It doesn't help that a lot of the
code surrounding ktime_get_snapshot is unused, so that may have sent me
off track.

In particular, the return value of the new callback can be defined as "is
it the TSC or a linear transformation of it".  But that's as good a definition
as "is it good for KVM" (i.e., not very good) without some documentation on
the meaning of "cycles" in the struct returned by ktime_get_snapshot. Once I
understand that, I hope I can provide a better explanation for the return
value of the callback.

Paolo

> So far for a problem statement, all I've got is:
> "However, when using nested virtualization you have
> 
>         L0: bare-metal hypervisor (uses TSC)
>         L1: nested hypervisor (uses kvmclock, can use vsyscall)
>         L2: nested guest
> 
> and L2 cannot use vsyscall because it is not using the TSC."
> 
> Which is a start but doesn't really make it clear why the proposed
> solution is best/necessary.
> 
> thanks
> -john
> 
