Date:   Mon, 27 Feb 2017 16:59:54 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Wanpeng Li <kernellwp@...il.com>, Mike Galbraith <efault@....de>,
        LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: tip.today - scheduler bam boom crash (cpu hotplug)

On Mon, Feb 27, 2017 at 04:27:32PM +0100, Paolo Bonzini wrote:
> 
> 
> On 27/02/2017 14:04, Peter Zijlstra wrote:
> >>>> This results in sched clock always unstable for kvm guest since there
> >>>> is no invariant tsc cpuid bit exposed for kvm guest currently. 
> >>> What the heck is KVM_FEATURE_CLOCKSOURCE_STABLE_BIT /
> >>> PVCLOCK_TSC_STABLE_BIT about then?
> >> It checks that all the bugs in the host have been ironed out, and that
> >> the host itself supports invtsc.
> > But what does it mean if that is not so? That is, will kvm_clock_read()
> > still be stable even if !stable?
> 
> If kvmclock is !stable, nobody should have set the sched clock to
> stable, to begin with.

OK, so if !KVM_FEATURE_CLOCKSOURCE_STABLE_BIT nothing is stable, but if
it is set, TSC might still not be stable, but kvm_clock_read() is.

> However, if kvmclock is stable, we know that the sched clock is stable.

Right, so the problem is that we only ever want to allow marking
unstable -- once it's found unstable, for whatever reason, we should
never allow going stable. The corollary of this proposition is that we
must start out assuming it will become stable. And to avoid actually
using unstable TSC we do a 3-state bringup:

 1) sched_clock_running = 0, __stable_early = 1, __stable = 0
 2) sched_clock_running = 1 (__stable is effective, iow, we run unstable)
 3) sched_clock_running = 2 (__stable <- __stable_early)

2) happens 'early' but is 'safe'.
3) happens 'late', after we've brought up SMP and probed TSC.

Between there, we should have detected the most common TSC wreckage and
made sure to not then switch to 'stable' at 3.
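
As a rough illustration of the 3-state bringup described above, here is a
minimal user-space model in C. It is a sketch only, not the kernel code: the
variable names follow the ones in this mail, while the model_* helpers are
hypothetical and exist purely for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the bringup; not kernel code. */
static int  sched_clock_running = 0;  /* state 1: clock not running yet   */
static bool stable_early = true;      /* __stable_early: assume the best  */
static bool stable = false;           /* __stable: what readers act on    */

/* Put the model back into state 1. */
static void model_reset(void)
{
	sched_clock_running = 0;
	stable_early = true;
	stable = false;
}

/* Readers only ever trust __stable, and only once the clock runs. */
static bool model_clock_is_stable(void)
{
	return sched_clock_running != 0 && stable;
}

/* One-way transition: once unstable, never stable again. */
static void model_clear_stable(void)
{
	stable_early = false;
	stable = false;
}

/* State 2: 'early' but 'safe', since __stable is still 0. */
static void model_init_early(void)
{
	assert(sched_clock_running == 0);
	sched_clock_running = 1;
}

/* State 3: 'late', after SMP bringup and TSC probing. */
static void model_init_late(void)
{
	assert(sched_clock_running == 1);
	sched_clock_running = 2;
	stable = stable_early;  /* __stable <- __stable_early */
}
```

In this model, any call to model_clear_stable() between the early and late
steps makes state 3 come up unstable, which is the point of deferring the
copy from __stable_early.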

Now the problem appears to be that we assume sched_clock will use RDTSC
(native_sched_clock) while sched_clock is a paravirt op.

Now, I've not yet figured out the ordering between when we set
pv_time_ops.sched_clock and when we do the 'normal' TSC init stuff.

But it appears to me, we should not be calling
clear_sched_clock_stable() on TSC bits when we don't end up using
native_sched_clock().
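
To make that last suggestion concrete, a hypothetical sketch in C follows. It
only models the idea; every model_* name is invented for this example, and
the real ordering of pv_time_ops.sched_clock setup versus TSC init is, as
said above, not yet worked out.

```c
#include <stdbool.h>

/* Hypothetical model: sched_clock as a replaceable (paravirt-style) op. */
typedef unsigned long long (*model_sched_clock_fn)(void);

/* Stand-ins for the RDTSC-based clock and a paravirt clock. */
static unsigned long long model_native_sched_clock(void) { return 0; }
static unsigned long long model_pv_sched_clock(void)     { return 0; }

/* Which implementation is installed; native by default. */
static model_sched_clock_fn model_sched_clock_op = model_native_sched_clock;

/* Stability flag of the model's sched clock. */
static bool model_stable = true;

/* True only when sched_clock really reads the TSC. */
static bool model_using_native_sched_clock(void)
{
	return model_sched_clock_op == model_native_sched_clock;
}

/* TSC wreckage only matters to the sched clock if it is TSC-backed. */
static void model_mark_tsc_unstable(void)
{
	if (model_using_native_sched_clock())
		model_stable = false;
}
```

With a paravirt clock installed, model_mark_tsc_unstable() leaves the flag
alone; with the native clock it clears it.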

