Date:	Tue, 17 Apr 2012 09:19:28 +0100
From:	Tim Deegan <tim@....org>
To:	Sheng Yang <sheng@...ker.org>
Cc:	Dan Magenheimer <dan.magenheimer@...cle.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Jan Beulich <JBeulich@...e.com>,
	Konrad Wilk <konrad.wilk@...cle.com>,
	linux-kernel@...r.kernel.org, xen-devel <xen-devel@...ts.xen.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Xen-devel] [PATCH] xen: always set the sched clock as unstable

At 16:01 -0700 on 16 Apr (1334592096), Sheng Yang wrote:
> So I think there may be *two* bugs in this issue: one causes the time
> jump (details below), and the other, in the kernel, is sometimes
> triggered by the first and results in the migration failure.
> 
> I've spent some time identifying the timestamp jump issue, and finally
> found it's due to Invariant TSC (CPUID leaf 0x80000007, EDX bit 8, also
> called non-stop TSC). The presence of this feature enables a kernel
> variable named sched_clock_stable, which does not seem to work with
> Xen's pvclock. If sched_clock_stable is set, the value returned by
> xen_clocksource_read() is returned by sched_clock_cpu() directly (rather
> than being calculated through sched_clock_local()), but CMIIW the value
> returned by xen_clocksource_read() is based on host (vcpu) uptime rather
> than this VM's uptime, which results in the timestamp jump.

OK - that seems like a kernel bug.  Linux should not be modifying how it
treats the PV clocksource based on the 'Invariant TSC' bit.
(Conversely, the patch to pretend the TSC is not invariant just because
the PV clocksource is present also seems wrong, and the earlier patch
that just enforces sched_clock_stable=0 would be better.)
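
For anyone following along, the behaviour Sheng describes can be modelled
roughly as below. This is only a sketch under stated assumptions, not the
kernel's code: model_sched_clock_cpu(), struct clock_state and
sched_clock_stable_flag are made-up names, and the clamping on the
"unstable" path is a simplification of what sched_clock_local() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sched_clock_stable setting. */
static int sched_clock_stable_flag;

/* Hypothetical per-CPU state for the filtered ("unstable") path. */
struct clock_state {
    uint64_t last_raw;   /* last raw clocksource reading seen */
    uint64_t clock;      /* monotonic value handed to the scheduler */
    uint64_t max_delta;  /* largest forward step accepted per update */
};

/*
 * Simplified model: with the "stable" flag set, the raw reading is
 * returned untouched, so a raw clock based on host uptime shows up as a
 * timestamp jump. Otherwise the step is clamped before being applied.
 */
static uint64_t model_sched_clock_cpu(struct clock_state *s, uint64_t raw)
{
    uint64_t delta;

    assert(s != NULL);

    if (sched_clock_stable_flag)
        return raw;                     /* stable path: no filtering */

    /* Unstable path: ignore backward steps, clamp forward ones. */
    delta = raw > s->last_raw ? raw - s->last_raw : 0;
    if (delta > s->max_delta)
        delta = s->max_delta;

    s->last_raw = raw;
    s->clock += delta;
    return s->clock;
}
```

With the flag set, a raw reading that suddenly reflects a different
host's uptime is passed straight through; with the flag clear, the same
reading only advances the clock by at most max_delta.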

> I've compiled a kernel that forces sched_clock_stable=0, and it solved
> the timestamp jump issue as expected. Luckily, it seems to have solved
> the call trace and guest hang issue as well.
> 
> I've posted a patch to mask the CPUID leaf 0x80000007 in Xen.

Well, as Dan says, if Xen is emulating RDTSC to provide a 'stable' TSC, 
we shouldn't _also_ tell the guest that it's not stable. :)  

OTOH, grepping for CONSTANT_TSC, NONSTOP_TSC, and TSC_RELIABLE, I don't
see anywhere even in xen-unstable where these bits are ever hidden from
the guest.  I think it would be reasonable to mask this from PV guests
at least for tsc_mode == 2, and on older Xens.
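
A minimal sketch of that masking, assuming a hypothetical helper that
filters the EDX value a guest sees. Nothing here is taken from the Xen
tree: pv_guest_cpuid_edx() and the macro names are illustrative only, and
reading tsc_mode == 2 as "never emulate" is my understanding rather than
something verified against the source.

```c
#include <stdint.h>

/* Leaf and bit named earlier in the thread: 0x80000007, EDX bit 8. */
#define CPUID_LEAF_APM           0x80000007u
#define CPUID_EDX_INVARIANT_TSC  (1u << 8)

/* tsc_mode value named in this mail; assumed to mean "never emulate". */
#define TSC_MODE_NEVER_EMULATE   2

/*
 * Hypothetical filter: return the EDX value to show the guest for the
 * given leaf. The Invariant TSC bit is hidden only from PV guests
 * running with tsc_mode == 2; everything else passes through unchanged.
 */
static uint32_t pv_guest_cpuid_edx(uint32_t leaf, uint32_t host_edx,
                                   int is_pv, int tsc_mode)
{
    if (leaf == CPUID_LEAF_APM && is_pv &&
        tsc_mode == TSC_MODE_NEVER_EMULATE)
        return host_edx & ~CPUID_EDX_INVARIANT_TSC;

    return host_edx;
}
```

Handling "older Xens" would need a separate condition that this sketch
does not attempt to model.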

Tim.