linux-kernel - Re: [RFC PATCH v3] Fix: clocksource watchdog marks TSC unstable on guest VM

Message-ID: <alpine.DEB.2.11.1509081627450.3854@nanos>
Date:	Tue, 8 Sep 2015 17:08:03 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
cc:	LKML <linux-kernel@...r.kernel.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	John Stultz <john.stultz@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, Gleb Natapov <gleb@...nel.org>,
	Paolo Bonzini <pbonzini@...hat.com>, Shaohua Li <shli@...com>
Subject: Re: [RFC PATCH v3] Fix: clocksource watchdog marks TSC unstable on
 guest VM

On Tue, 8 Sep 2015, Mathieu Desnoyers wrote:
> Introduce WATCHDOG_RETRY to bound the number of retries (in the
> unlikely event of a bogus clock source for wdnow). If the
> number of retries has been reached, disable the watchdog timer.

This does not make any sense at all. Why would the clocksource be
bogus? I'd rather say that the whole idea of trying to watchdog the TSC
in a VM is bogus.

There is no guarantee that the readout of the TSC and the watchdog is
not disturbed by VM scheduling. Aside from that, the HPET emulation goes
all the way back into qemu user land and the implementation itself
does not make me more confident. Be happy that we don't support 64-bit
HPET in the kernel as that emulation code is completely broken.
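
To make the scheduling concern concrete, here is a minimal userspace
sketch (not the kernel's code) of a watchdog-style skew check. It
models a vCPU that is preempted between reading the watchdog and
reading the TSC. All names and the threshold value are hypothetical
and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical threshold for this sketch: 62.5 ms in nanoseconds. */
#define SKETCH_THRESHOLD_NS 62500000LL

/* One pair of readouts, both already converted to nanoseconds. */
struct readout {
	int64_t wd_ns;	/* watchdog clocksource, e.g. an emulated HPET */
	int64_t cs_ns;	/* clocksource under test, i.e. the TSC */
};

/*
 * Simulate one readout pair. The vCPU is assumed to be descheduled for
 * stall_ns after the watchdog was read and before the TSC is read, so
 * the TSC value is taken that much later in real time.
 */
static struct readout read_pair(int64_t true_now_ns, int64_t stall_ns)
{
	struct readout r = {
		.wd_ns = true_now_ns,
		.cs_ns = true_now_ns + stall_ns,
	};

	assert(stall_ns >= 0);
	return r;
}

/*
 * Compare how far each clock advanced between two readout pairs and
 * report whether the difference exceeds the sketch threshold.
 */
static bool skew_exceeds(struct readout prev, struct readout now)
{
	int64_t wd_delta = now.wd_ns - prev.wd_ns;
	int64_t cs_delta = now.cs_ns - prev.cs_ns;

	return llabs(cs_delta - wd_delta) > SKETCH_THRESHOLD_NS;
}
```

In this model both clocks are perfectly accurate, yet a 100 ms stall
in the second readout pair is enough to make skew_exceeds() report a
problem. The check cannot distinguish a broken TSC from a preempted
vCPU.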

I really have to ask the question WHY we actually do this. There is
absolutely no point at all.

The TSC watchdog is there to catch a few issues with the TSC

   1) Frequency changing behind the kernel's back

   2) SMM driven power save state 'features' which cause the TSC to
      stop

   3) SMM fiddling with the TSC

   4) TSC drifting apart on multi-socket systems

#1    Is completely irrelevant for KVM as all machines which have
      hardware virtualization have a frequency-constant TSC

#2    Is irrelevant for KVM as well, because the machine does not go
      into deep idle states while the guest is running.

#3/#4 These are the only relevant issues, but there is absolutely no
      need to do this detection in the guest.

We already have a TSC sanity check on the host. So instead of adding
horrible hackery and magic detection, shutoff, retry mechanisms, we
can simply let the guest know that the TSC has been buggered.

On paravirt kernels we can do that today and AFAICT the
pvclock/kvmclock code has enough magic to deal with all the oddities
already.

For non-paravirt kernels which can read the TSC directly, we'd need a
way to transport that information. A simple mechanism would be to
query an emulated MSR from the watchdog which tells the guest the
state of affairs on the host side. That would be a sensible and
minimally invasive change on both host and guests.
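
A rough userspace sketch of that idea, under loud assumptions: the MSR
index, the status bit and every function name below are invented for
illustration and do not correspond to any existing hypervisor ABI. The
MSR read is passed in as a function pointer so the sketch can run
without a hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: placeholder MSR index, not an allocated register. */
#define SKETCH_MSR_HOST_TSC_STATUS	0xdead0001u
/* Hypothetical: bit 0 set means the host found its TSC unreliable. */
#define SKETCH_TSC_UNSTABLE_BIT		(1ull << 0)

/* Stand-in for an MSR read that the host would emulate. */
typedef uint64_t (*rdmsr_fn)(uint32_t msr);

/* Minimal guest-side state: may the guest keep using the TSC? */
struct guest_clocksource {
	bool tsc_usable;
};

/*
 * Guest watchdog tick: instead of comparing the TSC against a second
 * clocksource, ask the host for its verdict. Once the host reports
 * trouble, the TSC stays marked unusable.
 */
static void watchdog_tick(struct guest_clocksource *cs, rdmsr_fn rdmsr)
{
	uint64_t status;

	assert(cs && rdmsr);
	status = rdmsr(SKETCH_MSR_HOST_TSC_STATUS);
	if (status & SKETCH_TSC_UNSTABLE_BIT)
		cs->tsc_usable = false;
}

/* Host-side stand-ins for the two possible answers. */
static uint64_t host_reports_stable(uint32_t msr)
{
	(void)msr;
	return 0;
}

static uint64_t host_reports_unstable(uint32_t msr)
{
	(void)msr;
	return SKETCH_TSC_UNSTABLE_BIT;
}
```

The guest side reduces to one read and one bit test per watchdog
period. All detection logic stays on the host, which already runs the
TSC sanity check.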

Thoughts?

Thanks,

	tglx