linux-kernel - Re: [PATCH] raise tsc clocksource rating

Message-ID: <20071030071435.GA17074@elte.hu>
Date:	Tue, 30 Oct 2007 08:14:35 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Dan Hecht <dhecht@...are.com>
Cc:	Zachary Amsden <zach@...are.com>,
	Glauber de Oliveira Costa <gcosta@...hat.com>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	rusty@...tcorp.com.au, jeremy@...p.org, --cc@...hat.com,
	avi@...amnet.com, kvm-devel@...ts.sourceforge.net,
	Glauber de Oliveira Costa <glauber@....localdomain>,
	Garrett Smith <garrett@...are.com>
Subject: Re: [PATCH] raise tsc clocksource rating

* Dan Hecht <dhecht@...are.com> wrote:

>> but if there's a perfect TSC available (there is such hardware) then 
>> the TSC _is_ the best clocksource. Paravirt now turns it off 
>> unconditionally in essence.
>
> Not really.  In the case hardware TSC is perfect, the paravirt time 
> counter can be implemented directly in terms of hardware TSC; there is 
> no loss in optimization.  This is done transparently.  And virtual TSC 
> can be implemented this way too.

Of course if you duplicate all (or part) of the TSC clocksource driver 
in the paravirt guest code then the "paravirt clocksource" is at least 
as good as the TSC. But that argument is playing word-games, _of course_ 
if you use the same (or similar) code it's at least as good. The real 
question is about clocksources that communicate out to the hypervisor, and 
hence have higher overhead than a native, TSC based clocksource - and 
clocksources that use the TSC in a broken way.

> The real improvement that a paravirt clocksource offers over the TSC 
> clocksource is that the guest does not need to measure the TSC 
> frequency itself against some other constant frequency source (which 
> is problematic on a virtual machine). [...]

hey, you need not tell me, i've implemented a hyper-clocksource driver 
myself. But calibration is a boot only issue and there's no reason why 
calibration _has_ to be fragile. For example we could easily extend the 
TSC clocksource driver to not calibrate in the guest but take 
calibration information from the host. It's in essence a trivial and 
obvious extension to calibration. That way we get the highest possible 
performance _and_ we share much of the clocksource driver with the host.
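
To make the "take calibration information from the host" idea concrete, 
here is a rough userspace sketch, not actual kernel code. It assumes the 
host hands the guest a TSC frequency in kHz through some interface (the 
struct host_tsc_calib below is hypothetical, as are the function names), 
and the guest derives the usual mult/shift scaling pair from it instead 
of measuring the TSC against another timer at boot:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: calibration data a host could pass to the guest. */
struct host_tsc_calib {
	uint32_t tsc_khz;	/* TSC frequency as measured by the host, in kHz */
};

/* Fixed-point shift used for the cycles -> ns scaling (an assumed value). */
#define TSC_SHIFT 22

/*
 * Compute mult such that ns = (cycles * mult) >> shift.
 * One cycle lasts 1000000/khz nanoseconds, so mult is that ratio
 * scaled by 2^shift and rounded to nearest.
 */
static uint32_t khz_to_mult(uint32_t khz, uint32_t shift)
{
	uint64_t tmp;

	assert(khz != 0);
	tmp = (uint64_t)1000000 << shift;
	tmp += khz / 2;
	return (uint32_t)(tmp / khz);
}

/*
 * Convert a TSC cycle delta to nanoseconds. The 64-bit product can
 * overflow for very large deltas, so this is only meant for the short
 * intervals between timekeeping updates.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/* Guest-side setup: no calibration loop, just use the host's number. */
static uint32_t guest_tsc_mult(const struct host_tsc_calib *calib)
{
	return khz_to_mult(calib->tsc_khz, TSC_SHIFT);
}
```

With a host-reported 2 GHz TSC (tsc_khz = 2000000), a delta of 2000 
cycles converts to 1000 ns. The readout path stays a plain TSC read 
plus a multiply and a shift, with no trip out to the hypervisor.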

also, the way the TSC is used by guests like Xen is fundamentally 
fragile on SMP. So i have a good reason to distrust the approach of 
hypervisors to timekeeping. The maintenance problem to me is that 
everyone in the paravirt space is busy coding away in their own (often 
broken) direction, replicating the essence of the TSC clocksource driver 
4 times over again and again, with subtle bugs in each variant, even in 
cases where the TSC readout can be trusted perfectly well. 
"Consolidation" and "sharing code" is not a particularly strong point of 
the paravirt projects ;-) (ok, KVM is a notable exception there.)

anyway, i do agree that this patch is wrong currently, mainly due to TSC 
calibration not being reliable in guest-space at the moment - but the 
whole concept of putting a separate clocksource driver into each 
paravirt guest, even in the case where the TSC is perfect, is madness. 
That code, once the hardware gets sane (and there are good signs for 
that), and once calibration can be passed from host to guest reliably, 
_will_ be consolidated, because it makes perfect technical sense.

	Ingo