Message-ID: <48DC1338.6050107@goop.org>
Date:	Thu, 25 Sep 2008 15:39:52 -0700
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Ingo Molnar <mingo@...e.hu>, Martin Bligh <mbligh@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Martin Bligh <mbligh@...igh.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	prasad@...ux.vnet.ibm.com,
	Mathieu Desnoyers <compudj@...stal.dyndns.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	David Wilder <dwilder@...ibm.com>, hch@....de,
	Tom Zanussi <zanussi@...cast.net>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer

Linus Torvalds wrote:
> The reason I say "and no" is that it's not technically really possible to 
> atomically give the exact TSC at which the frequency change took place. We 
> just don't have the information, and I doubt we will ever have it.
>   

Well, you don't need the tsc at the precise moment of the frequency
change.  You just need to emit the current tsc+frequency+wallclock time
before you emit any more delta records after the frequency change.  You
can't fetch all those values instantaneously, but you can get close.
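
As a rough illustration of that ordering rule (not code from any patch in this thread), the writer side could look something like the sketch below. The record layout, the names struct record, struct stream, note_freq_change() and emit_event(), and the fixed-size buffer are all made up for the example; the caller is assumed to have sampled tsc, frequency and wallclock as close together as it can.

```c
#include <stdint.h>
#include <stddef.h>

#define STREAM_CAP 64                /* hypothetical buffer size */

enum { REC_DELTA = 0, REC_TIME_SYNC = 1 };

/* Hypothetical record: absolute tsc for a sync, tsc delta otherwise. */
struct record {
    int      type;
    uint64_t tsc;
    uint32_t tsc_khz;                /* only meaningful for REC_TIME_SYNC */
    uint64_t wall_ns;                /* only meaningful for REC_TIME_SYNC */
};

struct stream {
    struct record buf[STREAM_CAP];
    size_t   len;
    uint64_t last_tsc;
    int      need_sync;              /* set when the frequency changed */
};

/* Called from wherever the frequency change is noticed. */
static void note_freq_change(struct stream *s)
{
    s->need_sync = 1;
}

/*
 * Emit one event. If a frequency change is pending, a sync record
 * carrying tsc+frequency+wallclock goes out first, so no delta record
 * is ever interpreted against a stale frequency.
 */
static void emit_event(struct stream *s, uint64_t tsc,
                       uint32_t tsc_khz, uint64_t wall_ns)
{
    if (s->need_sync && s->len < STREAM_CAP) {
        s->buf[s->len++] =
            (struct record){ REC_TIME_SYNC, tsc, tsc_khz, wall_ns };
        s->last_tsc = tsc;
        s->need_sync = 0;
    }
    if (s->len < STREAM_CAP) {
        s->buf[s->len++] =
            (struct record){ REC_DELTA, tsc - s->last_tsc, 0, 0 };
        s->last_tsc = tsc;
    }
}
```

The sync record does not claim to mark the instant of the frequency change; it only guarantees that every later delta is measured from a point whose tsc, frequency and wallclock are all known.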


> As such, there is no point in trying to make it a low-level special op, 
> because we'd _still_ end up being totally equivalent with just doing as 
> regular trace-event, with a regular TSC field, and then just fill the data 
> field with the new frequency.
>
> But yes, I do think we'd need to have that as a trace packet type. I 
> thought I even said so in my RFC for packet types. Ahh, it was in the 
> follow-up:
>
>   
>> I guess I should perhaps have put the TSC frequency in there in that "case 
>> 2" thing too. Maybe that should be in "data" (in kHz) and tv_sec/tv_nsec 
>> should be in array[0..1], and the time sync packet would be 24 bytes.
>>     
>
> but yes, we obviously need the frequency in order to calculate some kind 
> of wall-clock time (it doesn't _have_ to be in the same packet type as the 
> thing that tries to sync with a real clock, but it makes sense for it to 
> be there).
>   

Yeah.  If you ever mention wallclock time in the event stream, you have
to tie it to your local timebase (tsc+frequency) to make the whole thing
fit together.
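
To make "tie it to your local timebase" concrete, here is a minimal reader-side sketch. It assumes a sync record that carries the tsc, the tsc frequency in kHz (as in the quoted RFC follow-up) and the wallclock time, here flattened to nanoseconds for brevity. struct time_sync and tsc_to_wall_ns() are names invented for this example.

```c
#include <stdint.h>

/* Hypothetical sync record: one point where all three values are known. */
struct time_sync {
    uint64_t tsc;        /* tsc sampled at the sync point */
    uint32_t tsc_khz;    /* tsc cycles per millisecond, must be nonzero */
    uint64_t wall_ns;    /* wallclock at the sync point, in ns */
};

/*
 * Convert a later tsc reading to wallclock ns, valid only while the
 * frequency in the sync record still holds. The division is split into
 * whole milliseconds plus a remainder so that the intermediate product
 * cannot overflow 64 bits for large deltas.
 */
static uint64_t tsc_to_wall_ns(const struct time_sync *ts, uint64_t tsc)
{
    uint64_t delta = tsc - ts->tsc;
    uint64_t whole_ms = delta / ts->tsc_khz;
    uint64_t rem = delta % ts->tsc_khz;

    return ts->wall_ns + whole_ms * 1000000u
                       + rem * 1000000u / ts->tsc_khz;
}
```

A consumer would apply this with the most recent sync record preceding each event, which is why a fresh sync has to appear after every frequency change.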

> That said, if people think they can do a good job of ns conversion, I'll 
> stop arguing. Quite frankly, I think people are wrong about that, and 
> quite frankly, I think that anybody who looks even for one second at those 
> "alternate" sched_clock() implementations should realize that they aren't 
> suitable, but whatever. I'm not writing the code, I can only try to 
> convince people to not add the insane call-chains we have now.

Yeah.  Unfortunately, in the virtual case - unless you're virtualizing
the tsc itself, which is horrible - you can't really control or measure
how the tsc is going to behave, because it's all under the hypervisor's
control.  A "cpu" could be migrated between different physical cpus, the
whole machine could be migrated between hosts, or suspended, etc, making
it very hard to use the naked tsc.  In that case the only real option is
to use a hypervisor-supplied timebase (which for Xen and KVM is a
tsc-based scheme exactly like we've been discussing, except the
hypervisor provides the tsc timing parameters).

arch/x86/kernel/pvclock.c does the tsc to ns conversion with just adds
and multiplies, but unfortunately it can't be expressed in C because it
uses the extra precision the x86 gives for multiplies.
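
For readers who want to see the shape of that arithmetic, the sketch below shows a shift followed by a 64x32-bit multiply whose upper bits are kept. It is not the kernel's code: it leans on the non-standard unsigned __int128 extension offered by GCC and Clang on 64-bit targets to get the wide product that plain C lacks, and the names scale_delta, mul_frac and shift are assumptions made for this example.

```c
#include <stdint.h>

/*
 * Scale a tsc delta to nanoseconds as
 *     ns = ((delta << shift) * mul_frac) >> 32
 * where mul_frac is a 32-bit binary fraction (0x80000000 means 0.5)
 * and a negative shift means a right shift. The 96-bit intermediate
 * product is what needs the extra multiply precision.
 */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Non-standard: unsigned __int128 is a GCC/Clang extension. */
    return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}
```

With a 2 GHz tsc, for instance, each cycle is 0.5 ns, so mul_frac = 0x80000000 with shift = 0 turns 1000 cycles into 500 ns. The hypervisor-supplied timing parameters would play the role of mul_frac and shift.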

    J