linux-kernel - Re: [RFC PATCH 1/3] Unified trace buffer


Message-ID: <20080925201211.GA1878@elte.hu>
Date:	Thu, 25 Sep 2008 22:12:11 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Martin Bligh <mbligh@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Martin Bligh <mbligh@...igh.org>, linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	prasad@...ux.vnet.ibm.com,
	Mathieu Desnoyers <compudj@...stal.dyndns.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	David Wilder <dwilder@...ibm.com>, hch@....de,
	Tom Zanussi <zanussi@...cast.net>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer

* Ingo Molnar <mingo@...e.hu> wrote:

> firstly, for the sake of full disclosure, the very first versions of 
> the latency tracer (which, through hundreds of revisions, morphed into 
> ftrace), used raw TSC timestamps.
> 
> I stuck to that simple design for a _long_ time because i shared your 
> exact views about robustness and simplicity. But it was pure utter 
> nightmare to get the timings right after the fact, and i got a _lot_ 
> of complaints about the quality of timings, and i could never _trust_ 
> the timings myself for certain types of analysis.
> 
> So i eventually went to the scheduler clock and never looked back.
> 
> So i've been there, i've done that. In fact i briefly tried to use the 
> _GTOD_ clock for tracing - that was utter nightmare as well, because 
> the scale and breadth of the GTOD code is staggering.

heh, and i even have a link to a latency tracing patch from 2005 that is 
still alive and proves it:

   http://people.redhat.com/mingo/latency-tracing-patches/patches/latency-tracing.patch

(don't look at the quality of that code too much)

It has this line for timestamp generation:

+       timestamp = get_cycles();

i.e. we used the raw TSC, we used RDTSC straight away, and we used that 
for _years_, literally.
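
To illustrate why raw cycle counts can be hard to turn into trustworthy timings after the fact, here is a minimal sketch. It rests on an assumption this mail does not state: that the counter's tick rate changed during the trace (for example through CPU frequency scaling) while the post-processing step assumed a single rate. The names cycles_to_ns(), struct segment, true_ns() and naive_ns() are hypothetical and are not taken from the latency tracer or ftrace.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a cycle delta to nanoseconds, assuming the
 * counter ticked at a constant rate of `khz` for the whole interval.
 * ns = cycles * 1e6 / khz
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t khz)
{
	return cycles * 1000000ULL / khz;
}

/* One stretch of a trace: cycles counted while the counter ran at `khz`. */
struct segment {
	uint64_t cycles;
	uint64_t khz;
};

/* Elapsed time when every segment is converted at its own rate. */
static uint64_t true_ns(const struct segment *seg, int n)
{
	uint64_t ns = 0;
	int i;

	for (i = 0; i < n; i++)
		ns += cycles_to_ns(seg[i].cycles, seg[i].khz);
	return ns;
}

/*
 * Elapsed time when post-processing only has the raw cycle total and
 * assumes one rate for the whole trace.
 */
static uint64_t naive_ns(const struct segment *seg, int n, uint64_t assumed_khz)
{
	uint64_t cycles = 0;
	int i;

	for (i = 0; i < n; i++)
		cycles += seg[i].cycles;
	return cycles_to_ns(cycles, assumed_khz);
}
```

With one segment of 2,000,000 cycles at 2,000,000 kHz followed by one of 1,000,000 cycles at 1,000,000 kHz, each segment lasts 1 ms, so the real elapsed time is 2 ms. Converting the 3,000,000-cycle total at an assumed 2,000,000 kHz gives 1.5 ms instead, which is the kind of error that cannot be repaired later unless the rate changes were recorded too.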

So i can tell you my direct experience with it: i had far more problems 
with the tracer due to inexact timings and traces that i could not 
depend on, than i had problems with sched_clock() locking up or 
crashing.

Far more people complained about the accuracy of timings than about 
performance or about the ability (or inability) to stream gigs of 
tracing data to user-space.

It was a very striking difference:

  - every second person who used the tracer observed that the timings 
    looked odd at places.

  - only once every 6 months did someone ask whether he could save 
    gigabytes of trace data.

For years i maintained a tracer with TSC timestamps, and for years i 
maintained another tracer that used sched_clock(). Exact timings are a 
feature most people are willing to spend extra cycles on.

You seem to dismiss that angle by calling my arguments bullshit, but i 
don't know on what basis you dismiss it. Sure, a feature and extra 
complexity _always_ has a robustness cost. If your argument is that we 
should move cpu_clock() to assembly to make it more dependable - i'm all 
for it.

	Ingo