Message-ID: <alpine.DEB.1.10.0809251332540.29802@gandalf.stny.rr.com>
Date: Thu, 25 Sep 2008 13:39:55 -0400 (EDT)
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Peter Zijlstra <peterz@...radead.org>,
Martin Bligh <mbligh@...igh.org>,
Martin Bligh <mbligh@...gle.com>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
prasad@...ux.vnet.ibm.com,
Mathieu Desnoyers <compudj@...stal.dyndns.org>,
"Frank Ch. Eigler" <fche@...hat.com>,
David Wilder <dwilder@...ibm.com>, hch@....de,
Tom Zanussi <zanussi@...cast.net>,
Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer
On Thu, 25 Sep 2008, Linus Torvalds wrote:
>
>
> On Thu, 25 Sep 2008, Steven Rostedt wrote:
> >
> > If we do not normalize, then we must come up with yet another generic way to read
> > the CPU clock for all archs. And then we also need to come up with another
> > generic way to normalize it later for output.
>
> Why would any of this be "generic"?
Generic as in: it could be implemented in architecture-dependent ways, but
with a "generic" interface. IOW, I don't want the trace to be dependent on
any arch. ftrace already runs on x86, ppc, sparc64, mips, arm, sh, and
more.
>
> Quite the reverse. It should be as trace-buffer specific as possible, so
> that we do *not* share any code or any constraints with other people.
>
> Just do rdtsc at first, and make it depend on x86. If the thing is made
> simple enough, it will be a couple of lines of code for architectures to
> read their own timestamp counters.
I could do a HAVE_RING_BUFFER_TIMESTAMP config option for archs that
implement it, and just use something dumb for those that don't. For now
I'll keep to sched_clock, just because it makes it easy for me. But with
the wrappers, it should be easy to change later.
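
A minimal userspace sketch of such a wrapper follows. The name
ring_buffer_time_stamp() is hypothetical, and the way
HAVE_RING_BUFFER_TIMESTAMP gets defined here (by compiler checks rather than
Kconfig) is an assumption for illustration only. The fallback uses
clock_gettime() as a userspace stand-in for "something dumb"; in the kernel
that role would be played by sched_clock.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Assumption: an arch that can read its own counter cheaply defines
 * HAVE_RING_BUFFER_TIMESTAMP. Here that is simulated for x86 builds
 * with GCC or Clang; in the kernel it would come from Kconfig.
 */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define HAVE_RING_BUFFER_TIMESTAMP 1
#include <x86intrin.h>
#endif

#ifdef HAVE_RING_BUFFER_TIMESTAMP
/* Arch-specific path: raw, un-normalized TSC cycles. */
static inline uint64_t ring_buffer_time_stamp(void)
{
	return __rdtsc();
}
#else
/* Fallback path: a monotonic clock, returned in nanoseconds. */
static inline uint64_t ring_buffer_time_stamp(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif
```

Callers only ever see ring_buffer_time_stamp(), so swapping the source later
touches one place. Note that the two paths return different units (cycles
versus nanoseconds), so whoever reads the trace has to know which one was
recorded.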
>
> And since the normalization is then no longer in the critical part, _that_
> can be architecture-independent, but obviously still trace-specific. You
> need to know the frequency, and that involves having frequency events in
> the trace if it changes, but if you don't see any frequency events you
> just take "current frequency".
The one thing that seemed most apparent to me from talking to people at LPC
is that they want a simple ring buffer API. If every tracer that uses this
must come up with its own timekeeping management, I don't think it will be
used at all (except by those who are maintaining tracers now).
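
For scale, the parse-time normalization described in the quoted text above
can be sketched roughly as below. Every name here (struct trace_event,
TRACE_FREQ_CHANGE, normalize_trace) is hypothetical, and initial_freq_hz is
an assumed parameter: with no frequency events in the trace it is simply the
current frequency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

enum trace_type { TRACE_DATA, TRACE_FREQ_CHANGE };

struct trace_event {
	enum trace_type type;
	uint64_t cycles;	/* raw counter value at record time */
	uint64_t freq_hz;	/* new frequency; TRACE_FREQ_CHANGE only */
};

/*
 * Convert raw cycle stamps to nanoseconds since the first event.
 * A frequency event takes effect for the interval that follows it.
 */
static void normalize_trace(const struct trace_event *ev, size_t n,
			    uint64_t initial_freq_hz, uint64_t *out_ns)
{
	uint64_t freq = initial_freq_hz;
	uint64_t last = 0, ns = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t delta = i ? ev[i].cycles - last : 0;

		assert(freq != 0);
		/* Split the division so delta * NSEC_PER_SEC cannot overflow. */
		ns += delta / freq * NSEC_PER_SEC +
		      delta % freq * NSEC_PER_SEC / freq;
		out_ns[i] = ns;
		last = ev[i].cycles;
		if (ev[i].type == TRACE_FREQ_CHANGE)
			freq = ev[i].freq_hz;
	}
}
```

Each interval is truncated to whole nanoseconds, so a long trace accumulates
a small rounding error. If something like this lived next to the ring buffer
rather than in each tracer, the API could stay simple for users.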
>
> And doing it at trace parse time, we can some day enable a boot trace that
> actually WORKS. Have you looked at the timestamp events we get from
> "sched_clock()" in early bootup? They show up in the kernel logs when you
> have CONFIG_PRINTK_TIME. And they are totally and utterly broken and
> _useless_ for the early stages right now. And they shouldn't have to be
> that way.
>
> Yeah, we'll never be able to trace stuff that happens really early
> (tracing will obviously always need kernel page tables and some really
> basic stuff working), but we should be able to trace through things like
> TSC calibration for boot time analysis. It wasn't that long ago that we
> had the whole discussion about TSC calibration taking 200ms. Or the early
> ACPI code. And get meaningful data.
My logdev code has a define option to use bootmem for its buffers, and it
also uses an atomic counter to try to keep things in order. Heck, at early
boot, the events happen in order anyway, since it is still a single CPU
system.
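
The atomic counter idea, sketched in userspace C11. This is not the logdev
code; struct boot_event, event_seq and record_event() are names made up for
illustration. Each event takes the next sequence number, so ordering can be
recovered even when no usable clock exists yet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Global sequence counter; static storage starts it at zero. */
static atomic_uint_fast64_t event_seq;

struct boot_event {
	uint64_t seq;		/* global order of this event */
	const char *msg;	/* payload, kept trivial for the sketch */
};

/* Stamp an event with the next sequence number and store its payload. */
static void record_event(struct boot_event *slot, const char *msg)
{
	assert(slot != NULL);
	/* fetch-and-add hands every caller a unique, increasing value */
	slot->seq = atomic_fetch_add(&event_seq, 1);
	slot->msg = msg;
}
```

Sorting by seq gives back the order in which the events took their numbers,
on any number of CPUs. It says nothing about how much time passed between
them.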
-- Steve