Message-ID: <alpine.LFD.1.10.0809241458270.3265@nehalem.linux-foundation.org>
Date: Wed, 24 Sep 2008 15:28:40 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mathieu Desnoyers <compudj@...stal.dyndns.org>
cc: Martin Bligh <mbligh@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
prasad@...ux.vnet.ibm.com, "Frank Ch. Eigler" <fche@...hat.com>,
David Wilder <dwilder@...ibm.com>, hch@....de,
Tom Zanussi <zanussi@...cast.net>,
Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer
On Wed, 24 Sep 2008, Mathieu Desnoyers wrote:
>
> The reason why Martin did use only a 27 bits TSC in ktrace was that they
> were statically limited to 32 event types.
Well, I actually think we could do the same - for the "internal" types.
So why not do something like 4-5 bits for the basic type information, and
then one of those cases is a "freeform" thing, and the others are reserved
for other uses.
So a trace entry header could easily look something like
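struct trace_entry {
	u32 tsc_delta:27,	/* low 27 bits of the TSC delta */
	    type:5;		/* internal entry type */
	u32 data;		/* meaning depends on type */
	u64 array[];		/* optional payload, 8-byte units */
};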
and then depending on that 5-bit type, the "data" field in the header
means different things, and the size of the trace_entry also is different.
So it could be something like
- case 0: EndOfPage marker
(data is ignored)
size = 8
- case 1: TSCExtend marker
data = extended TSC (bits 27..58, counting from bit 0)
size = 8
- case 2: TimeStamp marker
data = tv_nsec
array[0] = tv_sec
size = 16
- case 3: LargeBinaryBlob marker
data = 32-bit length of binary data
array[0] = 64-bit pointer to binary blob
array[1] = 64-bit pointer to "free" function
size = 24
- case 4: SmallBinaryBlob marker
data = inline length in bytes, must be < 4096
array[0..(len+7)/8] = inline data, padded
size = (len+15) & ~7
- case 5: AsciiFormat marker
data = number of arguments
array[0] = 64-bit pointer to static const format string
array[1..arg] = argument values
size = 8*(2+arg)
...
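As a rough sketch of how a reader could walk such a buffer, the size rules above fit in one switch on the type field. This is only an illustration: the enum names and trace_entry_size() are made up here, the struct uses the userspace uint32_t/uint64_t spellings of u32/u64, and the bit-field layout is assumed to be what GCC produces on a common 64-bit ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the type codes listed above. */
enum trace_type {
	TRACE_END_OF_PAGE = 0,
	TRACE_TSC_EXTEND  = 1,
	TRACE_TIMESTAMP   = 2,
	TRACE_LARGE_BLOB  = 3,
	TRACE_SMALL_BLOB  = 4,
	TRACE_ASCII_FMT   = 5,
};

struct trace_entry {
	uint32_t tsc_delta:27,
		 type:5;
	uint32_t data;
	uint64_t array[];
};

/* The fixed header is assumed to be exactly 8 bytes. */
static_assert(sizeof(struct trace_entry) == 8, "header must be 8 bytes");

/* Total entry size in bytes, or 0 for a type this sketch does not know. */
static size_t trace_entry_size(const struct trace_entry *e)
{
	switch (e->type) {
	case TRACE_END_OF_PAGE:
	case TRACE_TSC_EXTEND:
		return 8;			/* header only */
	case TRACE_TIMESTAMP:
		return 16;			/* header + tv_sec */
	case TRACE_LARGE_BLOB:
		return 24;			/* header + two pointers */
	case TRACE_SMALL_BLOB:
		/* header + inline bytes, rounded up to 8 */
		return ((size_t)e->data + 15) & ~(size_t)7;
	case TRACE_ASCII_FMT:
		/* header + format pointer + one word per argument */
		return 8 * (2 + (size_t)e->data);
	default:
		return 0;
	}
}
```

A reader would advance by trace_entry_size() bytes per entry and stop at an EndOfPage marker or an unknown type.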
i.e. we use a few bits for "trace _internal_ type fields", and then for a
few of those types we have internal meanings, and the other types just mean
that the user can fill in the data itself.
IOW, you _could_ have an interface like
ascii_marker_2(ringbuffer,
"Reading sector %lu-%lu",
sector, sector+nsec);
and what it would create would be a fairly small trace packet that looks
something like
.type = 5,
.tsc_delta = ...,
.data = 2,
.array[0] = (const char *) "Reading sector %lu-%lu",
.array[1] = xx,
.array[2] = yy
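To make the packet layout concrete, here is a minimal userspace sketch of what such a writer might do. Everything beyond the ascii_marker_2() name and the field values is an assumption: struct ringbuffer is a flat stand-in with no wrap-around or locking, the return value is invented, and timestamping is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_ASCII_FMT 5	/* "case 5" in the list above */

struct trace_entry {
	uint32_t tsc_delta:27,
		 type:5;
	uint32_t data;
	uint64_t array[];
};

static_assert(sizeof(struct trace_entry) == 8, "header must be 8 bytes");

/* Hypothetical stand-in for a ring buffer: flat, no wrap-around, no locking. */
struct ringbuffer {
	uint64_t words[64];
	size_t head;		/* index of the next free 64-bit word */
};

/* Returns 0 on success, -1 if the entry does not fit. */
static int ascii_marker_2(struct ringbuffer *rb, const char *fmt,
			  unsigned long a, unsigned long b)
{
	const size_t nwords = 4;	/* header, format pointer, 2 arguments */
	struct trace_entry hdr = {
		.tsc_delta = 0,		/* timestamping is left out of this sketch */
		.type = TRACE_ASCII_FMT,
		.data = 2,		/* number of arguments */
	};
	uint64_t *slot;

	if (rb->head + nwords > sizeof(rb->words) / sizeof(rb->words[0]))
		return -1;

	slot = &rb->words[rb->head];
	memcpy(&slot[0], &hdr, sizeof(hdr));	/* 8-byte header */
	slot[1] = (uint64_t)(uintptr_t)fmt;	/* pointer only, the string is not copied */
	slot[2] = a;
	slot[3] = b;
	rb->head += nwords;
	return 0;
}
```

The whole entry is 32 bytes, matching size = 8*(2+arg) with arg = 2, and no formatting work happens on the write path.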
and you would not actually print it out as ASCII until somebody read it
from the kernel (and any "binary" interface would get the string as a
string, not as a pointer, because the pointer is obviously meaningless
outside the kernel).
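The read side could then look roughly like the following sketch, which expands a type-5 entry into text only when asked. The function name render_ascii_entry() is hypothetical, the entry is assumed to sit in an array of 64-bit words, arguments are assumed to be unsigned long values matching %lu-style conversions, and only one to three arguments are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct trace_entry {
	uint32_t tsc_delta:27,
		 type:5;
	uint32_t data;
	uint64_t array[];
};

static_assert(sizeof(struct trace_entry) == 8, "header must be 8 bytes");

/*
 * Render one AsciiFormat (type 5) entry starting at words[0] into out.
 * Returns the snprintf() result, or -1 for anything this sketch
 * does not handle.
 */
static int render_ascii_entry(const uint64_t *words, char *out, size_t outlen)
{
	struct trace_entry hdr;
	const char *fmt;

	memcpy(&hdr, words, sizeof(hdr));
	if (hdr.type != 5)
		return -1;

	/* words[1] holds the pointer to the static format string. */
	fmt = (const char *)(uintptr_t)words[1];

	switch (hdr.data) {
	case 1:
		return snprintf(out, outlen, fmt, (unsigned long)words[2]);
	case 2:
		return snprintf(out, outlen, fmt, (unsigned long)words[2],
				(unsigned long)words[3]);
	case 3:
		return snprintf(out, outlen, fmt, (unsigned long)words[2],
				(unsigned long)words[3],
				(unsigned long)words[4]);
	default:
		return -1;
	}
}
```

A binary interface would instead copy out the bytes of the format string itself, since the pointer value means nothing to the consumer.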
Also note how you'd literally just have a single copy of the string,
because the rule would be that a trace user must use a static string, not
some generated one that can go away (module unloading would need to be
aware of any trace buffer entries, of course - perhaps by just disallowing
unloading while trace buffers are active).
And note! Everything above is meant as an example of something that
_could_ work. I do like the notion of putting pointers to strings in the
markers, rather than having some odd magic numerical meaning that user
space has to just magically know that "event type 56 for ring buffer type
171 means that there are two words that mean 'sector' and 'end-sector'
respectively".
It's still meant more as an RFC, but I think it could work.
Linus