linux-kernel - Re: [RFC PATCH 1/3] Unified trace buffer

Message-ID: <33307c790809240949i3026170i8f9ac1d67a0fcf00@mail.gmail.com>
Date:	Wed, 24 Sep 2008 09:49:25 -0700
From:	"Martin Bligh" <mbligh@...gle.com>
To:	"Peter Zijlstra" <peterz@...radead.org>
Cc:	"Steven Rostedt" <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, "Ingo Molnar" <mingo@...e.hu>,
	"Thomas Gleixner" <tglx@...utronix.de>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	prasad@...ux.vnet.ibm.com,
	"Linus Torvalds" <torvalds@...ux-foundation.org>,
	"Mathieu Desnoyers" <compudj@...stal.dyndns.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	"David Wilder" <dwilder@...ibm.com>, hch@....de,
	"Tom Zanussi" <zanussi@...cast.net>,
	"Steven Rostedt" <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer

>> I'm not sure why this is any harder to deal with in write, than it is
>> in reserve? We should be able to make reserve handle this just
>> as well?
>
> No, imagine the mentioned case where we're straddling a page boundary.
>
> A----|   |----B
>    ^------|
>
> So when we reserve we get a pointer into page A, but our reserve length
> will run over into page B. A write() method will know how to check for
> this and break up the memcpy to copy up-to the end of A and continue
> into B.
>
> You cannot expect the reserve/commit interface users to do this
> correctly - it would also require one to expose too much internals,
> you'd need to be able to locate page B for starters.

Can't the reserve interface just put a padding event into page A,
or otherwise mark it, and return the start of page B?
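
A minimal sketch of that idea, not taken from any posted patch: the
names (toy_buffer, toy_reserve, PAD_MARK) and the tiny page size are
assumptions made up for illustration. If the request does not fit in
the rest of page A, the remainder is marked as padding and the
reservation starts at the top of page B, so the caller always gets a
contiguous region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 64u   /* tiny "page" so the example is easy to follow */
#define NR_PAGES   2u
#define PAD_MARK   0xFFu /* hypothetical "rest of page is padding" marker */

struct toy_buffer {
	uint8_t pages[NR_PAGES][PAGE_BYTES];
	size_t page;   /* index of the page currently being filled */
	size_t offset; /* write offset within that page */
};

/*
 * Reserve len contiguous bytes. The returned region never straddles a
 * page boundary. Returns NULL if the request cannot be satisfied
 * (this toy version does not wrap around).
 */
static void *toy_reserve(struct toy_buffer *b, size_t len)
{
	void *p;

	if (len == 0 || len > PAGE_BYTES)
		return NULL;

	if (b->offset + len > PAGE_BYTES) {
		if (b->page + 1 >= NR_PAGES)
			return NULL;
		/* Mark the unused tail of page A so a reader can skip it. */
		memset(b->pages[b->page] + b->offset, PAD_MARK,
		       PAGE_BYTES - b->offset);
		b->page++;
		b->offset = 0;
	}

	p = b->pages[b->page] + b->offset;
	b->offset += len;
	return p;
}
```

The cost is the wasted tail of page A, and the reader has to recognise
the padding marker, which is one reason the buffer would need at least
a minimal notion of an event.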

>> If you use write rather than reserve, you have to copy all the data
>> twice for every event.
>
> Well, once. I'm not seeing where the second copy comes from.

Depends how you count ;-) One more time than you would have to
with reserve - with reserve, the temporary packed structure never
exists.

>> > On top of that foundation build an eventbuffer, which knows about
>> > encoding/decoding/printing events.
>> >
>> > This too needs to be a flexible layer -
>>
>> That would be nice. However, we need to keep at least the length
>> and timestamp fields common so we can do parsing and the mergesort?
>
> And here I was thinking you guys bit encoded the event id into the
> timestamp delta :-)

+/* header plus 32-bits of event data */
+struct ktrace_entry {
+       u32 event_type:5, tsc_shifted:27;
+       u32 data;
+};

was our basic data type. So ... sort of ;-)
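
For illustration only, here is how such an entry might be filled in
from user space. The struct is the one quoted above; make_entry() and
TSC_SHIFT are hypothetical, and the exact bit-field layout is
compiler-dependent (this assumes GCC/Clang-style packing of the two
fields into one 32-bit word).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* header plus 32-bits of event data */
struct ktrace_entry {
	u32 event_type:5, tsc_shifted:27;
	u32 data;
};

#define TSC_SHIFT 10 /* hypothetical: low TSC bits dropped before storing */

/*
 * Pack an event. Only 5 bits of type and 27 bits of the shifted
 * timestamp survive, so both values are masked explicitly.
 */
static struct ktrace_entry make_entry(unsigned int type, uint64_t tsc, u32 data)
{
	struct ktrace_entry e;

	e.event_type = type & 0x1f;
	e.tsc_shifted = (u32)((tsc >> TSC_SHIFT) & 0x7ffffff);
	e.data = data;
	return e;
}
```

With 27 bits the stored timestamp wraps fairly quickly, so a reader
has to reconstruct full timestamps from context.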

>> So type would move into the body here?
>
> All of it would, basically I have no notion of an event in the
> ringbuffer API. You write $something and your read routine would need to
> be smart enough to figure it out.

If you don't have timestamps, you need domain-specific context to merge
the per-cpu buffers back together. As long as the timestamps are in a
common format across all the event-level alternatives, I guess it
doesn't matter.
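
To make the merge point concrete, a rough sketch, assuming every
event carries a timestamp that can be compared across CPUs. struct
event, struct cpu_stream and merge_streams() are invented for this
example and are not part of any proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct event {
	uint64_t ts; /* common timestamp field */
	int cpu;     /* which per-cpu buffer it came from */
};

struct cpu_stream {
	const struct event *ev; /* events, already in time order */
	size_t len;
	size_t pos; /* next unread event */
};

/*
 * Merge n time-ordered per-cpu streams into out[], oldest first.
 * Returns the number of events written (at most out_cap).
 */
static size_t merge_streams(struct cpu_stream *s, size_t n,
			    struct event *out, size_t out_cap)
{
	size_t written = 0;

	while (written < out_cap) {
		size_t best = n; /* n means "no stream has data left" */
		size_t i;

		for (i = 0; i < n; i++) {
			if (s[i].pos == s[i].len)
				continue;
			if (best == n ||
			    s[i].ev[s[i].pos].ts < s[best].ev[s[best].pos].ts)
				best = i;
		}
		if (best == n)
			break;
		out[written++] = s[best].ev[s[best].pos++];
	}
	return written;
}
```

Without the common ts field, the comparison in the inner loop has
nothing generic to compare, which is the domain-specific context
problem.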

> Another option is to start out with a fixed sized header that contains a
> length field.

That's what we discussed at KS/plumbers, and seems like the simplest
option by far to start with.
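
Roughly what I have in mind, as a sketch: a fixed-size header with a
length and a timestamp delta, followed by an opaque payload. The field
names and widths here (ev_header, len, ts_delta) are assumptions for
the example, not what was agreed anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header; len counts payload bytes only. */
struct ev_header {
	uint32_t len;
	uint32_t ts_delta;
};

/* Append one event at *used. Returns 0 on success, -1 if it won't fit. */
static int append_event(uint8_t *buf, size_t cap, size_t *used,
			uint32_t ts_delta, const void *payload, uint32_t len)
{
	struct ev_header h = { .len = len, .ts_delta = ts_delta };

	if (*used > cap || sizeof h + (size_t)len > cap - *used)
		return -1;
	memcpy(buf + *used, &h, sizeof h);
	memcpy(buf + *used + sizeof h, payload, len);
	*used += sizeof h + len;
	return 0;
}

/* Walk the buffer using only the headers; payloads stay opaque. */
static size_t count_events(const uint8_t *buf, size_t used)
{
	size_t off = 0, n = 0;

	while (off + sizeof(struct ev_header) <= used) {
		struct ev_header h;

		memcpy(&h, buf + off, sizeof h);
		if (h.len > used - off - sizeof h)
			break; /* truncated event */
		off += sizeof h + h.len;
		n++;
	}
	return n;
}
```

The walker never looks inside a payload, so the event layer on top
stays free to encode whatever it likes.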

> But the raw ringbuffer layer, the one concerned with fiddling the pages
> and writing/reading thereto need not be aware of anything else.

When you loop around the ringbuffer, you need to shift the starting "read"
pointer up to the next event, don't you? How do you do that to start on
a whole event without knowing the event size?
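
A toy version of the problem, assuming an overwriting byte ring where
each event is a 4-byte length followed by the payload. All names
(struct ring, make_room, ring_write) are made up for the example.
The point is that make_room() can only land on an event boundary
because it can read the length of the oldest event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_BYTES 32u

struct ring {
	uint8_t data[RING_BYTES];
	size_t head; /* offset of the oldest whole event ("read" pointer) */
	size_t used; /* bytes currently occupied */
};

/* Read the 4-byte length of the oldest event, handling wraparound. */
static uint32_t peek_len(const struct ring *r)
{
	uint8_t raw[sizeof(uint32_t)];
	uint32_t len;
	size_t i;

	for (i = 0; i < sizeof raw; i++)
		raw[i] = r->data[(r->head + i) % RING_BYTES];
	memcpy(&len, raw, sizeof len);
	return len;
}

/* Drop whole events from the head until need bytes are free. */
static void make_room(struct ring *r, size_t need)
{
	while (r->used > 0 && RING_BYTES - r->used < need) {
		size_t ev = sizeof(uint32_t) + peek_len(r);

		r->head = (r->head + ev) % RING_BYTES;
		r->used -= ev;
	}
}

/* Copy n bytes to the tail, one at a time so wrapping is trivial. */
static void put_bytes(struct ring *r, const void *src, size_t n)
{
	const uint8_t *p = src;
	size_t i;

	for (i = 0; i < n; i++) {
		r->data[(r->head + r->used) % RING_BYTES] = p[i];
		r->used++;
	}
}

/* Write one event, overwriting the oldest events if necessary. */
static int ring_write(struct ring *r, const void *payload, uint32_t len)
{
	size_t total = sizeof len + (size_t)len;

	if (total > RING_BYTES)
		return -1;
	make_room(r, total);
	put_bytes(r, &len, sizeof len);
	put_bytes(r, payload, len);
	return 0;
}
```

If the raw layer had no length to read, make_room() would have to
either call back into the event layer or leave the head pointing into
the middle of an event.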

> Exactly - which is why a flexible encoding layer makes sense to me -
> aside from the abstraction itself.

I like the abstraction, yes ;-) I'm just not convinced about how much we
can put in it.