lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.1.10.0809251312010.27920@gandalf.stny.rr.com>
Date:	Thu, 25 Sep 2008 13:16:42 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	prasad@...ux.vnet.ibm.com,
	Mathieu Desnoyers <compudj@...stal.dyndns.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	David Wilder <dwilder@...ibm.com>, hch@....de,
	Martin Bligh <mbligh@...gle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/2 v2] Unified trace buffer


On Thu, 25 Sep 2008, Linus Torvalds wrote:
> On Thu, 25 Sep 2008, Steven Rostedt wrote:
> > +
> > +/**
> > + * ring_buffer_event_length - return the length of the event
> > + * @event: the event to get the length of
> > + *
> > + * Note, if the event is bigger than 256 bytes, the length
> > + * can not be held in the shifted 5 bits. The length is then
> > + * added as a short (unshifted) in the body.
> 
> The comment seems stale ;)

hehe, this code is in so much flux, that I gave up on keeping the
comments up to date.

> 
> > +
> > +/**
> > + * ring_buffer_peek - peek at the next event to be read
> > + * @iter: The ring buffer iterator
> > + * @iter_next_cpu: The CPU that the next event belongs on
> > + *
> > + * This will return the event that will be read next, but does
> > + * not increment the iterator.
> > + */
> > +struct ring_buffer_event *
> > +ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
> > +{
> > +	struct ring_buffer_per_cpu *cpu_buffer;
> > +	struct ring_buffer_event *event;
> > +	u64 delta;
> > +
> > +	cpu_buffer = buffer->buffers[cpu];
> > +
> > + again:
> > +	if (ring_buffer_per_cpu_empty(cpu_buffer))
> > +		return NULL;
> > +
> > +	event = ring_buffer_head_event(cpu_buffer);
> > +
> > +	switch (event->type) {
> > +	case RB_TYPE_PADDING:
> > +		ring_buffer_inc_page(buffer, &cpu_buffer->head_page);
> > +		rb_reset_read_page(cpu_buffer);
> > +		goto again;
> > +
> > +	case RB_TYPE_TIME_EXTENT:
> > +		delta = event->data;
> > +		delta <<= TS_SHIFT;
> > +		delta += event->time_delta;
> > +		cpu_buffer->read_stamp += delta;
> > +		goto again;
> > +
> > +	case RB_TYPE_TIME_STAMP:
> > +		/* FIXME: not implemented */
> > +		goto again;
> > +
> > +	case RB_TYPE_SMALL_DATA:
> > +	case RB_TYPE_LARGE_DATA:
> > +	case RB_TYPE_STRING:
> > +		if (ts)
> > +			*ts = cpu_buffer->read_stamp + event->time_delta;
> > +		return event;
> 
> Your timestamp handling seems odd. You do it per-event, but I think it 
> should happen for all events, ie just do
> 
> 	*ts += event->time_delta;
> 
> _outside_ the case statement, and then in RB_TYPE_TIME_EXTENT you'd do 
> either
> 
>  - relative:
> 	*ts += event->data << TS_SHIFT;
> 
>  - absolute timestamp events:
> 	*ts = (event->data << TS_SHIFT) + event->time_delta;
> 
> but the bigger issue is that I think the timestamp should be relative to 
> the _previous_ event, not relative to the page start. IOW, you really 
> should accumulate them. 
> 
> IOW, the base timestamp cannot be in the cpu_buffer, it needs to be in the 
> iterator data structure, since it updates as you walk over it.
> 
> Otherwise the extended TSC format will be _horrible_. You don't want to 
> add it in front of every event in the page just because you had a pause at 
> the beginning of the page. You want to have a running update, so that you 
> only need to add it after there was a pause.

The problem with this is overwrite mode, which is the only mode ftace 
currently offers. What happens when your writer starts overwriting the 
ring buffer and there is no reader?

What happens is that the start value is gone. You do not have a way to use 
all the deltas to catch up to the remaining events.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ