lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080924165131.GA1045@Krystal>
Date:	Wed, 24 Sep 2008 12:51:31 -0400
From:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Martin Bligh <mbligh@...gle.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	prasad@...ux.vnet.ibm.com,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	David Wilder <dwilder@...ibm.com>, hch@....de,
	Tom Zanussi <zanussi@...cast.net>,
	Steven Rostedt <srostedt@...hat.com>
Subject: Re: [RFC PATCH 1/3] Unified trace buffer

* Peter Zijlstra (peterz@...radead.org) wrote:
> On Wed, 2008-09-24 at 12:13 -0400, Mathieu Desnoyers wrote:
> > * Martin Bligh (mbligh@...gle.com) wrote:
> > > Thanks for creating this so quickly ;-)
> > > 
> > > >> We can record either the fast way of reserving a part of the buffer:
> > > >>
> > > >> event = ring_buffer_lock_reserve(buffer, event_id, length, &flags);
> > > >> event->data = record_this_data;
> > > >> ring_buffer_unlock_commit(buffer, event, flags);
> > > >
> > > > This can, in generic, not work. Due to the simple fact that we might
> > > > straddle a page boundary. Therefore I think its best to limit our self
> > > > to the write interface below, so that it can handle that.
> > > 
> > > I'm not sure why this is any harder to deal with in write, than it is
> > > in reserve? We should be able to make reserve handle this just
> > > as well?
> > > 
> > > If you use write rather than reserve, you have to copy all the data
> > > twice for every event.
> > > 
> > 
> > I think we all agree that a supplementary copy is no wanted, but I think
> > this question is orthogonal to having a write wrapper. 
> 
> >  This reserve/commit mechanism
> > deals with synchronization (cli/spinlock or cmpxchg_local scheme...).
> 
> Right
> 
> > We can then use this offset to see in which page(s) we have to write.
> > This offset + len can in fact cross multiple page boundaries.
> 
> Sure
> 
> > Doing this elegantly could involve a page array that would represent the
> > buffer data :
> > 
> > struct page **buffer;
> 
> I really don't like the page array, but we can do without..
> 
> > And be given as parameter to the read() and write() methods, which would
> > deal with page-crossing.
> > 
> > e.g.
> 
> > size_t write(struct page **buffer, size_t woffset, void *data, size_t len);
> > 
> > Therefore, we could have code which writes in the buffers, without extra
> > copy, and without using vmap, in multiple writes for a single event,
> > which would deal with data alignment, e.g. :
> > 
> > size_t woffset, evsize = 0;
> > 
> > evsize += write(NULL, evsize, &var1, sizeof(var1));
> > evsize += write(NULL, evsize, &var2, sizeof(var2));
> > evsize += write(NULL, evsize, &var3, sizeof(var3));
> > 
> > woffset = reserve(..., evsize);
> > 
> > woffset += write(buffer, woffset, &var1, sizeof(var1));
> > woffset += write(buffer, woffset, &var2, sizeof(var2));
> > woffset += write(buffer, woffset, &var3, sizeof(var3));
> > 
> > commit(..., evsize);
> > 
> > Does that make sense ?
> 
> Yes, we can do the sub-write, how about:
> 
> struct ringbuffer_write_state 
> ringbuffer_write_start(struct ringbuffer *buffer, unsigned long size);
> 
> int ringbuffer_write(struct ringbuffer_write_state *state, 
>                      const void *buf, unsigned long size);
> 
> void ringbuffer_write_finish(struct ringbuffer_write_state *state);
> 
> 
> That way write_start() can do the reserve and set a local write
> iterator. write() can then do whatever, either the direct copy of break
> it up - will error on overflowing the reserved size. write_finish() will
> clean up (sti, preempt_enable etc..)
> 

Yup, that looks neat. I don't know if it's worth separating data
alignment concerns from this part of the infrastructure so it stays
simple. OTOH, embedding automatic alignment of data elements would be
easy to do here, e.g. :

struct ringbuffer_write_state 
ringbuffer_write_start(struct ringbuffer *buffer, unsigned long size);

int ringbuffer_write(struct ringbuffer_write_state *state, 
                     const void *buf,
                     unsigned long size,
                     unsigned long alignment);

#define ringbuffer_compute_size(size, alignment) \
  ringbuffer_write(NULL, NULL, size, alignment)

void ringbuffer_write_finish(struct ringbuffer_write_state *state);

So ringbuffer_compute_size could be useds to compute the total slot size
needed to write the event before doing the ringbuffer_write_start(). It
would be good to keep ringbuffer_write() mostly as a static inline so
the compiler could statically compile in much of these operations.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ