[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100804072120.GB28183@elte.hu>
Date: Wed, 4 Aug 2010 09:21:20 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Dave Chinner <david@...morbit.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Frederic Weisbecker <fweisbec@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
Steven Rostedt <rostedt@...tedt.homelinux.com>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Hellwig <hch@....de>, Li Zefan <lizf@...fujitsu.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Johannes Berg <johannes.berg@...el.com>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Tom Zanussi <tzanussi@...il.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Andi Kleen <andi@...stfloor.org>,
"H. Peter Anvin" <hpa@...or.com>,
Jeremy Fitzhardinge <jeremy@...p.org>,
"Frank Ch. Eigler" <fche@...hat.com>, Tejun Heo <htejun@...il.com>
Subject: Re: [patch 1/2] x86_64 page fault NMI-safe
* Dave Chinner <david@...morbit.com> wrote:
> On Tue, Aug 03, 2010 at 11:56:11AM -0700, Linus Torvalds wrote:
> > On Tue, Aug 3, 2010 at 10:18 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > FWIW I really utterly detest the whole concept of sub-buffers.
> >
> > I'm not quite sure why. Is it something fundamental, or just an
> > implementation issue?
> >
> > One thing that I think could easily make sense in a _lot_ of buffering
> > areas is the notion of a "continuation" buffer. We know we have cases
> > where we want to attach a lot of data to one particular event, but the
> > buffering itself is inevitably always going to have some limits on
> > atomicity etc. And quite often, the event that _generates_ the data is
> > not necessarily going to have all that data in one contiguous region,
> > and doing a scatter-gather memcpy to get it that way is not good
> > either.
> >
> > At the same time, I do _not_ believe that the kernel ring-buffer code
> > should handle pointers to sub-buffers etc, or worry about iovec-like
> > arrays of smaller ranges. So if _that_ is what you mean by "concept of
> > sub-buffers", then I agree with you.
> >
> > But what I do think might make a lot of sense is to allow buffer
> > fragments, and just teach user space to do de-fragmentation. Where it
> > would be important that the de-fragmentation really is all in user
> > space, and not really ever visible to the ring-buffer implementation
> > itself (and there would not, for example, be any guarantees that the
> > fragments would be contiguous - there could be other events in the
> > buffer in between fragments). Maybe we could even say that fragments
> > might be across different CPU ring-buffers, and user-space needs to
> > sort it out if it wants to (where "sort it out" literally would mean
> > having to sort and re-attach them in the right order, since there
> > wouldn't be any ordering between them).
> >
> > From a kernel perspective, the only thing you need for fragment
> > handling would be to have a buffer entry that just says "I'm fragment
> > number X of event ID Y". Nothing more. Everything else would be up to
> > the parser in user space to work out.
>
> Heh. For a moment there I thought you were describing the the way XFS writes
> transactions into it's log. Replace "CPU ring-buffers" with "in-core log
> buffers", "userspace parsing" with "log recovery" and "event ID" with
> "transaction ID", and the concept you describe is eerily similar. That
> includes the fact that transactions are not contiguous in the log, can
> interleave fragments between concurrent transaction commits and they can
> span multiple log buffers, too. It works pretty well for scaling concurrent
> writers....
That's certainly a good model when you have to stream into a
persistent-storage transaction log space with multiple writers.
The difference is that with instrumentation we are generally able to make
things per task or per cpu so there's no real multi-CPU 'concurrent writers'
concurrency.
You dont have that luxory/simplicity when logging to storage, of course!
Thanks,
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists