linux-kernel - Re: [patch 1/2] x86_64 page fault NMI-safe

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100804072120.GB28183@elte.hu>
Date:	Wed, 4 Aug 2010 09:21:20 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Dave Chinner <david@...morbit.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Steven Rostedt <rostedt@...tedt.homelinux.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Christoph Hellwig <hch@....de>, Li Zefan <lizf@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Johannes Berg <johannes.berg@...el.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Tom Zanussi <tzanussi@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andi Kleen <andi@...stfloor.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	"Frank Ch. Eigler" <fche@...hat.com>, Tejun Heo <htejun@...il.com>
Subject: Re: [patch 1/2] x86_64 page fault NMI-safe


* Dave Chinner <david@...morbit.com> wrote:

> On Tue, Aug 03, 2010 at 11:56:11AM -0700, Linus Torvalds wrote:
> > On Tue, Aug 3, 2010 at 10:18 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > FWIW I really utterly detest the whole concept of sub-buffers.
> > 
> > I'm not quite sure why. Is it something fundamental, or just an
> > implementation issue?
> > 
> > One thing that I think could easily make sense in a _lot_ of buffering
> > areas is the notion of a "continuation" buffer. We know we have cases
> > where we want to attach a lot of data to one particular event, but the
> > buffering itself is inevitably always going to have some limits on
> > atomicity etc. And quite often, the event that _generates_ the data is
> > not necessarily going to have all that data in one contiguous region,
> > and doing a scatter-gather memcpy to get it that way is not good
> > either.
> > 
> > At the same time, I do _not_ believe that the kernel ring-buffer code
> > should handle pointers to sub-buffers etc, or worry about iovec-like
> > arrays of smaller ranges. So if _that_ is what you mean by "concept of
> > sub-buffers", then I agree with you.
> > 
> > But what I do think might make a lot of sense is to allow buffer
> > fragments, and just teach user space to do de-fragmentation. Where it
> > would be important that the de-fragmentation really is all in user
> > space, and not really ever visible to the ring-buffer implementation
> > itself (and there would not, for example, be any guarantees that the
> > fragments would be contiguous - there could be other events in the
> > buffer in between fragments).  Maybe we could even say that fragments
> > might be across different CPU ring-buffers, and user-space needs to
> > sort it out if it wants to (where "sort it out" literally would mean
> > having to sort and re-attach them in the right order, since there
> > wouldn't be any ordering between them).
> > 
> > From a kernel perspective, the only thing you need for fragment
> > handling would be to have a buffer entry that just says "I'm fragment
> > number X of event ID Y". Nothing more. Everything else would be up to
> > the parser in user space to work out.
> 
> Heh. For a moment there I thought you were describing the the way XFS writes 
> transactions into it's log. Replace "CPU ring-buffers" with "in-core log 
> buffers", "userspace parsing" with "log recovery" and "event ID" with 
> "transaction ID", and the concept you describe is eerily similar. That 
> includes the fact that transactions are not contiguous in the log, can 
> interleave fragments between concurrent transaction commits and they can 
> span multiple log buffers, too. It works pretty well for scaling concurrent 
> writers....

That's certainly a good model when you have to stream into a 
persistent-storage transaction log space with multiple writers.

The difference is that with instrumentation we are generally able to make 
things per task or per cpu so there's no real multi-CPU 'concurrent writers' 
concurrency.

You dont have that luxory/simplicity when logging to storage, of course!

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/