linux-kernel - Re: [rfc] increase struct page size?!

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070522062452.GA29807@wotan.suse.de>
Date:	Tue, 22 May 2007 08:24:53 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	William Lee Irwin III <wli@...omorphy.com>
Cc:	Matt Mackall <mpm@...enic.com>,
	Christoph Lameter <clameter@....com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	linux-arch@...r.kernel.org
Subject: Re: [rfc] increase struct page size?!

On Mon, May 21, 2007 at 10:04:10PM -0700, William Lee Irwin III wrote:
> On Mon, May 21, 2007 at 06:39:51PM -0700, William Lee Irwin III wrote:
> >> address (virtual and physical are trivially inter-convertible), mock
> >> up something akin to what filesystems do for anonymous pages, etc.
> >> The real objection everyone's going to have is that driver writers
> >> will stain their shorts when faced with the rules for handling such
> >> things. The thing is, I'm not entirely sure who these driver writers
> >> that would have such trouble are, since the driver writers I know
> >> personally are sophisticates rather than walking disaster areas as such
> >> would imply. I suppose they may not be representative of the whole.
> 
> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > That's not the objection I would have. I would say that firstly, I
> > don't think the mem_map overhead is very significant (at any rate,
> > an allocated-on-demand metadata is not going to be any smaller if
> > you fill up on pagecache...). Secondly, I think there is merit to
> > having the same page metadata used by the major subsystems, because
> > it helps for locality of reference.
> 
> The size isn't the advantage being cited; I'd actually expect the net
> result to be larger. It's the control over the layout of the metadata
> for cache locality and even things like having enough flags, folding
> buffer_head -like affairs into the per-page metadata for filesystems
> and so reaping cache locality benefits even there (assuming it works
> out in other respects), and so on.
> 
> Passing pages between subsystems doesn't seem very significant to me.
> There isn't going to be much locality of reference, or even any
> guarantee that the subsystem gets fed a cache hot page structure. The
> subsystem being passed the page will have its own cache hot accounting
> structures to stick the information about the memory into.

Well consider the page allocator and pagecache. The page allocator
uses page metadata rather than eg. a bitmap, and it uses page list
heads for the per-cpu allocator.

If we were to instead perhaps use external bitmaps and arrays to 
keep track of pages, then the pagecache would have to go and allocate
its own structures rather than reuse the cache hot page allocator
structures.

Buffer heads might be something that would work well, but we'd still
like to be able to deallocate them without freeing the whole pagecache
(because they tend to be associated with less frequent operations like
IO). But anyway, I don't know. I'm sure there would be cases where it
works better.


> On Tue, May 22, 2007 at 03:57:03AM +0200, Nick Piggin wrote:
> > But I haven't explored the idea enough myself to know whether there
> > would be any really killer benefits to this. Delayed metadata freeing
> > via RCU without holding up the freeing of the actual page would have
> > been something, however I can do similar with speculative references
> > now (or whenever the code gets merged), which doesn't even require the
> > RCU overhead.
> 
> I'm not entirely sure what you're on about there, but it sounds
> interesting.

Heh :) Well the lockless pagecache would become basically trivial if we
could RCU-free pagecache pages, however doing that is really awful for
a number of reasons. However if you had a system where the metadata is
decoupled, you could simply RCU-free the 'struct page' (while still
immediately freeing the page itself) which would make lockless pagecache
(and potentially similar things) equally trivial.

I assumed K42 might have been into that angle.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/