[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0801101143180.20926@schroedinger.engr.sgi.com>
Date: Thu, 10 Jan 2008 11:46:56 -0800 (PST)
From: Christoph Lameter <clameter@....com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: Matt Mackall <mpm@...enic.com>,
Pekka J Enberg <penberg@...helsinki.fi>,
Ingo Molnar <mingo@...e.hu>, Hugh Dickins <hugh@...itas.com>,
Andi Kleen <andi@...stfloor.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] greatly reduce SLOB external fragmentation
On Thu, 10 Jan 2008, Linus Torvalds wrote:
> It's not even clear that a buddy allocator even for the high-order pages
> is at all the right choice. Almost nobody actually wants >64kB blocks, and
> the ones that *do* want bigger allocations tend to want *much* bigger
> ones, so it's quite possible that it could be worth it to have something
> like a three-level allocator:
Excellent! I am definitely on board with this.
> - huge pages (superpages for those crazy db people)
>
> Just a simple linked list of these things is fine, we'd never care
> about coalescing large pages together anyway.
>
> - "large pages" (on the order of ~64kB) - with *perhaps* a buddy bitmap
> setup to try to coalesce back into huge-pages, but more likely just
> admitting that you'd need something like migration to ever get back a
> hugepage that got split into large-pages.
>
> So maybe a simple bitmap allocator per huge-page for large pages. Say
> you have a 4MB huge-page, and just a 64-bit free-bitmap per huge-page
> when you split it into large pages.
>
> - slab/slub/slob for anything else, and "get_free_page()" ends up being
> just a shorthand for saying "naturally aligned kmalloc of size
> "PAGE_SIZE<<order"
>
> and maybe it would all work out ok.
Hmmm... a 3 level allocator? Basically we would have BASE_PAGE
STANDARD_PAGE and HUGE_PAGE? We could simply extend the page allocator to
have 3 pcp lists for these sizes and go from there?
Thinking about the arches this would mean
BASE_PAGE STANDARD_PAGE HUGE_PAGE
x86_64 4k 64k 2M
i386 4k 16k 4M
ia64 16k 256k 1G
?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists