Message-ID: <alpine.LFD.1.00.0801101124140.3148@woody.linux-foundation.org>
Date:	Thu, 10 Jan 2008 11:41:50 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Matt Mackall <mpm@...enic.com>
cc:	Pekka J Enberg <penberg@...helsinki.fi>,
	Christoph Lameter <clameter@....com>,
	Ingo Molnar <mingo@...e.hu>, Hugh Dickins <hugh@...itas.com>,
	Andi Kleen <andi@...stfloor.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] greatly reduce SLOB external fragmentation



On Thu, 10 Jan 2008, Matt Mackall wrote:
> 
> One idea I've been kicking around is pushing the boundary for the buddy
> allocator back a bit (to 64k, say) and using SL*B under that. The page
> allocators would call into buddy for larger than 64k (rare!) and SL*B
> otherwise. This would let us greatly improve our handling of things like
> task structs and skbs and possibly also things like 8k stacks and jumbo
> frames.

Yes, something like that may well be reasonable. It could possibly solve 
some of the issues for bigger page cache sizes too, but one issue is that 
many things actually end up having those power-of-two alignment 
constraints too - so an 8kB allocation would often still have to be 
naturally aligned, which then removes some of the freedom.

> Crazy?

It sounds like it might be worth trying out - there's just no way to know 
how well it would work. Buddy allocators sure as hell have problems too, 
no question about that. It's not like the page allocator is perfect.

It's not even clear that a buddy allocator is the right choice at all, even 
for the high-order pages. Almost nobody actually wants >64kB blocks, and 
the ones that *do* want bigger allocations tend to want *much* bigger 
ones, so it's quite possible that it could be worth it to have something 
like a three-level allocator:

 - huge pages (superpages for those crazy db people)

   Just a simple linked list of these things is fine, we'd never care 
   about coalescing large pages together anyway.

 - "large pages" (on the order of ~64kB) - with *perhaps* a buddy bitmap 
   setup to try to coalesce back into huge-pages, but more likely just 
   admitting that you'd need something like migration to ever get back a 
   hugepage that got split into large-pages.

   So maybe a simple bitmap allocator per huge-page for large pages. Say 
   you have a 4MB huge-page, and just a 64-bit free-bitmap per huge-page 
   when you split it into large pages.

 - slab/slub/slob for anything else, and "get_free_page()" ends up being 
   just a shorthand for saying "naturally aligned kmalloc of size 
   PAGE_SIZE<<order"

and maybe it would all work out ok. 
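
The middle level described above (a 4MB huge-page split into 64kB large 
pages, tracked by one 64-bit free-bitmap) might look roughly like the 
userspace sketch below. Every name in it (struct huge_page, 
huge_page_split, large_page_alloc, large_page_free, huge_page_is_whole) is 
hypothetical and invented for illustration; none of it is existing kernel 
API, and locking, per-node lists and migration are left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the example in the text: 4MB / 64kB = 64 large pages. */
#define HUGE_PAGE_SIZE  ((size_t)4 << 20)
#define LARGE_PAGE_SIZE ((size_t)64 << 10)
#define LARGE_PER_HUGE  (HUGE_PAGE_SIZE / LARGE_PAGE_SIZE)

/* Hypothetical per-huge-page state: one bit per large page, 1 == free. */
struct huge_page {
	uint64_t free_map;
	unsigned char *base;	/* start of the 4MB region */
};

/* Split a huge page: all 64 large pages start out free. */
static void huge_page_split(struct huge_page *hp, void *base)
{
	hp->base = base;
	hp->free_map = ~(uint64_t)0;
}

/* Hand out the lowest-numbered free large page, or NULL if none is left. */
static void *large_page_alloc(struct huge_page *hp)
{
	unsigned int idx = 0;

	if (hp->free_map == 0)
		return NULL;
	while (!(hp->free_map & ((uint64_t)1 << idx)))
		idx++;
	hp->free_map &= ~((uint64_t)1 << idx);
	return hp->base + (size_t)idx * LARGE_PAGE_SIZE;
}

/* Return a large page to the huge page it was carved from. */
static void large_page_free(struct huge_page *hp, void *p)
{
	size_t idx = (size_t)((unsigned char *)p - hp->base) / LARGE_PAGE_SIZE;

	hp->free_map |= (uint64_t)1 << idx;
}

/* Non-zero when every large page is free again, i.e. the huge page
 * could go back on the huge-page list without any buddy coalescing. */
static int huge_page_is_whole(const struct huge_page *hp)
{
	return hp->free_map == ~(uint64_t)0;
}
```

If the huge page itself is 4MB-aligned, every 64kB piece handed out this way 
is naturally aligned for free, and "is the whole huge page free again?" is a 
single 64-bit compare.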

			Linus