Message-ID: <20080415201734.GA25628@elte.hu>
Date: Tue, 15 Apr 2008 22:17:34 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Christoph Lameter <clameter@....com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Pekka Enberg <penberg@...helsinki.fi>,
linux-kernel@...r.kernel.org, Mel Gorman <mel@....ul.ie>,
Nick Piggin <npiggin@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com,
apw@...dowen.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [bug] SLUB + mm/slab.c boot crash in -rc9
* Christoph Lameter <clameter@....com> wrote:
> > Pretty please, could you pay more than cursory attention to this bug
> > i already spent two full days on and which is blocking the v2.6.25
> > release?
>
> Yeah trying to get to understand how exactly sparsemem works and how
> the 32 bit highmem stuff interacts with it... Sorry not code that I am
> an expert in nor the platform that I am familiar with. Code mods there
> required heavy review from multiple parties with expertise in various
> subjects.
yeah - sorry about that impatient flame. And it could still be anything
from the page allocator to bootmem - or some completely unrelated piece
of code corrupting some key data structure.
sparsemem is supposed to work roughly like this on x86 (32-bit):
- the x86 memory map comes from the BIOS via e820.
- those individual chunks of e820-enumerated memory get registered
  with mm/sparse.c's data structures via memory_present() callbacks.
  [btw., this should be renamed to register_memory_present() or
  register_sparse_range() - something less opaque.]
- there are really just 3 RAM areas that matter on this box, and the
  last one is unusable for !PAE, which leaves 2.
- there's a 256 MB PCI aperture hole at 0xf0000000.
- out of the 64 sparse memory chunks the first 60 get filled in (all
  have at least some RAM content) - the last 4 [the PCI aperture
  hole] remain !present.
- we pass in an array of 3 zones to free_area_init_nodes().
- we free the lowmem pages into the buddy allocator via the usual
  generic setup.
- we have a special loop for highmem pages in arch/x86/mm/init_32.c,
  set_highmem_pages_init(). This just goes through the PFNs one by
  one and does an explicit __free_page() on all RAM pages that are in
  the mem_map[] and which are non-reserved.
and that's it roughly.
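the section arithmetic above can be sketched in plain userspace C. This is a toy model, not the kernel code: the constants assume 64 MB sections over a 4 GB physical address space (which is what gives 64 chunks, with the 256 MB hole at 0xf0000000 covering the last 4), and the toy_* helpers, the section_present[] array and the is_reserved callback are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed constants for 32-bit x86 without PAE (64 MB sections, 4 GB of
 * physical address space). They are chosen to match the "64 chunks"
 * figure above; check the sparsemem header of the tree you are on.
 */
#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	26
#define MAX_PHYSMEM_BITS	32
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define NR_MEM_SECTIONS		(1UL << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS))
#define MAX_PFN			(1UL << (MAX_PHYSMEM_BITS - PAGE_SHIFT))

/* Hypothetical stand-in for the per-section "present" state. */
static bool section_present[NR_MEM_SECTIONS];

static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

/* Toy memory_present(): mark every section overlapping [start_pfn, end_pfn). */
static void toy_memory_present(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long nr;

	assert(start_pfn < end_pfn && end_pfn <= MAX_PFN);
	for (nr = pfn_to_section_nr(start_pfn);
	     nr <= pfn_to_section_nr(end_pfn - 1); nr++)
		section_present[nr] = true;
}

static unsigned long toy_count_present_sections(void)
{
	unsigned long nr, count = 0;

	for (nr = 0; nr < NR_MEM_SECTIONS; nr++)
		count += section_present[nr];
	return count;
}

/*
 * Toy version of the highmem loop: walk the PFNs one by one and count
 * the pages that would be handed to the buddy allocator, i.e. pages in
 * a present section that the (hypothetical) is_reserved callback does
 * not claim. is_reserved may be NULL.
 */
static unsigned long toy_count_highmem_free(unsigned long start_pfn,
					    unsigned long end_pfn,
					    bool (*is_reserved)(unsigned long pfn))
{
	unsigned long pfn, freed = 0;

	assert(end_pfn <= MAX_PFN);
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (!section_present[pfn_to_section_nr(pfn)])
			continue;	/* no mem_map[] entry for this PFN */
		if (is_reserved && is_reserved(pfn))
			continue;
		freed++;
	}
	return freed;
}
```

registering a made-up layout such as PFNs [0x0, 0xa0) and [0x100, 0xf0000) with toy_memory_present() marks sections 0 through 59 and leaves 60 through 63 (everything from 0xf0000000 up) !present, so a toy_count_highmem_free() walk that runs into the hole skips those PFNs instead of freeing them. The real e820 map of the crashing box is not reproduced here.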
my current guess would have been some bootmem regression/interaction
that messes up the buddy bitmaps - but i just reverted to the v2.6.24
version of bootmem.c and that crashes too ...
Ingo