Message-ID: <20080415070811.GA15499@elte.hu>
Date:	Tue, 15 Apr 2008 09:08:11 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Pekka Enberg <penberg@...helsinki.fi>
Cc:	linux-kernel@...r.kernel.org, Christoph Lameter <clameter@....com>,
	Mel Gorman <mel@....ul.ie>, Nick Piggin <npiggin@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com
Subject: Re: [bug] SLUB + mm/slab.c boot crash in -rc9


* Pekka Enberg <penberg@...helsinki.fi> wrote:

> On Tue, Apr 15, 2008 at 9:25 AM, Ingo Molnar <mingo@...e.hu> wrote:
> >  so it's probably the first few page allocations (setup_cpu_cache())
> >  going wrong already - suggesting some fundamental borkage in SLAB?
> 
> I think it's still pointing to the page allocator and/or setting up 
> the zonelists...

i did a .config bisection and it pinpointed CONFIG_SPARSEMEM=y as the 
culprit. Changing it to FLATMEM gives a correctly booting system.
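
For illustration only, a .config bisection of this kind could be sketched roughly as below. This is not the actual tooling used here: the dict representation of a config, the `bisect_config` name and the `boots` callback are all hypothetical, and the sketch assumes a single differing option is responsible and ignores Kconfig dependencies between options.

```python
def bisect_config(good, bad, boots):
    """Return the name of the option whose 'bad' setting breaks the boot.

    good, bad: dicts mapping option name -> value (hypothetical representation
               of a working and a failing .config).
    boots:     callable taking a config dict, returning True if it boots
               (hypothetical; in practice a build plus a test boot).
    Assumes exactly one differing option is the culprit.
    """
    # Options that differ between the two configs, in a stable order.
    diff = sorted(k for k in set(good) | set(bad) if good.get(k) != bad.get(k))
    base = dict(good)  # known-good starting point

    while len(diff) > 1:
        half = diff[: len(diff) // 2]
        trial = dict(base)
        for key in half:
            # Move this option to its setting in the bad config.
            if key in bad:
                trial[key] = bad[key]
            else:
                trial.pop(key, None)
        if boots(trial):
            # These changes are harmless: keep them, search the other half.
            base = trial
            diff = diff[len(diff) // 2 :]
        else:
            # The culprit is among the options just changed.
            diff = half

    # With a single remaining candidate it is returned without a further test.
    return diff[0] if diff else None
```

Each round halves the set of candidate options, so a few hundred differing options need fewer than ten test boots, which matters on a box that is slow to boot.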

if you look at the good versus bad bootup log:

  http://redhat.com/~mingo/misc/log-Tue_Apr_15_07_24_59_CEST_2008.good
  http://redhat.com/~mingo/misc/log-Tue_Apr_15_07_24_59_CEST_2008.bad

(both SLUB) you'll see that the zone layout provided by the architecture 
code is _exactly_ the same and looks sane as well. So this is not an 
architecture zone layout bug, this is probably sparsemem setup (and/or 
the page allocator) getting confused by something.

why are there no good debug logs possible in this area? To debug such 
bugs we'd need an early dump of the precise layout of all memory maps, 
what points where, how large it is, where it is allocated - and then 
compare it with how the rest of the system is laid out - looking at 
possible overlaps or other bugs. This 8-way box is a pain to debug on, 
it takes a long time to boot it up, etc. etc.
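
As a rough sketch of the kind of overlap check such a dump would allow, assuming the early boot code printed each region as a (name, start, end) triple with a half-open address range. No such dump exists today, so the format and the `find_overlaps` helper are hypothetical:

```python
def find_overlaps(regions):
    """Report regions that start before an earlier region has ended.

    regions: iterable of (name, start, end) tuples, with end exclusive
             (hypothetical dump format).
    Returns a list of (earlier_name, later_name) pairs.
    """
    overlaps = []
    furthest = None  # region seen so far that extends to the highest address
    for region in sorted(regions, key=lambda r: (r[1], r[2])):
        name, start, end = region
        if furthest is not None and start < furthest[2]:
            overlaps.append((furthest[0], name))
        if furthest is None or end > furthest[2]:
            furthest = region
    return overlaps
```

Running this over the dumps from a good and a bad boot and comparing the results would at least show whether any memory map collides with something else.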

	Ingo