Date:	Tue, 15 Apr 2008 10:36:28 +0100
From:	Mel Gorman <mel@....ul.ie>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Pekka Enberg <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org, Christoph Lameter <clameter@....com>,
	Nick Piggin <npiggin@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com
Subject: Re: [bug] mm/slab.c boot crash in -git, "kernel BUG at mm/slab.c:2103!"

On (11/04/08 11:24), Ingo Molnar didst pronounce:
> 
> * Pekka Enberg <penberg@...helsinki.fi> wrote:
> 
> > On Fri, Apr 11, 2008 at 12:05 PM, Pekka Enberg <penberg@...helsinki.fi> wrote:
> > >  >  Right. Then you probably want to look into any changes in arch/x86/
> > >  >  related to setting up the zonelists. I'm fairly certain this is not a
> > >  >  slab bug and I don't see any recent changes to the page allocator
> > >  >  either that would explain this.
> > >
> > >  I'd be willing to put some money on this:
> > >
> > >  http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b7ad149d62ffffaccb9f565dfe7e5bae739d6836
> > 
> > And I'd lose as you're 32-bit. Oh well, that's the price to pay for 
> > pretending to know x86 arch internals.
> 
> yeah, sorry - we are working hard to unify generic bits like that, but 
> it's a huge architecture.
> 
> btw., i always felt that the zone/memory setup is rather fragile and 
> ad-hoc in places and it trusts the architecture code too much. Just in 
> the .25 cycle i've seen about a dozen bugs all around that thing. I 
> believe we should work on making the info that an architecture feeds to 
> the MM "fool proof" - i.e. sanity-check for overlaps and other common 
> setup errors.

I hadn't realised that such setup errors were common. add_active_range()
should already be able to handle some overlapping ranges.
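
To illustrate the kind of overlap handling meant here, a minimal
user-space sketch in Python follows. It is not the kernel's
add_active_range(): the name add_range_model and the tuple layout
(nid, start_pfn, end_pfn) are made up for this example, and the merge
policy (fold together anything on the same node that overlaps or
touches) is an assumption.

```python
def add_range_model(ranges, nid, start_pfn, end_pfn):
    """Return a new sorted list with [start_pfn, end_pfn) added for node nid.

    Hypothetical model, not kernel code. Ranges are half-open tuples
    (nid, start_pfn, end_pfn). Existing entries on the same node that
    overlap or touch the new range are folded into it.
    """
    if start_pfn >= end_pfn:
        raise ValueError("empty or inverted PFN range")

    kept = []
    for r_nid, r_start, r_end in ranges:
        if r_nid == nid and start_pfn <= r_end and end_pfn >= r_start:
            # Overlapping or adjacent on the same node: grow the new range.
            start_pfn = min(start_pfn, r_start)
            end_pfn = max(end_pfn, r_end)
        else:
            # Different node or disjoint: keep the entry untouched.
            kept.append((r_nid, r_start, r_end))
    kept.append((nid, start_pfn, end_pfn))
    return sorted(kept)
```

For example, adding (0, 100, 300) to a map holding (0, 0, 159) and
(0, 256, 983032) collapses it to the single entry (0, 0, 983032), while
the same PFNs on node 1 stay as a separate entry.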

I'm playing catch-up here but looking at your dmesg output, I see the
following snippets.

[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[    0.000000]  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000efff8000 (usable)
[    0.000000]  BIOS-e820: 00000000efff8000 - 00000000f0000000 (ACPI data)

There are two portions of usable memory with a few holes there.

[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
[    0.000000]  BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000000110000000 (usable)

And there is memory over the 4GB boundary but....

[    0.000000] Warning only 4GB will be used.
[    0.000000] Use a HIGHMEM64G enabled kernel.
[    0.000000] Entering add_active_range(0, 0, 1048576) 0 entries of 256 used

The limit is recognised, so only memory below 4GB is registered, and it's
all on node 0. However, I do note that it also registers all the holes as
valid memory. That memory should never get freed because it should be
reserved during boot by reserve_bootmem(), but it still raises an eyebrow.
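
As a rough check of how much of the registered range is actually hole,
the Python sketch below converts the e820 lines quoted above into PFN
ranges and lists the gaps inside [0, 1048576). It assumes 4 KiB pages
and uses only the e820 entries shown in this mail; the helper names are
made up for the example.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
MAX_PFN = (4 << 30) >> PAGE_SHIFT    # 4GB limit as a PFN

# (start, end, type) copied from the BIOS-e820 lines quoted above.
E820 = [
    (0x0000000000000000, 0x000000000009f800, "usable"),
    (0x000000000009f800, 0x00000000000a0000, "reserved"),
    (0x00000000000f0000, 0x0000000000100000, "reserved"),
    (0x0000000000100000, 0x00000000efff8000, "usable"),
    (0x00000000efff8000, 0x00000000f0000000, "ACPI data"),
    (0x00000000fec00000, 0x00000000fec10000, "reserved"),
    (0x00000000fee00000, 0x00000000fee10000, "reserved"),
    (0x00000000fff80000, 0x0000000100000000, "reserved"),
    (0x0000000100000000, 0x0000000110000000, "usable"),
]


def usable_pfn_ranges(e820, max_pfn):
    """Whole usable pages below max_pfn as half-open (start_pfn, end_pfn)."""
    out = []
    for start, end, kind in e820:
        if kind != "usable":
            continue
        # Round the start up and the end down to whole pages.
        start_pfn = (start + (1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT
        end_pfn = min(end >> PAGE_SHIFT, max_pfn)
        if start_pfn < end_pfn:
            out.append((start_pfn, end_pfn))
    return out


def holes(ranges, start_pfn, end_pfn):
    """Gaps in [start_pfn, end_pfn) not covered by any range."""
    gaps = []
    cursor = start_pfn
    for s, e in sorted(ranges):
        if s > cursor:
            gaps.append((cursor, s))
        cursor = max(cursor, e)
    if cursor < end_pfn:
        gaps.append((cursor, end_pfn))
    return gaps


usable = usable_pfn_ranges(E820, MAX_PFN)
gaps = holes(usable, 0, MAX_PFN)
hole_pages = sum(e - s for s, e in gaps)
```

With these inputs MAX_PFN matches the 1048576 passed to
add_active_range(), and the gaps are PFNs 159-256 and 983032-1048576,
i.e. 65641 pages registered as active that e820 does not report as
usable.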

[    0.000000] early_node_map[1] active PFN ranges
[    0.000000]     0:        0 ->  1048576
[    0.000000] On node 0 totalpages: 1048576
[    0.000000]   DMA zone: 32 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 4064 pages, LIFO batch:0
[    0.000000]   Normal zone: 1760 pages used for memmap
[    0.000000]   Normal zone: 223520 pages, LIFO batch:31
[    0.000000]   HighMem zone: 6400 pages used for memmap
[    0.000000]   HighMem zone: 812800 pages, LIFO batch:31
[    0.000000]   Movable zone: 0 pages used for memmap

And from this, it looks like memmap is getting set up. So far, it looks
like basic initialisation was ok.
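
The zone figures above are at least self-consistent, which the short
Python sketch below checks. It assumes 4 KiB pages and a 32-byte
struct page (inferred from the numbers, not stated in the dmesg), and
takes each zone's spanned size to be the reported pages plus the memmap
pages, with 0 pages reserved. The dictionary and function names are
made up for the example.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
STRUCT_PAGE_SIZE = 32     # assumption: inferred from the dmesg figures

# zone -> (pages used for memmap, pages reported), from the dmesg above.
REPORTED = {
    "DMA": (32, 4064),
    "Normal": (1760, 223520),
    "HighMem": (6400, 812800),
}


def memmap_pages(spanned_pages):
    """Pages needed to hold one struct page per spanned page (rounded up)."""
    total_bytes = spanned_pages * STRUCT_PAGE_SIZE
    return (total_bytes + PAGE_SIZE - 1) // PAGE_SIZE


def check_zones(reported):
    """Return {zone: spanned_pages}, raising if memmap sizes do not match."""
    spanned = {}
    for zone, (memmap, pages) in reported.items():
        size = memmap + pages
        if memmap_pages(size) != memmap:
            raise ValueError("memmap size mismatch in zone " + zone)
        spanned[zone] = size
    return spanned


spanned = check_zones(REPORTED)
total = sum(spanned.values())
```

Under those assumptions the zones span 4096, 225280 and 819200 pages,
which adds up to the 1048576 totalpages reported for node 0.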

> It is easy for an architecture to mess up those things... 
> Especially on oddball systems that are too large or too small to be 
> normally tested. It's a common, recurring bug pattern that we could 
> avoid by being a bit more resilient.
> 
> if this is a zone setup bug then a sanity-check could catch it right 
> where it happens - not much later in the slab code or so.
> 
> 	Ingo
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab