lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061006153629.GA19756@in.ibm.com>
Date:	Fri, 6 Oct 2006 11:36:29 -0400
From:	Vivek Goyal <vgoyal@...ibm.com>
To:	Mel Gorman <mel@...net.ie>
Cc:	Steve Fox <drfickle@...ibm.com>, Andi Kleen <ak@...e.de>,
	Badari Pulavarty <pbadari@...ibm.com>,
	Martin Bligh <mbligh@...igh.org>,
	Andrew Morton <akpm@...l.org>,
	lkml <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
	kmannth@...ibm.com, Andy Whitcroft <apw@...dowen.org>
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > Linux version 2.6.18-git22 (root@...3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > Command line: root=/dev/sda1 vga=791  ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > BIOS-provided physical RAM map:
> >  BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> >  BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> >  BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> >  BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> >  BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> >  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> >  BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> 
> I continued what Steve was doing this morning to see could this be
> pinned down. After placing 'CHECK;' in a few places as suggested by
> Andi's check, the problem code was identified as that following in
> mm/bootmem.c#init_bootmem_core()
> 
>         mapsize = get_mapsize(bdata);
>         memset(bdata->node_bootmem_map, 0xff, mapsize);
> 
> That explains the value in the array at least. A few more printfs around
> this point printed out the following in the boot log
> 
> init_bootmem_core(0, 1909, 0, 12582912)
> init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> AAGH: afinfo corrupted at mm/bootmem.c:121
> 
> where;
> 
> 1909 == mapstart
> 0 == start
> 12582912 == end
> 1572864 == mapsize
> 
> mapstart, start and end being the parameters being passed to
> init_bootmem_core(). This means we are calling memset for the physical
> range 0x775000 -> 0x8F5000 which is in a usable range according to the
> BIOS-e820 map it appears.
> 

Hi Mel,

Where is bss placed in physical memory? I guess bss_start and bss_stop
from System.map will tell us. That will confirm that above memset step is
stomping over bss. Then we have to just find that somewhere probably
we allocated wrong physical memory area for bootmem allocator map.

Thanks
Vivek

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ