lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 5 Oct 2006 21:08:55 +0200
From:	Andi Kleen <ak@...e.de>
To:	vgoyal@...ibm.com
Cc:	Steve Fox <drfickle@...ibm.com>,
	Badari Pulavarty <pbadari@...ibm.com>,
	Martin Bligh <mbligh@...igh.org>,
	Andrew Morton <akpm@...l.org>,
	lkml <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
	kmannth@...ibm.com, Andy Whitcroft <apw@...dowen.org>,
	Mel Gorman <mel@....ul.ie>
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 20:52, Vivek Goyal wrote:
> On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
> > On Thursday 05 October 2006 19:57, Steve Fox wrote:
> > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> > > 
> > > > Please don't snip the Code: line. It is fairly important.
> > > 
> > > Sorry about that. The remote console I was using appears to overwrite
> > > some text after I force the reboot. Here's a clean one.
> > > 
> > > global ffffffffffffffff
> > 
> > Ok that definitely shouldn't be in there.
> > 
> > I guess we need to track when it gets corrupted. Can you send the full
> > boot log with this patch applied?
> > 
> 
> Just recalled one more observation about the problem when keith had
> reported it last. If I just move .bss before .data_nosave instead
> of it being at the end, keith's problem had disappeared.

Yes, that could well be that it's something in the new bootmap 
management.  Steve's box failed at

Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at init/main.c:512

which is directly after that code does lots of stuff.

Mel might want to take a look (and perhaps
also cut down a little on the ugly printks ...) 

BTW I found one of my test systems too now which does a lot of:
I'm about to leave for vacation so i won't have time to track it down
any time soon. But here is it for reference.

-Andi

Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Bad page state in process 'swapper'
page:ffff810003ee5480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:   

Call Trace:  
 [<ffffffff8020ac84>] show_trace+0x34/0x47
 [<ffffffff8020aca9>] dump_stack+0x12/0x17
 [<ffffffff802586a7>] bad_page+0x57/0x81
 [<ffffffff80258791>] __free_pages_ok+0x64/0x247
 [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
 [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
 [<ffffffff807c915e>] mem_init+0x44/0x186
 [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
 [<ffffffff807bc168>] _sinittext+0x168/0x16c

Bad page state in process 'swapper'
page:ffff810003ee54b8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:   

Call Trace:  
 [<ffffffff8020ac84>] show_trace+0x34/0x47
 [<ffffffff8020aca9>] dump_stack+0x12/0x17
 [<ffffffff802586a7>] bad_page+0x57/0x81
 [<ffffffff80258791>] __free_pages_ok+0x64/0x247
 [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
 [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
 [<ffffffff807c915e>] mem_init+0x44/0x186
 [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
 [<ffffffff807bc168>] _sinittext+0x168/0x16c


... lots more of those ...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ