[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200610051740.58511.ak@suse.de>
Date: Thu, 5 Oct 2006 17:40:58 +0200
From: Andi Kleen <ak@...e.de>
To: Steve Fox <drfickle@...ibm.com>
Cc: Badari Pulavarty <pbadari@...ibm.com>,
Martin Bligh <mbligh@...igh.org>, vgoyal@...ibm.com,
Andrew Morton <akpm@...l.org>,
lkml <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
kmannth@...ibm.com, Andy Whitcroft <apw@...dowen.org>
Subject: Re: 2.6.18-mm2 boot failure on x86-64
On Thursday 05 October 2006 17:32, Steve Fox wrote:
> On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:
>
> > Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?
>
> CONFIG_DEBUG_KERNEL should be on
>
> > Last time I couldn't match your instruction dump to any code segment
> > in the routine. And also, can you post your .config file. I have
> > an amd64 and em64t machine and both work fine...
>
> Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
> [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
> RIP: 0010:[<ffffffff804705e6>] [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
> RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
> RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
> RBP: 00000000ffffffef R08: 000000003f924371 R09: 0000000000000000
> R10: ffff810bffcbdcb0 R11: 0000000000000154 R12: 0000000000000000
> R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
> Stack: 0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000090000
Please don't snip the Code: line. It is fairly important.
>
> The base config file I'm using is at
> http://flooterbu.net/kernel/elm3b239-2.6.17.config
My guess is that something is wrong with the global variable it is accessing.
Can you post the output of grep -5 xfrm_policy_afinfo ?
I wonder if that variable overlaps something else.
And please add a
printk("global %p\n", xfrm_policy_afinfo[family]);
at the beginning of net/xfrm/xfrm_poliy.c:xfrm_policy_lock_afinfo
and post the output.
If not then it's possible
that some nearby variable is overflowing or similar. Adding some padding
around xfrm_policy_afinfo would show that.
Another way if that global is proven to be corrupted will be to add
checks all over the boot process to track down where it gets corrupted.
-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists