[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070918014537.GK23367404@sgi.com>
Date: Tue, 18 Sep 2007 11:45:37 +1000
From: David Chinner <dgc@....com>
To: Justin Piszcz <jpiszcz@...idpixels.com>
Cc: linux-kernel@...r.kernel.org, xfs@....sgi.com
Subject: Re: 2.6.20 (XFS? related) crash after uptime of > 180 days during apt-get dist-upgrade on Debian Testing
On Mon, Sep 17, 2007 at 01:20:17PM -0400, Justin Piszcz wrote:
> Including the XFS mailing list in here too because it may be an XFS bug
> looking at the call trace.
>
> System: Debian Testing
> Kernel: 2.6.20
> Config: Attached
>
> I was running apt-get dist-upgrade as I always do to get the latest
> packages upgraded and the kernel OOPS'd when it was upgrading 'tzdata' and
> the process went into D-state and I had to reboot.
>
> The config file is from 2.6.20 but it had been moved to a 2.6.22 directory
> for an upgrade, but all of the options have been left unchanged.
>
> Here is the *OOPS I captured via dmesg before I rebooted:
>
> [16201055.214559] nfsd: last server has exited
> [16201055.214566] nfsd: unexporting all filesystems
> [17341583.697472] BUG: unable to handle kernel paging request at virtual
> address 99e00750
> [17341583.697480] printing eip:
> [17341583.697482] c01531b0
> [17341583.697484] *pde = 00000000
> [17341583.697488] Oops: 0000 [#1]
> [17341583.697491] CPU: 0
> [17341583.697493] EIP: 0060:[<c01531b0>] Not tainted VLI
> [17341583.697494] EFLAGS: 00210286 (2.6.20 #3)
> [17341583.697502] EIP is at __d_lookup+0x5d/0xd6
> [17341583.697505] eax: c8d7c17e ebx: 99e00750 ecx: 00000011 edx:
> c17f9200
> [17341583.697508] esi: 99e00750 edi: d2a10016 ebp: c7fe2304 esp:
> dba35d98
> [17341583.697511] ds: 007b es: 007b ss: 0068
> [17341583.697514] Process kdm_greet (pid: 22119, ti=dba34000 task=f52d4a70
> task.ti=dba34000)
> [17341583.697516] Stack: c8d7c17e 00000000 dba35e10 f705d478 dba35db8
> 0000002c d2a10016 d2a10042 [17341583.697522] dba35e10 dba35f30
> dba35e10 c014ab6d dba35e1c c18c5240 dba35f04 c021877e [17341583.697528]
> d2a10042 dba35e10 c8d7c17e dba35f30 c014c38f d2a10016 00000101 dba35e48
> [17341583.697534] Call Trace:
> [17341583.697537] [<c014ab6d>] do_lookup+0x1c/0x168
> [17341583.697540] [<c021877e>] xfs_vn_lookup+0x53/0x77
> [17341583.697547] [<c014c38f>] __link_path_walk+0x6e8/0xb1b
> [17341583.697551] [<c0153698>] dput+0x18/0x121
> [17341583.697554] [<c014c805>] link_path_walk+0x43/0xb8
> [17341583.697558] [<c014ca0a>] do_path_lookup+0x75/0x181
> [17341583.697561] [<c0145fda>] get_empty_filp+0x2f/0xe5
> [17341583.697566] [<c014d468>] __path_lookup_intent_open+0x45/0x80
> [17341583.697570] [<c014d517>] path_lookup_open+0x20/0x25
> [17341583.697573] [<c014d5db>] open_namei+0x66/0x58a
> [17341583.697576] [<c0143c35>] do_filp_open+0x25/0x40
> [17341583.697580] [<c0143c8e>] do_sys_open+0x3e/0xc7
> [17341583.697584] [<c0143d52>] sys_open+0x1c/0x20
> [17341583.697587] [<c0102920>] syscall_call+0x7/0xb
> [17341583.697591] =======================
> [17341583.697593] Code: 81 f2 01 00 37 9e 8b 0d 18 3f 44 c0 d3 ea 31 d0 23
> 05 14 3f 44 c0 8b 15 1c 3f 44 c0 8b 34 82 85 f6 75 08 eb 4d 89 de 85 db 74
> 47 <8b> 1e 0f 18 03 90 8d 6e f4 8b 04 24 3b 45 18 75 e9 8b 44 24 0c
> [17341583.697621] EIP: [<c01531b0>] __d_lookup+0x5d/0xd6 SS:ESP
> 0068:dba35d98
> [17341583.697626] <1>BUG: unable to handle kernel paging request at
> virtual address 99e00750
> [17341648.066740] printing eip:
> [17341648.066786] c01531b0
> [17341648.066868] *pde = 00000000
> [17341648.066916] Oops: 0000 [#2]
> [17341648.066965] CPU: 0
> [17341648.066966] EIP: 0060:[<c01531b0>] Not tainted VLI
> [17341648.066967] EFLAGS: 00010286 (2.6.20 #3)
> [17341648.067115] EIP is at __d_lookup+0x5d/0xd6
> [17341648.067165] eax: 1efcce0e ebx: 99e00750 ecx: 00000011 edx:
> c17f9200
> [17341648.067219] esi: 99e00750 edi: cc87901a ebp: c7fe2304 esp:
> f7755f04
> [17341648.067271] ds: 007b es: 007b ss: 0068
> [17341648.067320] Process dpkg (pid: 24684, ti=f7754000 task=d9846a70
> task.ti=f7754000)
> [17341648.067371] Stack: 1efcce0e 46dd3a20 f7755f5c e489fe28 00000000
> 00000010 cc87901a 00000000 [17341648.067715] e489fe28 00000001
> f7755f54 c014b7cb f7755f5c ef0d4098 ffffffd9 cc879000 [17341648.068056]
> 00000001 f7755f54 c014cf84 f7755f54 e489fe28 c18c5240 1efcce0e 00000010
> [17341648.068397] Call Trace:
> [17341648.068482] [<c014b7cb>] __lookup_hash+0x4a/0xef
> [17341648.068563] [<c014cf84>] do_rmdir+0x69/0xbb
> [17341648.068642] [<c0102920>] syscall_call+0x7/0xb
> [17341648.068724] =======================
> [17341648.068770] Code: 81 f2 01 00 37 9e 8b 0d 18 3f 44 c0 d3 ea 31 d0 23
> 05 14 3f 44 c0 8b 15 1c 3f 44 c0 8b 34 82 85 f6 75 08 eb 4d 89 de 85 db 74
> 47 <8b> 1e 0f 18 03 90 8d 6e f4 8b 04 24 3b 45 18 75 e9 8b 44 24 0c
> [17341648.070874] EIP: [<c01531b0>] __d_lookup+0x5d/0xd6 SS:ESP
> 0068:f7755f04
> [17341648.070988]
>
> I doubt I can reproduce it as it has happened after 180 days or so, and I
> am upgrading to 2.6.22.6 but I was wondering what exactly happened here?
No idea - it looks like dkpg was trying to remove a directory on the
same path the lookup was and both have gone splat in __d_lookup on
the same dentry. Something happened in those 180 days that left a
landmine that was tripped over here, I think. I can't see any way of
tracking it down from this, but thanks for reporting it anyway,
Justin.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists