Message-ID: <Pine.LNX.4.64.0609211944500.20970@blonde.wat.veritas.com>
Date: Thu, 21 Sep 2006 20:04:49 +0100 (BST)
From: Hugh Dickins <hugh@...itas.com>
To: Andrew Morton <akpm@...l.org>
cc: Anton Altaparmakov <aia21@....ac.uk>,
Jonathan Woithe <jwoithe@...sics.adelaide.edu.au>,
linux-kernel@...r.kernel.org
Subject: Re: Fw: 2.6.17 oops, possibly ntfs/mmap related

On Thu, 21 Sep 2006, Andrew Morton wrote:
> On Thu, 21 Sep 2006 15:41:36 +0100
> Anton Altaparmakov <aia21@....ac.uk> wrote:
> >
> > BUG: unable to handle kernel paging request at virtual address 020030d2
> >
> > The values of the relevant variables from the oops are:
> >
> > page = 0xc2248fa0
> > page->mapping = 0xe3a79eac
> > page->mapping->a_ops = 0x020030aa
>
> I wonder if page->mapping really wanted to be 0xc3a79eac, only something
> set bit 29.

Perhaps, but I'm more suspicious of that 0x0200 top half of the a_ops ptr.

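For anyone wanting to check the numbers, here is a minimal sketch in Python (not kernel code) that tests both ideas against the values quoted in this thread. SUSPECTED_MAPPING is only Andrew's guess at the intended pointer, and the helper names are made up for illustration; 0x28 is the ->releasepage offset from Anton's report.

```python
# Values quoted in this thread; SUSPECTED_MAPPING is Andrew's guess, not a
# value read from the oops.
OBSERVED_MAPPING = 0xE3A79EAC
SUSPECTED_MAPPING = 0xC3A79EAC
A_OPS = 0x020030AA
RELEASEPAGE_OFFSET = 0x28   # offset Anton gives for ->releasepage
FAULT_ADDRESS = 0x020030D2


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(32) if (diff >> bit) & 1]


def top_half(value):
    """Upper 16 bits of a 32-bit value."""
    return (value >> 16) & 0xFFFF


flipped = differing_bits(OBSERVED_MAPPING, SUSPECTED_MAPPING)
print("bits differing in page->mapping:", flipped)
print("a_ops top half: %#06x" % top_half(A_OPS))
print("a_ops + offset == fault address:",
      A_OPS + RELEASEPAGE_OFFSET == FAULT_ADDRESS)
```

Both readings are arithmetically consistent with the oops, so the numbers alone do not pick between a single flipped bit in page->mapping and a clobbered top half of a_ops.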
>
> > Note that 0x020030aa+0x28 = 020030d2 which is the oops causing address
> > and 0x28 is the offset of the releasepage function pointer in the
> > address space operations structure...
> >
> > This oops is not identical to the oopses pointed out by Jonathan at:
> >
> > http://www.atrad.com.au/~jwoithe/kernel/oopses-20060913.txt
> >
> > But those oopses have to do with pages also so could be related...
>
> Looks a bit different - Jonathan appears to have pulled a bad page* out
> of the radix tree whereas you got your page off the LRU.

Jonathan does show a second oops, from a later boot:

BUG: unable to handle kernel paging request at virtual address 0010c744
printing eip:
c013be50
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: ntfs 8139too via_agp agpgart usb_storage ehci_hcd uhci_hcd usbcore
CPU: 0
EIP: 0060:[<c013be50>] Tainted: G M VLI
EFLAGS: 00010282 (2.6.17 #2)
EIP is at anon_vma_unlink+0x16/0x3c
eax: 0010c740 ebx: cf1070cc ecx: cf107104 edx: cf8bc740
esi: cf8bc740 edi: b7e82000 ebp: 00000000 esp: cdad7f58

I haven't worked out the disassembly in detail to support the idea
(though anon_vma_unlink would certainly be trying to list_del around
here), but that eax and esi do suggest a corrupted list: somehow the
top half of a pointer was overwritten by the top half of LIST_POISON1.
And in Anton's case, the top half of a pointer was overwritten by the
bottom half of LIST_POISON2.
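
The half-word comparison can be spelled out with a short Python sketch. It assumes the poison values are the ones from the 2.6.17 list header (0x00100100 and 0x00200200); hi16 and lo16 are illustrative helpers, and the register values are copied from the two oopses above.

```python
# Poison values as defined in the 2.6.17 list header (assumed to be what the
# kernels that produced these oopses were built with).
LIST_POISON1 = 0x00100100
LIST_POISON2 = 0x00200200

EAX = 0x0010C740            # Jonathan's second oops
ESI = 0xCF8BC740            # same oops; edx holds the same value
FAULT_ADDRESS = 0x0010C744  # faulting address in that oops
A_OPS = 0x020030AA          # Anton's page->mapping->a_ops


def hi16(value):
    """Upper 16 bits of a 32-bit value."""
    return (value >> 16) & 0xFFFF


def lo16(value):
    """Lower 16 bits of a 32-bit value."""
    return value & 0xFFFF


checks = {
    "eax top half == LIST_POISON1 top half": hi16(EAX) == hi16(LIST_POISON1),
    "eax bottom half == esi bottom half": lo16(EAX) == lo16(ESI),
    "fault address == eax + 4": FAULT_ADDRESS == EAX + 4,
    "a_ops top half == LIST_POISON2 bottom half": hi16(A_OPS) == lo16(LIST_POISON2),
}

for description, holds in checks.items():
    print("%-45s %s" % (description, holds))
```

All four comparisons hold for the quoted values. That fits a pointer such as the one in esi having had its top 16 bits replaced, but it does not show which write did it.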
Maybe just coincidence, and I've nothing more illuminating to add;
but just a hint of a list_del going very wrong somewhere?
Hugh