Message-Id: <A01161D0-D08F-4522-A546-90057B2E1F2A@mit.edu>
Date: Wed, 12 Jan 2011 19:44:17 -0500
From: Theodore Tso <tytso@....EDU>
To: Sebastian Ott <sebott@...ux.vnet.ibm.com>
Cc: "linux-ext4@...r.kernel.org development" <linux-ext4@...r.kernel.org>,
LKML Kernel <linux-kernel@...r.kernel.org>,
pm list <linux-pm@...ts.linux-foundation.org>
Subject: Re: Oops while going into hibernate
On Jan 12, 2011, at 1:49 PM, Sebastian Ott wrote:
>
> I did about 5 re-tests with different loads, the result was always:
>
> [ 249.238697] EXT4-fs (dasda1): inode #1106 has NULL jinode
>
> # debugfs /dev/dasda1
> debugfs 1.41.10 (10-Feb-2009)
> debugfs: ncheck 1106
> Inode Pathname
> 1106 /usr/bin/killall
> debugfs: q
>
That looks really bogus. /usr/bin/killall is a system binary, and there's no good reason that hibernation should be trying to write pages to that binary.
You said originally that the oops was happening "while going into hibernation right after resuming with...". So does that mean you did a successful suspend/resume, and then the second suspend caused the oops?

It looks like somehow the pages were left marked as dirty, so the writeback daemons attempted to write back a page to an inode which was never opened read/write (and which, as a text page for /usr/bin/killall, was in fact mapped read-only). Given that ext4 initializes jinode only when the file is opened read/write, that jinode is NULL here, and that it makes no sense for a program to be modifying /usr/bin/killall as part of a suspend/resume, it looks very much like we just unmasked a software suspend bug....
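To make the invariant concrete, here is a rough user-space sketch in C. It is not ext4 code: struct model_inode, model_open() and model_writepage() are hypothetical names made up for illustration, and the only behaviour modelled is the one described above, namely that jinode gets allocated on the first open for writing and that writeback expects it to be there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's data structures. */
struct model_jinode {
	int placeholder;
};

struct model_inode {
	unsigned long ino;
	struct model_jinode *jinode;	/* stays NULL until opened for writing */
};

/* Model of open(): allocate the journal inode only for write access. */
static int model_open(struct model_inode *inode, bool for_write)
{
	assert(inode != NULL);

	if (for_write && inode->jinode == NULL) {
		inode->jinode = calloc(1, sizeof(*inode->jinode));
		if (inode->jinode == NULL)
			return -1;
	}
	return 0;
}

/*
 * Model of writeback: a dirty page on an inode that was never opened
 * for writing has no jinode, so report it instead of dereferencing NULL.
 */
static int model_writepage(const struct model_inode *inode)
{
	assert(inode != NULL);

	if (inode->jinode == NULL) {
		fprintf(stderr, "inode #%lu has NULL jinode\n", inode->ino);
		return -1;
	}
	return 0;
}
```

In this model, an inode that has only ever been opened read-only (as a binary's text pages would be) fails in model_writepage(), which is the situation the log message points at: the question is not why jinode is NULL, but why anything thought the page was dirty.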
-- Ted