Message-ID: <20101207142145.GA27861@think>
Date: Tue, 7 Dec 2010 09:21:45 -0500
From: Chris Mason <chris.mason@...cle.com>
To: Matt <jackdachef@...il.com>
Cc: Mike Snitzer <snitzer@...hat.com>, Milan Broz <mbroz@...hat.com>,
Andi Kleen <andi@...stfloor.org>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
dm-devel <dm-devel@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
htd <htd@...cy-poultry.org>, htejun@...il.com,
linux-ext4@...r.kernel.org, Jon Nelson <jnelson@...poni.net>
Subject: Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt
barrier support is effective)
On Sun, Dec 05, 2010 at 12:47:11AM +0100, Matt wrote:
> > OK.
>
> meanwhile I think I got some interesting news:
>
> after some time of running (around 1 to 1.5 hours) I noticed the
> following BUG with ext4:
>
> [ 4421.503477] ------------[ cut here ]------------
> [ 4421.503482] kernel BUG at fs/ext4/inode.c:2714!
>
> kernel compiled was from sources checked out at
> 1de3e3df917459422cb2aecac440febc8879d410
Looking at 1de3e3df917459422cb2aecac440febc8879d410:
Line 2714 in fs/ext4/inode.c is this:

        /*
         * If the page does not have buffers (for whatever reason),
         * try to create them using block_prepare_write.  If this
         * fails, redirty the page and move on.
         */
        if (!page_buffers(page)) {
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
                if (block_prepare_write(page, 0, len,
                                        noalloc_get_block_write)) {
                redirty_page:
                        redirty_page_for_writepage(wbc, page);
                        unlock_page(page);
                        return 0;
                }
                commit_write = 1;
        }

Which means we're really hitting this:

/* If we *know* page->private refers to buffer_heads */
#define page_buffers(page)                                      \
        ({                                                      \
                BUG_ON(!PagePrivate(page));                     \
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                ((struct buffer_head *)page_private(page));     \
        })
#define page_has_buffers(page)  PagePrivate(page)

Looks like Ted fixed it here:

commit b1142e8fec6a594723e5054055a7b53379b90490
Author: Theodore Ts'o <tytso@....edu>
Date:   Thu Oct 28 17:33:57 2010 -0400

    ext4: BUG_ON fix: check if page has buffers before calling page_buffers()

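For illustration only, here is a userspace sketch of the pattern that commit subject describes: test page_has_buffers() first, and only then call page_buffers(). This is not Ted's patch and not kernel code. struct page, struct buffer_head and BUG_ON() below are simplified stand-ins, and page_buffers_or_null() is a hypothetical helper invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins; NOT the real kernel definitions. */
struct buffer_head { int dummy; };

struct page {
    int has_private;      /* models the PG_private page flag */
    void *private_data;   /* models page->private */
};

#define BUG_ON(cond)           assert(!(cond))   /* the kernel would oops here */
#define PagePrivate(page)      ((page)->has_private)
#define page_private(page)     ((page)->private_data)
#define page_has_buffers(page) PagePrivate(page)

/* Same contract as the kernel macro: only valid if buffers are attached. */
static struct buffer_head *page_buffers(struct page *page)
{
    BUG_ON(!PagePrivate(page));
    return (struct buffer_head *)page_private(page);
}

/*
 * Hypothetical helper: check page_has_buffers() before page_buffers(),
 * so a page without buffers yields NULL instead of tripping BUG_ON().
 */
static struct buffer_head *page_buffers_or_null(struct page *page)
{
    if (!page_has_buffers(page))
        return NULL;
    return page_buffers(page);
}
```

The point of the sketch is that page_buffers() is not a safe "does this page have buffers?" test. It asserts that it does, so the question has to be asked with page_has_buffers() first.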
Basically, once you hit this oops, ext4 is done. No files you created
after the oops will be there when you reboot, and the rest of your
lockups etc. are because the jbd process was holding some locks when it
crashed.
Was there also a report of corruption w/dm-crypt and XFS?
Last night I ran dm-crypt + the cpu scalability patch + ext4 +
2.6.37-rc3 through a long stress run, and it passed without any problems.
If dm-crypt were not doing the IO properly, this test probably would have
found it (+/- strange block sizes, races with O_DIRECT and other exotic
fun).
-chris