Message-ID: <20110815020652.GE3524@thunk.org>
Date: Sun, 14 Aug 2011 22:06:52 -0400
From: Ted Ts'o <tytso@....edu>
To: Curt Wohlgemuth <curtw@...gle.com>
Cc: Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH] jbd2: instrument jh on wrong transaction BUG_ON's
On Sun, Aug 14, 2011 at 01:01:01PM -0700, Curt Wohlgemuth wrote:
>
> It seems that jh->b_transaction is NULL, and dereferencing it to print
> out jh->b_transaction->t_tid causes the NULL pointer deref.
Hmm, if jh->b_transaction is null, then
jbd2_journal_get_write_access() must not have been called on the
journal_head. But the relevant code from ext4_journalled_writepage() is:
        ret = walk_page_buffers(handle, page_bufs, 0, len, NULL,
                                do_journal_get_write_access);
        err = walk_page_buffers(handle, page_bufs, 0, len, NULL,
                                write_end_fn);
(write_end_fn is the function which ultimately calls
ext4_handle_dirty_metadata, which then calls
jbd2_journal_dirty_metadata(), which is where you're seeing the BUG_ON).
I suspect what is going on is that do_get_write_access() is returning an
error, which means we never set jh->b_transaction. Hence when we call
jbd2_journal_dirty_metadata(), we trigger the BUG.
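To make the suspected sequence concrete, here is a rough user-space
model of it. This is only a sketch, not the kernel code: every model_*
name is made up for illustration, and the structures are stripped down
to the one field that matters here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space stand-ins; these are NOT the kernel's definitions. */
struct model_transaction { unsigned int t_tid; };
struct model_jh { struct model_transaction *b_transaction; };

/* Models the get-write-access step. On failure the buffer is never
 * attached to the running transaction. */
static int model_get_write_access(struct model_jh *jh,
                                  struct model_transaction *running,
                                  int fail_with)
{
        assert(jh != NULL);
        if (fail_with)
                return fail_with;       /* jh->b_transaction stays NULL */
        jh->b_transaction = running;
        return 0;
}

/* Models the dirty-metadata step: returns 0 if the buffer belongs to
 * the running transaction, -1 where the real code would hit the BUG_ON. */
static int model_dirty_metadata(const struct model_jh *jh,
                                const struct model_transaction *running)
{
        assert(jh != NULL);
        return jh->b_transaction == running ? 0 : -1;
}

/* Mirrors the two back-to-back walks: the second step runs even when
 * the first one has already failed. */
static int model_writepage(struct model_jh *jh,
                           struct model_transaction *running,
                           int fail_with)
{
        int ret = model_get_write_access(jh, running, fail_with);
        int err = model_dirty_metadata(jh, running);

        return ret ? ret : err;
}
```

In this model, calling model_writepage() with fail_with set to -EROFS
leaves b_transaction NULL and still reaches the dirty step, which is
the situation the BUG_ON would be catching.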
OK, what could cause do_get_write_access() to return an error? Two
conditions: if the handle is aborted (due to a previous error), in
which case it returns -EROFS, or if it can't get the memory needed to
make a copy of the buffer, in which case there should have been a
"do_get_write_access: OOM for frozen buffer" error message earlier in
the log. (No, it's not prefixed with JBD2 --- it probably should be.)
Any chance you're seeing any indication of either possibility in the
messages before the BUG message?
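For reference, the two failure paths look roughly like this in a
user-space model. Again this is only a sketch with made-up model_*
names: the allocator is passed in so the out-of-memory path can be
exercised, the -ENOMEM return value is my assumption for that path, and
the model always makes the frozen copy, which is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a journal handle; only the abort flag is modelled. */
struct model_handle { int aborted; };

/* Allocator that always fails, used to exercise the OOM path. */
static void *model_failing_alloc(size_t size)
{
        (void)size;
        return NULL;
}

/* Returns -EROFS for an aborted handle, -ENOMEM (assumed) when the
 * frozen copy cannot be allocated, and 0 with *frozen_out set otherwise. */
static int model_do_get_write_access(const struct model_handle *handle,
                                     const char *buf, size_t size,
                                     void *(*alloc)(size_t),
                                     char **frozen_out)
{
        char *copy;

        assert(handle != NULL && buf != NULL && frozen_out != NULL);
        if (handle->aborted)
                return -EROFS;          /* a previous error aborted the handle */
        copy = alloc(size);
        if (copy == NULL)
                return -ENOMEM;         /* no memory for the frozen buffer */
        memcpy(copy, buf, size);
        *frozen_out = copy;
        return 0;
}
```

Either return value leaves the caller without a buffer attached to the
transaction, so both paths would lead to the same BUG later on.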
- Ted