Message-ID: <20130513030712.GC25996@thunk.org>
Date: Sun, 12 May 2013 23:07:12 -0400
From: Theodore Ts'o <tytso@....edu>
To: Tony Luck <tony.luck@...il.com>
Cc: Dmitry Monakhov <dmonakhov@...nvz.org>, eunb.song@...sung.com,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+
On Sun, May 12, 2013 at 07:04:45PM -0700, Tony Luck wrote:
> My git bisect finally completed and points the finger at:
>
> commit ae4647fb7654676fc44a97e86eb35f9f06b99f66
> Author: Jan Kara <jack@...e.cz>
> Date: Fri Apr 12 00:03:42 2013 -0400
>
> jbd2: reduce journal_head size
>
> Remove unused t_cow_tid field (ext4 copy-on-write support doesn't seem
> to be happening) and change b_modified and b_jlist to bitfields thus
> saving 8 bytes in the structure.

Both you and Eunbong Song bisected to the same commit, so presumably
the right thing to do at this point is to revert it. Have you tried
reverting the commit and demonstrating that the problem goes away
afterwards?

The reason I ask is that I'm completely at a loss to understand
why this commit could be making a difference. Looking at the commit,
we're converting two unsigned fields, which need no more than 4 bits
and 1 bit, respectively, to bitfields. Why this
could be causing __journal_remove_journal_head() to fail, especially
in the way that it does, isn't making any sense to me. We are
technically accessing jh->b_jlist without first taking
jbd2_lock_bh_state(), but (a) it shouldn't make a difference whether
we use a bitfield or a 32-bit unsigned value, and (b) by the time we get
to __journal_remove_journal_head(), nothing should be using the
journal head, and we've taken jbd_lock_bh_journal_head(), which
should prevent anyone else from starting to use the journal head.
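
To illustrate point (a), here is a minimal sketch, not the actual jbd2
code. The struct names jh_wide and jh_bits, the neighbour field, and
layouts_agree() are made up for illustration; only b_jlist and
b_modified are names from the commit. It only shows that, in
single-threaded code, the bitfield layout stores the same values as the
full-width layout as long as they fit in the declared widths. It says
nothing about what the compiler does under concurrent access.

```c
/* Hypothetical, simplified stand-ins for struct journal_head. */
struct jh_wide {
	int neighbour;          /* hypothetical adjacent field */
	unsigned b_jlist;       /* full-width unsigned field */
	unsigned b_modified;    /* full-width unsigned field */
};

struct jh_bits {
	int neighbour;          /* hypothetical adjacent field */
	unsigned b_jlist:4;     /* can hold 0..15 */
	unsigned b_modified:1;  /* can hold 0..1 */
};

/*
 * Store the same values in both layouts.  Returns 1 if both read back
 * identically and the neighbouring field is untouched, 0 otherwise.
 */
int layouts_agree(unsigned jlist, unsigned modified)
{
	struct jh_wide w = { .neighbour = 42 };
	struct jh_bits b = { .neighbour = 42 };

	w.b_jlist = jlist;
	w.b_modified = modified;
	b.b_jlist = jlist;        /* reduced modulo 16 if it does not fit */
	b.b_modified = modified;  /* reduced modulo 2 if it does not fit */

	return w.b_jlist == b.b_jlist &&
	       w.b_modified == b.b_modified &&
	       w.neighbour == 42 && b.neighbour == 42;
}
```

Calling layouts_agree(9, 1) returns 1, since both values fit; a value
such as 16 for b_jlist no longer fits in 4 bits and makes it return 0.
The size difference between the two structs depends on the ABI, so
sizeof() is the way to check it on a given platform.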

Applying a patch when I don't understand how it would make things
better, even if it is a revert, scares me. If we are going to do
this, and since I haven't yet been able to reproduce it on my test
setup, could you try taking Linus's just-released 3.10-rc1,
revert commit ae4647fb765467, and confirm that this avoids the
crash you are seeing?

Thanks,
- Ted