Message-ID: <20140707155310.GB8254@thunk.org>
Date: Mon, 7 Jul 2014 11:53:10 -0400
From: Theodore Ts'o <tytso@....edu>
To: David Jander <david@...tonic.nl>
Cc: Dmitry Monakhov <dmonakhov@...nvz.org>,
Matteo Croce <technoboy85@...il.com>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
linux-ext4@...r.kernel.org
Subject: Re: ext4: journal has aborted
An update from today's ext4 concall. Eric Whitney can fairly reliably
reproduce this on his Panda board with 3.15, and definitely not on
3.14. So at this point there seems to be at least some kind of 3.15
regression going on here, regardless of whether it's in the eMMC
driver or the ext4 code. (It also means that the bug fix I found is
irrelevant for the purposes of working this issue, since that bug is
much harder to hit and has been around since long before 3.14.)
The problem in terms of narrowing it down any further is that the
Pandaboard is running into RCU bugs, which make it hard to test the
early 3.15-rcX kernels. There is some indication that the bug showed
up in the ext4 patches which Linus pulled at the beginning of
3.15-rc3. However, due to the ARM (or at least Pandaboard) RCU bugs,
it's not possible to bisect this on the Pandaboard.
And on x86_64, it takes most of a day to confirm the absence of a
test failure. (This is with an HDD, though, so assuming that we don't
have an eMMC regression as well as an ext4 regression in 3.15, it seems
likely that the problem is some kind of ext4 regression introduced
sometime between 3.14 and 3.15.)
So we are making progress, but it's slow. Hopefully we'll know more in
the near future.
Thanks to everyone who has been working on this bug!
Cheers,
- Ted