[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <508823D4.4010407@redhat.com>
Date: Wed, 24 Oct 2012 12:22:28 -0500
From: Eric Sandeen <sandeen@...hat.com>
To: "Theodore Ts'o" <tytso@....edu>, Nix <nix@...eri.org.uk>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
"J. Bruce Fields" <bfields@...ldses.org>,
Bryan Schumaker <bjschuma@...app.com>,
Peng Tao <bergwolf@...il.com>, Trond.Myklebust@...app.com,
gregkh@...uxfoundation.org,
Toralf Förster
<toralf.foerster@....de>
Subject: Re: Apparent serious progressive ext4 data corruption bug in 3.6.3
(and other stable branches?)
On 10/24/2012 12:23 AM, Theodore Ts'o wrote:
> On Tue, Oct 23, 2012 at 11:27:09PM -0500, Eric Sandeen wrote:
>>
>> Ok, fair enough. If the BBU is working, nobarrier is ok; I don't trust
>> journal_async_commit, but that doesn't mean this isn't a regression.
>
> Note that Toralf has reported almost exactly the same set of symptoms,
> but he's using an external USB stick --- and as far as I know he
> wasn't using nobarrier and/or the journal_async_commit. Toralf, can
> you confirm what, if any, mount options you were using when you saw
> it.
>
> I've been looking at this some more, and there's one other thing that
> the short circuit code does, which is neglects setting the
> JBD2_FLUSHED flag, which is used by the commit code to know when it
> needs to reset the s_start fields in the superblock when we make our
> next commit. However, this would only happen if the short circuit
> code is getting hit some time other than when the file system is
> getting unmounted --- and that's what Eric and I can't figure out how
> it might be happening. Journal flushes outside of an unmount does
> happen as part of online resizing, the FIBMAP ioctl, or when the file
> system is frozen. But it didn't sound like Toralf or Nix was using
> any of those features. (Toralf, Nix, please correct me if my
> assumptions here is wrong).
If I freeze w/ anything in the log, then s_start !=0 and we proceed
normally. If I re-freeze w/o anything in the log, it's already set to
FLUSHED (which makes sense) so not re-setting it doesn't matter. So I
don't see that that's an issue.
As for FIBMAP I think we only do journal_flush if it's data=journal.
In other news, Phoronix is on the case, so expect escalating freakouts ;)
-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists