Message-ID: <20100712193726.GB12804@atrey.karlin.mff.cuni.cz>
Date: Mon, 12 Jul 2010 21:37:27 +0200
From: Jan Kara <jack@...e.cz>
To: "Amir G." <amir73il@...rs.sourceforge.net>
Cc: Eric Sandeen <sandeen@...hat.com>,
Ric Wheeler <rwheeler@...hat.com>,
Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH] fix for consistency errors after crash
Hi,
> I've seen you guys had some open RH bugs on ext3, which all share
> the "bit already free" error.
>
> This bug I reported can explain many different problems in ext[34].
>
> Essentially, every time there is a kernel crash (or hard reboot)
> during delete/truncate of a large file,
> it may result in "bit already clear" error after reboot.
>
> The problem is very simple and so is the fix.
> I proved the problem with 100% recreation chances using a small patch,
> instead of running statistical stress tests.
> All I did was to add a print and 10 seconds delay after transaction
> restart in ext3_free_branches and reboot > 5 seconds after the
> transaction restarts, so that kjournald will have time to commit the
> old transaction.
> After the reboot, I always get "bit already clear" errors, because the
> "half large truncate" transaction is not handled properly.
>
> I did not get any response from ext4 guys so far and since this bug
> dates back to ext3,
> I was hoping you guys could take a look and put your weight on pushing
> the fix upstream.
Thanks for the ping. Your analysis and the fix look correct to me. Attached is
a fix of the problem for ext3, which I'll merge if no one objects. BTW: I've also
updated a comment a bit, so you might want to include that in an ext4 patch as
well.
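
For anyone following the thread, the failure mode in the quoted report can
be sketched with a toy user-space model. This is not the ext3 code or the
attached patch. ToyDisk, truncate and crash_and_recover are made-up names,
and the model rests on an assumption: that the trouble comes from a
committed transaction clearing bitmap bits while the pointers to those
blocks are still on disk. With that assumption, redoing the truncate after
a reboot hits bits that are already clear. Clearing the pointers in the
same transaction as the bits avoids it.

```python
class ToyDisk:
    """Committed (on-disk) state: block bitmap plus one inode's block pointers."""

    def __init__(self, blocks):
        self.bitmap = set(blocks)      # blocks whose bitmap bit is still set
        self.pointers = list(blocks)   # pointers still reachable from the inode


def truncate(disk, batch_size, clear_pointers_with_bits, crash_after=None):
    """Free the inode's blocks, one committed transaction per batch.

    Returns the blocks whose bit was already clear when we tried to free them.
    crash_after=N simulates a crash once N transactions have been committed.
    """
    errors = []
    todo = list(disk.pointers)
    committed = 0
    for start in range(0, len(todo), batch_size):
        if crash_after is not None and committed == crash_after:
            return errors              # crash: nothing further reaches the disk
        batch = todo[start:start + batch_size]
        for blk in batch:
            if blk in disk.bitmap:
                disk.bitmap.discard(blk)
            else:
                errors.append(blk)     # the "bit already clear" case
        if clear_pointers_with_bits:
            # Drop the pointers in the same transaction as the bits.
            disk.pointers = [p for p in disk.pointers if p not in batch]
        committed += 1
    disk.pointers = []                 # truncate finished: no pointers remain
    return errors


def crash_and_recover(clear_pointers_with_bits):
    """Crash after the first committed transaction, then redo the truncate."""
    disk = ToyDisk(range(100, 108))
    truncate(disk, 4, clear_pointers_with_bits, crash_after=1)
    # After reboot, cleanup walks whatever pointers are still on disk.
    return truncate(disk, 4, clear_pointers_with_bits)
```

In this model, crash_and_recover(False) reports the four blocks freed by
the first transaction as already clear, while crash_and_recover(True)
reports none.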
Honza
--
Jan Kara <jack@...e.cz>
SuSE CR Labs
View attachment "0001-ext3-Avoid-filesystem-corruption-after-a-crash-under.patch" of type "text/x-diff" (3410 bytes)