Message-ID: <6601abe90906051442t294c2e79i3c6dc38c1d53e5e0@mail.gmail.com>
Date:	Fri, 5 Jun 2009 14:42:18 -0700
From:	Curt Wohlgemuth <curtw@...gle.com>
To:	Theodore Tso <tytso@....edu>, Aioanei Rares <krnl.list@...il.com>,
	Alan Jenkins <alan-jenkins@...fmail.co.uk>,
	linux-ext4@...r.kernel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Mild filesystem corruption on ext4 (no journal)

On Fri, Jun 5, 2009 at 11:01 AM, Theodore Tso<tytso@....edu> wrote:
> On Fri, Jun 05, 2009 at 05:40:33PM +0300, Aioanei Rares wrote:
>>> When I upgrade libc from 2.7 (debian stable) to 2.9 (debian unstable),
>>> the locale breaks every reboot, and I have to repair it by running
>>> locale-gen.  This happened now when I only upgraded libc, in order to
>>> play with signalfd().  It also happened before, when I upgraded the
>>> entire machine to debian unstable (which I later reverted).
>>>
>>> The problem is that /usr/lib/locale/locale-archive gets corrupted when
>>> I reboot.  The exact corruption differs with each reboot (i.e. the
>>> md5sum differs).  Last time, the first ~70K was overwritten with data
>>> from xorg.log and my web browsing history.  I have copies of the
>>> original and corrupted state which I can send, the full file is 1.3
>>> megs, but I can limit it to the first 70K, since that's all that was
>>> corrupted.
>
>> I suspect, although I might be wrong, that this is not a kernel-related
>> problem.
>
> Actually, I suspect it is indeed a kernel-related problem.  The
> problem has been reported before, with a repeatable test case:
>
>        http://bugzilla.kernel.org/show_bug.cgi?id=13292
>
> The problem shows up after you unmount and remount the filesystem.
> Before the filesystem is unmounted, the locale-archive file has
> the correct md5sum.  After you unmount and remount the filesystem, the
> filesystem is corrupted.  I'm guessing that some data blocks aren't
> getting marked as needing writeback, so the previous contents on disk
> aren't written back.  I was able to show that even though the mounted
> filesystem had the correct information, direct access to the disk
> using debugfs showed the blocks on disk had the contents that would be
> revealed after the filesystem was unmounted and remounted.
>
> The problem only shows up when using ext4 without a journal, and I was
> never able to create a simpler reproduction case.  The last time I
> tried to work on this bug was approximately a month ago.  About two
> weeks ago Frank from Google tried reproducing it, but he wasn't able
> to do so using his 2.6.26-based kernel plus an updated ext4.
> Unfortunately, I haven't had time to look at it since then, or to
> check to see if some of the more recent patches scheduled for the
> 2.6.31 merge window might have changed the behaviour of this bug.
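
[Illustration, not part of the original message.] The check described above (the md5sum is correct before the unmount and wrong after the remount, with the damage confined to roughly the first 70K) can be sketched as a comparison of a saved good copy against the corrupted file. This is only a minimal sketch: the helper names `md5_of` and `differing_blocks` and the 4096-byte `BLOCK_SIZE` are assumptions made for the example, not anything taken from the thread or the bug report.

```python
import hashlib

# Assumed block size for the comparison; set it to the filesystem's real block size.
BLOCK_SIZE = 4096


def md5_of(path):
    """Return the hex MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_blocks(good_path, bad_path, block_size=BLOCK_SIZE):
    """Return the indices of the blocks whose contents differ between two files."""
    diffs = []
    with open(good_path, "rb") as good, open(bad_path, "rb") as bad:
        index = 0
        while True:
            a = good.read(block_size)
            b = bad.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                diffs.append(index)
            index += 1
    return diffs
```

Comparing `md5_of()` on a copy saved before the unmount with the same call on the file after the remount shows whether the corruption occurred. `differing_blocks()` then lists which blocks changed, which helps narrow down the blocks worth inspecting on disk.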

Just FYI: Frank Mayhar has recreated this issue in a recent kernel
(though we're not seeing it with our 2.6.26 kernel + ext4 patches),
and is actively working on it.

Curt

>
>                                           - Ted
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>