Message-ID: <2CE44BD3DBCF9541909CCB42F11CA392825D27@SFO1EXC-MBXP06.nbttech.com>
Date:	Wed, 6 Jun 2012 05:44:47 +0000
From:	Ming Lei <Ming.Lei@...erbed.com>
To:	Eric Sandeen <sandeen@...hat.com>
CC:	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: RE: ext4 corruption during unexpected power cycle in the middle of
 writing

Is this behavior documented somewhere?

-----Original Message-----
From: Eric Sandeen [mailto:sandeen@...hat.com] 
Sent: Tuesday, June 05, 2012 10:32 PM
To: Ming Lei
Cc: linux-ext4@...r.kernel.org
Subject: Re: ext4 corruption during unexpected power cycle in the middle of writing

On 6/6/12 12:24 AM, Ming Lei wrote:
> I ran a power-cycle test in the middle of a file write. After bootup, I forced an fsck and found two errors (with fsck -p -v I don't see them). From what I saw, I think it is file system metadata corruption. Fsck can repair it, but every time I run the same test I hit the same issue.
> 
> I don't think it is relevant, but I want to point out that sda6 shares the drive with another partition (sda3), which is used for the raid6 array for /var.
> 
> The same issue occurs whether barriers are on or off, and whether the disk drive write cache is enabled or disabled. The test result shown below is with barriers on and the disk write cache disabled.
> 
> I use kernel version 2.6.32SL6. I also see the same issue with a 2.6.9-based kernel on the same hardware with an ext3 file system.
> 
> My question is:
> 1) Is the issue caused by something unique to my box? A configuration error?
> 2) Is it possible my version of fsck reported false errors?

Sort of.  You got:

> Free blocks count wrong (118366120, counted=76269471).
> Fix? yes
> 
> Free inodes count wrong (30081013, counted=30081004).
> Fix? yes

Those are the superblock counters, which aren't journaled - only the block group (bg) counters are logged via the journal, IIRC.

They aren't false... they are just expected due to the design I'm afraid.

If you had mounted and then unmounted the filesystem before running fsck, you wouldn't have seen the errors, because at mount time the superblock gets updated from all of the individual bg counters in ext4_fill_super:

        /*
         * The journal may have updated the bg summary counts, so we
         * need to update the global counters.
         */
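
A minimal sketch in C of the idea described above; this is not the kernel code. It assumes a simplified model in which each block group keeps its own free counts (the journaled ones) and the superblock keeps global totals that may be stale after a crash. The struct and function names (group_desc, super_block_counts, sum_group_counts, counts_are_stale, refresh_super_counts) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block group descriptor. */
struct group_desc {
    uint32_t free_blocks;   /* per-group count, updated via the journal */
    uint32_t free_inodes;
};

/* Hypothetical stand-in for the global counters in the superblock. */
struct super_block_counts {
    uint64_t free_blocks;   /* global summary, not journaled */
    uint64_t free_inodes;
};

/* Add up the per-group counters to get the authoritative totals. */
static struct super_block_counts
sum_group_counts(const struct group_desc *groups, size_t ngroups)
{
    struct super_block_counts total = {0, 0};

    assert(groups != NULL || ngroups == 0);
    for (size_t i = 0; i < ngroups; i++) {
        total.free_blocks += groups[i].free_blocks;
        total.free_inodes += groups[i].free_inodes;
    }
    return total;
}

/* Returns 1 when the stored global counts disagree with the group sums,
 * which is the kind of mismatch a forced fsck reports after a crash. */
static int
counts_are_stale(const struct super_block_counts *sb,
                 const struct group_desc *groups, size_t ngroups)
{
    struct super_block_counts total = sum_group_counts(groups, ngroups);

    return sb->free_blocks != total.free_blocks ||
           sb->free_inodes != total.free_inodes;
}

/* In this model, a mount overwrites the global counts with the group sums. */
static void
refresh_super_counts(struct super_block_counts *sb,
                     const struct group_desc *groups, size_t ngroups)
{
    *sb = sum_group_counts(groups, ngroups);
}
```

In this model, a crash leaves counts_are_stale() returning 1 until refresh_super_counts() runs, which corresponds to the mount-time update: an fsck forced before that refresh sees the mismatch, one run after it does not.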

> 3) Is this a known issue? Is it a kernel bug?

Yes, it is known.  No, not really a kernel bug.  ;)

> 4) How do I find what's wrong?

I think this is by design, though maybe a little unfortunate in that it is unexpected to get fsck errors on a journaling filesystem after a crash...

I ran into this same thing when doing recovery testing for > 16T filesystems.

-Eric
