Date:	Sun, 28 Jun 2009 22:30:37 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	Krzysztof Kosiński <tweenk.pl@...il.com>
CC:	linux-ext4@...r.kernel.org
Subject: Re: Massive corruption on RAID0

Krzysztof Kosiński wrote:
> Hello
> 
> Here is my story: I recently migrated a server from Windows to Ubuntu
> 9.04. I formatted all disks with ext4. The server has 5 disks: three
> SCSI (9GB for /, 2x18GB for /data/small and /home) and two IDE
> (2x300GB). I put the IDE disks in a RAID0: each had a single partition
> with type set to "fd", and the entire resulting device (/dev/md0) was
> formatted with ext4 as:
> 
> mkfs.ext4 -b 4096 -E stride=16 /dev/md0
> 
> All was well until a power outage that left the filesystem on /dev/md0
> unmountable (the others were fine after an fsck). I made a backup of
> the corrupted array to another disk and ran fsck, but it ended up in
> an infinite loop. After some unsuccessful tinkering, I restored the
> backup and found out that large portions of the group descriptor table
> are filled with random junk. Moreover, all backup superblocks are
> either corrupted or zeroed, and I found partial copies of an
> identically corrupted table at various weird offsets (including
> 176888, 600344, 1036536, 1462520, 1887256 and 5326832); none of
> these copies was preceded by anything resembling a superblock. Here
> is a copy of the first 39 blocks of the corrupted disk:
> http://tweenk.artfx.pl/super.bin
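
For reference when reading the offsets quoted above: with the sparse_super feature, ext4 keeps backup superblocks and group descriptor copies only in block groups 0, 1 and powers of 3, 5 and 7. The sketch below lists where those backups would normally start. It rests on assumptions not stated in the report: that sparse_super is enabled (the usual mke2fs default), that there are 32768 blocks per group (the default for 4096-byte blocks), and that the quoted offsets are block numbers. The `backup_groups` and `backup_superblock_blocks` helpers are hypothetical names for illustration only.

```python
# Assumption: mke2fs default of 8 * block_size blocks per group for 4096-byte blocks.
BLOCKS_PER_GROUP = 32768


def backup_groups(group_count):
    """Groups that carry a superblock/GDT copy under sparse_super:
    0, 1, and every power of 3, 5 and 7 below group_count."""
    groups = {0, 1}
    for base in (3, 5, 7):
        g = base
        while g < group_count:
            groups.add(g)
            g *= base
    return sorted(g for g in groups if g < group_count)


def backup_superblock_blocks(total_blocks, blocks_per_group=BLOCKS_PER_GROUP):
    """Starting block of each backup superblock (primary in group 0 excluded).
    With 4096-byte blocks the first data block is 0, so group g starts at
    g * blocks_per_group."""
    group_count = -(-total_blocks // blocks_per_group)  # ceiling division
    return [g * blocks_per_group for g in backup_groups(group_count) if g > 0]


# Offsets quoted in the report; treating them as block numbers is an assumption.
reported = [176888, 600344, 1036536, 1462520, 1887256, 5326832]

if __name__ == "__main__":
    # Hypothetical size: the first ten groups of a 4 KiB-block filesystem.
    print(backup_superblock_blocks(10 * BLOCKS_PER_GROUP))
    # Which reported offsets fall on a group boundary at all?
    print([off for off in reported if off % BLOCKS_PER_GROUP == 0])
```

Under those assumptions, the usual recovery attempt would point e2fsck at one of the listed blocks with `-b`, together with `-B 4096`.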

It's awfully hard to say what went wrong given this information.
However, a power failure means that whatever was in the drives' write
caches is lost, and without barriers (which md raid0 won't pass, IIRC)
the journal's ordering guarantees are shot, so corruption can happen.
Still, I would not expect huge swaths of crud sprinkled over the drive.

Is the super.bin above from before the fiddling you did (i.e. right
after the power loss)?  The superblock is marked with errors, so I
wonder whether other errors were reported on the filesystem prior to
the power loss; you might check your logs ...
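
For anyone wanting to look at that flag in an image such as super.bin, here is a minimal sketch, not a substitute for dumpe2fs. It relies on the documented ext2/3/4 layout: the primary superblock starts 1024 bytes into the device, with the little-endian 16-bit fields s_magic at offset 0x38 and s_state at 0x3A. The function names and the idea of passing a local file path are illustrative assumptions.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1024 bytes into the device
SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53         # s_magic value for ext2/ext3/ext4

# s_state bits from the on-disk format
STATE_FLAGS = {
    0x0001: "cleanly unmounted",
    0x0002: "errors detected",
    0x0004: "orphans being recovered",
}


def describe_superblock(raw):
    """Decode s_magic and s_state from a raw superblock buffer."""
    if len(raw) < 0x3C:
        raise ValueError("buffer too short to hold s_magic and s_state")
    magic, state = struct.unpack_from("<HH", raw, 0x38)
    flags = [name for bit, name in STATE_FLAGS.items() if state & bit]
    return {"magic_ok": magic == EXT_MAGIC, "state": state, "flags": flags}


def read_primary_superblock(path):
    """Read the primary superblock from a device or image file (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return f.read(SUPERBLOCK_SIZE)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(describe_superblock(read_primary_superblock(sys.argv[1])))
```

A state that includes "errors detected" only says the kernel or e2fsck flagged the filesystem at some point; it does not say when, which is why the logs are worth checking.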

-Eric