Date:	Thu, 18 Jan 2007 21:11:58 +0100
From:	noah <noah123@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: Data corruption with raid5/dm-crypt/lvm/reiserfs on 2.6.19.2

Hi!

I'm experiencing data corruption in the following setup:

1. mdadm --create /dev/md0 -n3 -lraid5 /dev/hda1 /dev/hdc1 /dev/hde1
2. cryptsetup -c aes-cbc-essiv:sha256 luksFormat /dev/md0 mykey
3. cryptsetup -d mykey luksOpen /dev/md0 cryptvol
4. pvcreate /dev/mapper/cryptvol
5. vgcreate vg0 /dev/mapper/cryptvol
6. lvcreate -n root  -L10G vg0
7. mkreiserfs -q /dev/vg0/root
8. mkdir /.newroot; mount /dev/vg0/root /.newroot
9. mkdir /.realroot; mount -o bind / /.realroot
10. tar cf - -C /.realroot . | tar xvpf - -C /.newroot

With Linux 2.6.18 (known to be broken, but something is still wrong
even in 2.6.19.2, so keep reading) I started getting warnings from
ReiserFS indicating severe data corruption.  reiserfsck confirmed
this.  It usually happened while extracting the Linux source tree.

After asking around I found out that dm-crypt had a bug[1] which was
fixed in early December.  The fix went into 2.6.19 and was backported
and included in 2.6.18.6[2].

Fine, so I upgraded to 2.6.18.6, rebuilt the array from scratch and
did the whole procedure again.  No messages from reiserfs in dmesg
this time, but reiserfsck still revealed severe data corruption.
I also found that compressed archives and ISO images for which I had
md5sums were corrupt.
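
A check like the one described above can be scripted.  This is only a
rough sketch, assuming GNU md5sum and a checksum list in its usual
"<hash>  <filename>" format; the function name and the example paths
are made up for illustration:

```shell
# verify_md5_list DIR LIST
# Re-check every file named in LIST (md5sum format, file names relative
# to DIR).  LIST should be an absolute path, because the check runs
# from inside DIR.  Prints only the files that fail and returns
# non-zero if any of them do.
verify_md5_list() {
    ( cd "$1" && md5sum -c --quiet "$2" )
}

# Example (hypothetical paths):
#   verify_md5_list /.newroot/isos /root/isos.md5
```

Running it once right after the copy and again after some hours of
normal use helps tell corruption during the copy apart from corruption
that shows up later.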

I then upgraded to 2.6.19.2 with the exact same result as with 2.6.18.6.
I even verified this on a fairly new computer with different hardware
(Intel CPU and chipset).

Suspecting some kind of race condition, on my second try on 2.6.19.2
I recreated the array and let md finish resyncing it before copying
over the files.
This time, reiserfsck didn't complain.
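
Waiting for md to finish can be scripted by polling /proc/mdstat.  A
rough sketch, assuming the usual progress lines ("resync = ..." or
"recovery = ...", the latter being what a freshly created RAID5
normally shows); the function names are made up for illustration:

```shell
# md_sync_active [FILE]
# Succeeds (returns 0) if the mdstat-format FILE shows a resync or
# recovery that is running, delayed or pending.  FILE defaults to
# /proc/mdstat.
md_sync_active() {
    grep -Eq '(resync|recovery) *=' "${1:-/proc/mdstat}"
}

# wait_for_md_sync [FILE] [SECONDS]
# Blocks until no resync/recovery is reported, polling every SECONDS
# seconds (default 10).
wait_for_md_sync() {
    while md_sync_active "${1:-/proc/mdstat}"; do
        sleep "${2:-10}"
    done
}

# Example: wait_for_md_sync && echo "array is in sync"
```

Note that this looks at every array listed in the file, not only md0.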

Just for fun, I did the whole thing again: rebuilt the array from
scratch, let md resync the third drive and then started copying over
all the files again.  Thinking the cause of the problem was heavy disk
I/O, I tried to stress the other LVM volumes residing on md0 using tar
during the copy.  Everything seemed fine; no problems arose.

Did a few reboots and confirmed that reiserfsck didn't have any
complaints on any of the filesystems residing on the LVM volumes on
md0.

I started using the machine as normal, and half a day later I unmounted
the filesystems and ran reiserfsck just to make sure everything was
still OK.  Unfortunately, it wasn't.


The array consists of three brand new IDE drives, 2x250GB and one
200GB.
According to SMART there are no problems with them.  And they worked
fine in my previous RAID1 setup with dm-crypt and LVM, by the way.
The computer itself is an Athlon XP with less than 1GB of RAM on an M/B
with an nForce2 chipset, FWIW.  No memory errors were detected with
memtest86+ (I completed the full test).
I haven't tried another filesystem, as I've got quite a lot of
faith in reiserfs's stability.

Is anybody else experiencing these problems?
Unfortunately I'm only able to do limited testing due to busy days,
but I'd love to help if I can.


[1] Here's a thread on the recently fixed data corruption bug in dm-crypt
http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/1974

[2] The backport of the dm-crypt fix for 2.6.18.6 is here
http://uwsg.iu.edu/hypermail/linux/kernel/0612.1/2299.html