Date:	Tue, 31 Oct 2006 17:09:54 +0100
From:	"Andreas Paulsson" <andreas.paulsson@...arden.se>
To:	"Neil Brown" <neilb@...e.de>
Cc:	<linux-kernel@...r.kernel.org>
Subject: RE: RE: RE: PROBLEM: raid5 just dies

>On Monday October 30, andreas.paulsson@...arden.se wrote:
>> >Exactly how are aes-loop and raid5 connected together?
>> 
>> We use 5x300gb drives in a raid5 array, which is then used as a
>> physical disk in an lvm volume, with one logical volume. This logical
>> volume is then encrypted with "losetup -e aes /dev/loop1
>> /dev/vg0/lv0", and then formatted with ReiserFS.

>Thanks.
>
>It could be a hardware problem....
>The symptom is that we try to free some memory and a
>consistency check tells us that the memory wasn't allocated.
>So a single bit error in the address could be the cause.
>Running memtest86 for a while wouldn't hurt if you haven't
>already done that.

I'll see if I can get someone to do a memtest for me (the box is 600km
away).

>You have three layers here: loop over dm over md/raid5.
>So if it is a software problem it could be in any of these layers,
>or in an interaction between two of them.
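
For readers following the thread, the three layers (loop over dm over md/raid5) can be sketched as the sequence of commands that would build such a stack. This is only a dry-run illustration that prints the commands and runs nothing. Only the losetup line, /dev/loop1 and /dev/vg0/lv0 come from the message; /dev/md0, the /dev/sd[b-f]1 member names and the lvcreate sizing are assumptions, and the raid1 devices that are also part of the volume group are left out.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for the described stack, executes none.
# ASSUMPTIONS: /dev/md0 and /dev/sd[b-f]1 are hypothetical device names;
# only /dev/loop1, /dev/vg0/lv0 and the losetup line are from the message.
print_stack_commands() {
    md=/dev/md0   # hypothetical md array name
    cat <<EOF
mdadm --create $md --level=5 --raid-devices=5 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
pvcreate $md
vgcreate vg0 $md
lvcreate -l 100%FREE -n lv0 vg0
losetup -e aes /dev/loop1 /dev/vg0/lv0
mkreiserfs /dev/loop1
EOF
}

print_stack_commands
```

Reading the output bottom-up gives the I/O path during the failing copy: ReiserFS writes to the loop device, which encrypts and writes to the logical volume (dm), which in turn writes to the md/raid5 array.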

>1/ how repeatable is this?

cp -arv source target
.. and then wait for a while, and it crashes. That is, 100% repeatable.

>2/ how much room have you got to experiment?
>Could you remake the array without the loop/aes and see if you can
>reproduce the problem?

I could, but then I would need to redo all the work again, since we need
the disk to be encrypted too.

>Could you remake the array without the LVM layer and see if you can
>reproduce the problem?

No, this is impossible, we need to have LVM because we are making one
big volume that consists of raid5 and raid1 devices.

>Do you have CONFIG_DEBUG_PAGEALLOC and CONFIG_DEBUG_SLAB set?
>If not, could you recompile with those set to see if they
>provide more helpful information?

I can try and see if it helps.
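
One way to check whether the two options are already enabled is to look for them in the kernel's .config file. The sketch below is a minimal helper for that; the function name, the output format and the example path are illustrative assumptions, not something from the thread.

```shell
#!/bin/sh
# check_debug_opts FILE: report whether each debug option is enabled in FILE.
# ASSUMPTION: FILE is a kernel .config; the function name and the
# "set" / "not set" output format are hypothetical.
check_debug_opts() {
    config=$1
    for opt in CONFIG_DEBUG_PAGEALLOC CONFIG_DEBUG_SLAB; do
        # An enabled built-in option appears as a line of the form OPTION=y.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "$opt: set"
        else
            echo "$opt: not set"
        fi
    done
}

# Example (path is an assumption): check_debug_opts /usr/src/linux/.config
```

If either option is reported as not set, it can be enabled in the kernel configuration before rebuilding.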

>I must admit I am somewhat at a loss.
>I cannot see much room for problems leading to that
>particular point in the code that would not be seen
>by lots more people than just you.

Heh, you're not the only one that's lost.

>NeilBrown

/Andreas