Message-ID: <17734.32978.304514.114875@cse.unsw.edu.au>
Date: Tue, 31 Oct 2006 09:46:42 +1100
From: Neil Brown <neilb@...e.de>
To: "Andreas Paulsson" <andreas.paulsson@...arden.se>
Cc: <linux-kernel@...r.kernel.org>
Subject: Re: SV: PROBLEM: raid5 just dies
On Monday October 30, andreas.paulsson@...arden.se wrote:
> >Exactly how are aes-loop and raid5 connected together?
>
> We use 5x300gb drives in a raid5 array, which is then used as a physical
> disk in an lvm volume, with one logical volume. This logical volume is
> then encrypted with "losetup -e aes /dev/loop1 /dev/vg0/lv0", and then
> formatted with ReiserFS.
Thanks.
It could be a hardware problem....
The symptom is that we try to free some memory and a consistency check
tells us that the memory wasn't allocated. So a single bit error in
the address could be the cause. Running memtest86 for a while
wouldn't hurt if you haven't already done that.
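To illustrate why one flipped bit in an address would trip that kind of check, here is a toy sketch in Python. It is not kernel code: the allocator model, the addresses and all names are invented for illustration only.

```python
# Toy model: an allocator that remembers which addresses it handed out.
# All names and addresses here are hypothetical, not kernel internals.

class ToyAllocator:
    def __init__(self, base=0x1000, size=64):
        self.next_addr = base
        self.size = size
        self.live = set()          # addresses currently allocated

    def alloc(self):
        addr = self.next_addr
        self.next_addr += self.size
        self.live.add(addr)
        return addr

    def free(self, addr):
        # Consistency check: refuse to free something we never allocated.
        if addr not in self.live:
            raise ValueError(f"free of unallocated address {addr:#x}")
        self.live.remove(addr)


def flip_bit(addr, bit):
    """Return addr with one bit inverted, as a memory error might do."""
    return addr ^ (1 << bit)


alloc = ToyAllocator()
good = alloc.alloc()
bad = flip_bit(good, 20)   # single-bit corruption of the stored pointer

try:
    alloc.free(bad)
    outcome = "freed"
except ValueError as err:
    outcome = str(err)

print(outcome)
```

The corrupted value differs from the real one in a single bit, yet the allocator has no record of it, so the free is rejected even though the calling code did nothing wrong.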
You have three layers here: loop over dm over md/raid5.
So if it is a software problem it could be in any of these layers, or
in an interaction between two of them.
1/ how repeatable is this?
2/ how much room have you got to experiment?
Could you remake the array without the loop/aes and see if you can
reproduce the problem?
Could you remake the array without the LVM layer and see if you can
reproduce the problem?
Do you have CONFIG_DEBUG_PAGEALLOC and CONFIG_DEBUG_SLAB set? If not,
could you recompile with those set to see if they provide more helpful
information?
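One way to check the running kernel is sketched below in Python. It assumes the configuration is exposed as /proc/config.gz or as /boot/config-<release>; both are common conventions but not every kernel or distribution provides them, in which case the .config in your build tree is the place to look. The function names are made up for this example.

```python
import gzip
import platform

WANTED = ("CONFIG_DEBUG_PAGEALLOC", "CONFIG_DEBUG_SLAB")


def enabled_options(config_text, wanted=WANTED):
    """Map each wanted option to True if the config text sets it to y."""
    lines = {line.strip() for line in config_text.splitlines()}
    return {opt: f"{opt}=y" in lines for opt in wanted}


def read_kernel_config():
    """Return the running kernel's config text, or None if not found.

    The two locations tried here are assumed conventions, not guarantees.
    """
    candidates = [
        ("/proc/config.gz", gzip.open),
        (f"/boot/config-{platform.release()}", open),
    ]
    for path, opener in candidates:
        try:
            with opener(path, "rt") as fh:
                return fh.read()
        except OSError:
            continue  # missing or unreadable; try the next location
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found; check the .config in your build tree")
    else:
        for opt, on in enabled_options(text).items():
            print(opt, "set" if on else "NOT set")
```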
I must admit I am somewhat at a loss. I cannot see much room for
problems leading to that particular point in the code that would not
be seen by lots more people than just you.
NeilBrown