Message-id: <ccdefeca5c690.4aa04e8e@shaw.ca>
Date: Thu, 03 Sep 2009 23:17:34 -0600
From: Thomas Fjellstrom <tfjellstrom@...w.ca>
To: David Rees <drees76@...il.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: mvsas issues
----- Original Message -----
From: David Rees <drees76@...il.com>
Date: Thursday, September 3, 2009 6:32 pm
Subject: Re: mvsas issues
To: tfjellstrom@...w.ca
Cc: linux-kernel@...r.kernel.org
> On Thu, Sep 3, 2009 at 5:09 PM, Thomas Fjellstrom <tfjellstrom@...w.ca> wrote:
> > I just got a Marvell SAS card, And I've been trying to copy one set of disks
> > over to another before one of the drives dies for good...
>
> Have you tried using one of the rescue oriented versions of dd to make
> a copy of the failing disk? Then copy the image to a good disk and
> recreate your array (do it in read-only mode!) and see what you can
> recover.
>
I haven't yet. With 2.6.30 there doesn't seem to be an mvsas driver, in 2.6.31-rc5 it OOPSes the kernel, and in 2.6.31-rc8 it locks up the entire Marvell controller (that is, all ports start returning errors), so ALL drives on it are useless. One additional problem is that the drive that's giving up the ghost is one of two 2TB disks in a temporary md raid0 array. The rest of my disks are all 1TB or smaller, so trying to back up one half of the raid0 pair to a couple of separate 1TB disks would be a pain, to say the least. And last but not least, today the drive was deciding to error out after 10 minutes of any kind of actual use. It works for a while after boot no matter which controller it's on, but after a bit of load it croaks.
I've decided to give the system and the drive a rest for tonight and try to get it backed up in the morning. It's possible the temperature here was causing the drive to get a little too warm (60°C+?). It's strange, though. The drive can last 12 hours as long as the load is very light. I might get some ATA errors, but nothing libata can't handle with a reset (it causes a brief pause, but stuff keeps going). The more I try to do, though, the faster it decides to throw an error. The errors I got on the AMD SATA controller were very similar to a recent report about high-rate transfers causing libata/Linux to somehow lose the connection with the host. It's also the same error I've seen people attribute to possible power or interrupt issues. But the errors are completely different on the mvsas controller.
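To illustrate the splitting problem above: a rescue-style copy that spreads one large source over several smaller destinations, and zero-fills blocks that fail to read instead of aborting, could be sketched roughly as follows. This is only a sketch under assumptions, not how any particular dd variant works; the function name `rescue_copy` and its parameters are made up for illustration, and real rescue tools also retry, shrink the block size near bad areas, and keep a log so a run can be resumed. It relies on `os.pread`, so it assumes a Unix-like system.

```python
import os


def rescue_copy(src_path, dest_paths, dest_capacity, block_size=64 * 1024):
    """Copy src_path into several destination files of at most dest_capacity
    bytes each, zero-filling any block that raises a read error.

    Hypothetical helper for illustration. Returns a list of (offset, length)
    ranges in the source that could not be read.
    """
    bad_ranges = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or a block device.
        src_size = os.lseek(fd, 0, os.SEEK_END)
        if src_size > dest_capacity * len(dest_paths):
            raise ValueError("destinations are too small for the source")

        offset = 0
        for dest_path in dest_paths:
            if offset >= src_size:
                break
            with open(dest_path, "wb") as out:
                written = 0
                while written < dest_capacity and offset < src_size:
                    want = min(block_size,
                               dest_capacity - written,
                               src_size - offset)
                    try:
                        data = os.pread(fd, want, offset)
                    except OSError:
                        # Unreadable block: remember it and keep going.
                        data = b"\0" * want
                        bad_ranges.append((offset, want))
                    if not data:
                        # Unexpected end of source: stop cleanly.
                        src_size = offset
                        break
                    out.write(data)
                    written += len(data)
                    offset += len(data)
    finally:
        os.close(fd)
    return bad_ranges
```

With a source such as one member of the raid0 pair and two image files on separate 1TB disks, the call would look something like `rescue_copy("/dev/sdX", ["/mnt/a/part0.img", "/mnt/b/part1.img"], dest_capacity)`, where the device and paths are placeholders. Concatenating the pieces in order gives back the full image, and the returned ranges show which parts were zero-filled.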
> http://en.wikipedia.org/wiki/Dd_%28Unix%29#Recovery-oriented_variants_of_dd
> -Dave
>