Date:	Sun, 14 Dec 2008 04:05:11 -0500 (EST)
From:	Justin Piszcz <jpiszcz@...idpixels.com>
To:	Redeeman <redeeman@...anurb.dk>
cc:	Bill Davidsen <davidsen@....com>, linux-kernel@...r.kernel.org,
	linux-raid@...r.kernel.org, xfs@....sgi.com,
	smartmontools-support@...ts.sourceforge.net
Subject: Re: Have the velociraptors in a test system now, checkout the
 errors.



On Sun, 14 Dec 2008, Redeeman wrote:

> On Wed, 2008-12-10 at 19:11 -0500, Bill Davidsen wrote:
>> Justin Piszcz wrote:
>>> Point of thread: two problems, described in detail below. One, NCQ in
>>> Linux when used in a RAID configuration; two, something in how Linux
>>> interacts with the drives causes lots of errors, yet when I run the WD
>>> tools on the disks, they do not show any errors.
>>>
>>> If anyone has/would like me to run any debugging/patches/etc on this
>>> system feel free to suggest/send me things to try out.  After I put
>>> the VR's in a test system, I left NCQ enabled and I made a 10 disk
>>> raid5 to see how fast I could get it to fail, I ran bonnie++ shown
>>> below as a disk benchmark/stress test:
>>>
>>> For the next test I will repeat this one but with NCQ disabled, having
>>> NCQ enabled makes it fail very easily.  Then I want to re-run the test
>>> with RAID6.
>>>
>>> bonnie++ -d /r1/test -s 1000G -m p63 -n 16:100000:16:64
>>>
>>> $ df -h
>>> /dev/md3              2.5T  5.5M  2.5T   1% /r1
>>>
>>> And the results?  Two disk "failures" according to md/Linux within a
>>> few hours as shown below:
>>>
>>> Note, the NCQ-related errors are what I talk about all of the time: if
>>> you use NCQ and Linux in a RAID environment with WD drives, well, good
>>> luck.
>>>
>>> Two disks failed out of the RAID5 and I currently cannot even 'see'
>>> one of the drives with smartctl; I will reboot the host and check sde
>>> again.
>>>
>>> After a reboot, it comes up and has no errors, which really makes one
>>> wonder where/what the bug(s) is/are. There are two I can see:
>>> 1. NCQ issue on at least WD drives in Linux in SW md/RAID
>>> 2. Velociraptor/other disks reporting all kinds of sector errors etc,
>>> but when you use the WD 11.x disk tools program and run all of their
>>> tests it says the disks have no problems whatsoever!  The SMART
>>> statistics do confirm this.  Currently, TLER is on for all disks, for
>>> the duration of these tests.
>>
>> Just a few comments on this: I have several RAID arrays built on Seagate
>> using NCQ, and have yet to have a problem. I have NCQ on with my WD drives,
>> non-RAID, and haven't had an issue with them either. The WDs run a lot
>> cooler than the SG, but they are probably getting less use, as well. If
>> the WD are still on sale after the holiday I may grab a few more and run
>> RAID, by then I will have some small sense of trusting them.
> Velociraptors, or which WD?
Velociraptors, Raptors, or 750GiB disks (in Linux SW RAID). The NCQ issue
does not appear to occur on Raptors on a 3ware card, though, only in
Linux + SW RAID.  The regular Raptor/750GiB disks also run just fine
standalone with NCQ enabled.
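
For anyone who wants to repeat the NCQ on/off comparison, here is a minimal
sketch of toggling NCQ per disk. It assumes the disks are driven by libata
and expose /sys/block/<dev>/device/queue_depth, where a depth of 1
effectively disables NCQ. The function names and the sysfs_root parameter
are hypothetical, added only so the logic can be tried against a scratch
directory; writing to the real sysfs tree needs root.

```python
from pathlib import Path


def queue_depth_path(device, sysfs_root="/sys"):
    """Return the sysfs file holding the queue depth for a disk such as 'sde'."""
    return Path(sysfs_root) / "block" / device / "device" / "queue_depth"


def read_queue_depth(device, sysfs_root="/sys"):
    """Read the current queue depth as an integer."""
    return int(queue_depth_path(device, sysfs_root).read_text().strip())


def set_queue_depth(device, depth, sysfs_root="/sys"):
    """Write a new queue depth and return the previous one.

    Assumption: on libata-driven SATA disks a depth of 1 turns NCQ off,
    and a larger value (commonly 31) turns it back on.
    """
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    previous = read_queue_depth(device, sysfs_root)
    queue_depth_path(device, sysfs_root).write_text(f"{depth}\n")
    return previous


def disable_ncq(devices, sysfs_root="/sys"):
    """Set depth 1 on each device; return {device: previous depth} for restoring."""
    return {dev: set_queue_depth(dev, 1, sysfs_root) for dev in devices}
```

Calling disable_ncq() with the array member names before a benchmark run,
and keeping the returned dict, gives the values to pass back to
set_queue_depth() afterwards.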

Justin.