Date:	Thu, 21 Jun 2007 10:40:50 -0400 (EDT)
From:	Justin Piszcz <jpiszcz@...idpixels.com>
To:	Mattias Wadenstein <maswan@....umu.se>
cc:	Neil Brown <neilb@...e.de>, David Chinner <dgc@....com>,
	Avi Kivity <avi@...o.co.il>, david@...g.hm,
	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: limits on raid



On Thu, 21 Jun 2007, Mattias Wadenstein wrote:

> On Thu, 21 Jun 2007, Neil Brown wrote:
>
>> I have that - apparently naive - idea that drives use strong checksums
>> and will never return bad data, only good data or an error.  If this
>> isn't right, then it would really help to understand what the causes of
>> the other failures are before working out how to handle them....
>
> In theory, that's how storage should work. In practice, silent data 
> corruption does happen - if not from the disks themselves, then somewhere 
> along the path of cables, controllers, drivers, buses, etc. If you add in 
> FC-AL, you'll get even more sources of failure, though usually you can 
> avoid SANs (if you care about your data).
>
> Well, here are a couple of the issues that I've seen myself:
>
> A hardware RAID controller returning every 64th bit as 0, no matter what's 
> on disk, with no error condition at all. (I've also heard from a colleague 
> about this happening at every 64k instead, but I haven't seen that myself.)
>
> An FC-AL switch occasionally resetting and garbling the blocks in transit 
> with random data. We lost a few TB of user data that way.
>
> Add to this the random driver breakage that happens now and then. I've also 
> had a few broken filesystems caused by in-memory corruption from bad RAM; 
> I'm not sure there is much hope of fixing that, though.
>
> Also, this presentation is pretty worrying about the frequency of silent 
> data corruption:
>
> https://indico.desy.de/contributionDisplay.py?contribId=65&sessionId=42&confId=257
>
> /Mattias Wadenstein

Very interesting slides/presentation; I'll go through them shortly.
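
A failure like the "every 64th bit as 0" controller above is exactly what
even a dumb write/read-back scrub from userspace will catch.  Here is a
minimal sketch in C; the file name, block size, and the 0xff test pattern
are arbitrary choices for illustration, and note the page-cache caveat in
the comments:

/* scrub.c - trivial write/read-back pattern scrub (illustration only).
 * Compile: cc -O2 -o scrub scrub.c
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define BLOCK_SIZE 65536

int main(int argc, char **argv)
{
	unsigned char wbuf[BLOCK_SIZE], rbuf[BLOCK_SIZE];
	size_t i;
	int fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <test-file>\n", argv[0]);
		return 1;
	}

	/* An all-ones pattern makes any stuck-at-0 bit visible. */
	memset(wbuf, 0xff, sizeof(wbuf));

	fd = open(argv[1], O_RDWR | O_CREAT | O_SYNC, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, wbuf, sizeof(wbuf)) != (ssize_t)sizeof(wbuf)) {
		perror("write");
		return 1;
	}

	/* O_SYNC pushed the write out, but this read may still be
	 * served from the page cache; a real scrub would use O_DIRECT
	 * with aligned buffers so the data actually crosses the bus
	 * and controller again. */
	if (lseek(fd, 0, SEEK_SET) < 0 ||
	    read(fd, rbuf, sizeof(rbuf)) != (ssize_t)sizeof(rbuf)) {
		perror("read");
		return 1;
	}

	for (i = 0; i < sizeof(rbuf); i++)
		if (rbuf[i] != wbuf[i])
			printf("mismatch at byte %zu: wrote %02x, read %02x\n",
			       i, wbuf[i], rbuf[i]);

	close(fd);
	return 0;
}

A real scrub would also cycle 0x00/0x55/0xaa patterns to expose stuck-at-1
bits and address-dependent faults, not just stuck-at-0.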
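More generally, the only thing that catches corruption anywhere along the
path (disk, controller, cable, switch, driver) is an end-to-end checksum
kept above the block layer - the approach ZFS takes with its per-block
checksums.  Below is a rough sketch of the verify side, assuming a sidecar
file of per-block CRC32s written alongside the data; the names, layout,
and native-endian on-disk format are made up for illustration:

/* blkcrc.c - verify per-block CRC32s from a sidecar file (sketch).
 * Compile: cc -O2 -o blkcrc blkcrc.c -lz
 */
#include <stdio.h>
#include <stdint.h>
#include <zlib.h>

#define BLOCK_SIZE 4096

int main(int argc, char **argv)
{
	unsigned char buf[BLOCK_SIZE];
	unsigned long blockno = 0;
	FILE *data, *crcs;
	size_t n;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <data-file> <crc-file>\n", argv[0]);
		return 1;
	}
	data = fopen(argv[1], "rb");
	crcs = fopen(argv[2], "rb");
	if (!data || !crcs) {
		perror("fopen");
		return 1;
	}

	/* The sidecar file holds one native-endian uint32_t CRC per
	 * block, written when the data file was created. */
	while ((n = fread(buf, 1, sizeof(buf), data)) > 0) {
		uint32_t expected, actual;

		if (fread(&expected, sizeof(expected), 1, crcs) != 1) {
			fprintf(stderr, "crc file short at block %lu\n",
				blockno);
			return 1;
		}
		actual = (uint32_t)crc32(crc32(0L, Z_NULL, 0), buf, (uInt)n);
		if (actual != expected)
			printf("block %lu: stored %08x, computed %08x\n",
			       blockno, (unsigned)expected, (unsigned)actual);
		blockno++;
	}

	fclose(data);
	fclose(crcs);
	return 0;
}

Detection is only half the job, though: once a block fails its checksum you
still need a second known-good copy to repair from, which is why checksums
are most useful combined with mirroring or parity.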
