linux-kernel - Re: [patch] document flash/RAID dangers

Message-ID: <4A947E05.8070406@redhat.com>
Date:	Tue, 25 Aug 2009 20:12:53 -0400
From:	Ric Wheeler <rwheeler@...hat.com>
To:	Pavel Machek <pavel@....cz>
CC:	david@...g.hm, Theodore Tso <tytso@....edu>,
	Florian Weimer <fweimer@....de>,
	Goswin von Brederlow <goswin-v-b@....de>,
	Rob Landley <rob@...dley.net>,
	kernel list <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
	rdunlap@...otime.net, linux-doc@...r.kernel.org,
	linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: [patch] document flash/RAID dangers

On 08/25/2009 08:06 PM, Pavel Machek wrote:
> On Tue 2009-08-25 19:48:09, Ric Wheeler wrote:
>>
>>> ---
>>> There are storage devices that have highly undesirable properties
>>> when they are disconnected or suffer power failures while writes are
>>> in progress; such devices include flash devices and MD RAID 4/5/6
>>> arrays.  These devices have the property of potentially
>>> corrupting blocks being written at the time of the power failure, and
>>> worse yet, amplifying the region where blocks are corrupted such that
>>> additional sectors are also damaged during the power failure.
>>
>> I would strike the entire mention of MD devices since it is your
>> assertion, not a proven fact. You will cause more data loss from common
>
> That actually is a fact. That's how MD RAID 5 is designed. And btw
> those are originally Ted's words.
>

Ted did not design MD RAID5.

>> events (single sector errors, complete drive failure) by steering people
>> away from more reliable storage configurations because of a really rare
>> edge case (power failure during split write to two raid members while
>> doing a RAID rebuild).
>
> I'm not sure what's rare about power failures. Unlike single sector
> errors, my machine actually has a button that produces exactly that
> event. Running degraded raid5 arrays for extended periods may be
> slightly unusual configuration, but I suspect people should just do
> that for testing. (And from the discussion, people seem to think that
> degraded raid5 is equivalent to raid0).

Power failures after a full drive failure with a split write during a rebuild?
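
For reference, the combination being argued over (one member already
failed, then power lost between the data write and the parity write) can
be sketched with plain XOR parity. This is a toy model under stated
assumptions, not MD's actual code: the three-member stripe, the 4-byte
blocks, and the names xor_blocks, d0, d1 and parity are all made up for
illustration. It shows only that the outcome is possible, not how often
it occurs.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: two data blocks plus one parity block.
d0 = b"AAAA"
d1 = b"BBBB"
parity = xor_blocks(d0, d1)

# The member holding d1 has failed. The array is degraded, so d1 now
# exists only implicitly, as d0 XOR parity.
assert xor_blocks(d0, parity) == d1

# Updating d0 takes two separate writes: the new data and the new parity.
new_d0 = b"CCCC"
new_parity = xor_blocks(parity, d0, new_d0)  # read-modify-write update

# Split write: power fails after new_d0 reaches its disk but before
# new_parity reaches the parity disk.
on_disk_d0, on_disk_parity = new_d0, parity

# Reconstructing d1 from what is on disk now yields the wrong bytes,
# even though d1 itself was never being written.
reconstructed_d1 = xor_blocks(on_disk_d0, on_disk_parity)
split_write_corrupts_d1 = reconstructed_d1 != d1

# Had both writes completed, d1 would still be recoverable.
complete_write_preserves_d1 = xor_blocks(new_d0, new_parity) == d1
```

Both flags come out True. The damage needs all three conditions at once:
a missing member, a power loss, and the loss landing between the two
writes of one stripe update.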

>
>>> Otherwise, file systems placed on these devices can suffer silent data
>>> and file system corruption.  A forced use of fsck may detect metadata
>>> corruption resulting in file system corruption, but will not suffice
>>> to detect data corruption.
>>>
>>
>> This is very misleading. All storage "can" have silent data loss, you are
>> making a statement without specifics about frequency.
>
> substitute with "can (by design)"?

By Pavel's unproven casual observation?

>
> Now, if you can suggest useful version of that document meeting your
> criteria?
>
> 								Pavel
