Message-ID: <4A9BCCEF.7010402@redhat.com>
Date: Mon, 31 Aug 2009 09:15:27 -0400
From: Ric Wheeler <rwheeler@...hat.com>
To: Christoph Hellwig <hch@...radead.org>
CC: Michael Tokarev <mjt@....msk.ru>, david@...g.hm,
Pavel Machek <pavel@....cz>, Theodore Tso <tytso@....edu>,
NeilBrown <neilb@...e.de>, Rob Landley <rob@...dley.net>,
Florian Weimer <fweimer@....de>,
Goswin von Brederlow <goswin-v-b@....de>,
kernel list <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
rdunlap@...otime.net, linux-doc@...r.kernel.org,
linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:
document conditions when reliable operation is possible)

On 08/30/2009 12:35 PM, Christoph Hellwig wrote:
> On Sun, Aug 30, 2009 at 06:44:04PM +0400, Michael Tokarev wrote:
>>> If you lose power with the write caches enabled on that same 5 drive
>>> RAID set, you could lose as much as 5 * 32MB of freshly written data on
>>> a power loss (16-32MB write caches are common on s-ata disks these
>>> days).
>>
>> This is fundamentally wrong. Many filesystems today use either barriers
>> or flushes (if barriers are not supported), and the times when disk drives
>> were lying to the OS that the cache got flushed are long gone.
>
> While most common filesystems do have barrier support it is:
>
> - not actually enabled for the two most common filesystems
> - the support for write barriers and cache flushing tends to be buggy
> all over our software stack,
>
Or just missing - I think that MD5/6 simply drop the requests at present.

I wonder if it would be worth having MD probe for an enabled write cache and
warn if barriers are not supported?

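
A rough userspace sketch of the kind of check I have in mind, not MD code.
It assumes the member disks expose the queue/write_cache sysfs attribute
(present on newer kernels, reading "write back" or "write through"), and it
takes barrier support as a flag supplied by the caller, since how MD would
determine that is left open here. The function names are made up for
illustration.

```python
from pathlib import Path


def write_cache_mode(disk, sysfs_root="/sys"):
    """Return the kernel's view of the disk's cache mode, or None if unknown."""
    # Assumption: the attribute lives at <sysfs_root>/block/<disk>/queue/write_cache.
    attr = Path(sysfs_root) / "block" / disk / "queue" / "write_cache"
    try:
        return attr.read_text().strip()
    except OSError:
        return None


def volatile_cache_members(disks, sysfs_root="/sys"):
    """List the member disks that report a write-back (volatile) cache."""
    return [d for d in disks if write_cache_mode(d, sysfs_root) == "write back"]


def barrier_warning(array, disks, barriers_supported, sysfs_root="/sys"):
    """Return a warning string if cached writes are unprotected, else None."""
    risky = volatile_cache_members(disks, sysfs_root)
    if risky and not barriers_supported:
        members = ", ".join(risky)
        return f"{array}: write cache enabled on {members} but barriers are not supported"
    return None
```

For example, barrier_warning("md0", ["sda", "sdb"], barriers_supported=False)
would return a message naming whichever of the two disks report "write back",
and None if neither does. Members given as partitions would first have to be
mapped to their parent disks, which this sketch does not attempt.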
>>> For MD5 (and MD6), you really must run with the write cache disabled
>>> until we get barriers to work for those configurations.
>>
>> I highly doubt barriers will ever be supported on anything but simple
>> raid1, because it's impossible to guarantee ordering across multiple
>> drives. Well, it *is* possible to have write barriers with journalled
>> (and/or with battery-backed-cache) raid[456].
>>
>> Note that even if raid[456] does not support barriers, write cache
>> flushes still work.
>
> All currently working barrier implementations on Linux are built upon
> queue drains and cache flushes, plus sometimes setting the FUA bit.
>
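
To make the drain/flush/FUA sequence concrete, here is a toy model in
Python. It is only an illustration of the ordering described above, not the
block layer's actual code: ToyDisk and ToyQueue are invented names, the
"disk" is two dictionaries, and a power loss simply discards whatever is
still in the volatile cache.

```python
class ToyDisk:
    """Disk with a volatile write cache; only `media` survives power loss."""

    def __init__(self):
        self.cache = {}
        self.media = {}

    def write(self, block, data, fua=False):
        if fua:
            # FUA: bypass the cache and put the data on stable media.
            self.cache.pop(block, None)
            self.media[block] = data
        else:
            self.cache[block] = data

    def flush(self):
        # Cache flush: push every cached block to stable media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        self.cache.clear()


class ToyQueue:
    """Request queue that implements a barrier as drain + flush (+ FUA)."""

    def __init__(self, disk, has_fua):
        self.disk = disk
        self.has_fua = has_fua
        self.pending = []

    def submit(self, block, data):
        self.pending.append((block, data))

    def drain(self):
        # Queue drain: issue everything submitted so far to the disk.
        for block, data in self.pending:
            self.disk.write(block, data)
        self.pending.clear()

    def barrier_write(self, block, data):
        self.drain()       # 1. earlier requests are issued first
        self.disk.flush()  # 2. pre-flush: earlier writes reach media
        if self.has_fua:
            self.disk.write(block, data, fua=True)  # 3a. barrier write with FUA
        else:
            self.disk.write(block, data)
            self.disk.flush()                       # 3b. post-flush instead
```

With either setting of has_fua, a write submitted before barrier_write() and
the barrier write itself are both on media after a simulated power loss,
while a write drained afterwards without a further flush is lost.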