Message-ID: <20090830163513.GA25899@infradead.org>
Date: Sun, 30 Aug 2009 12:35:13 -0400
From: Christoph Hellwig <hch@...radead.org>
To: Michael Tokarev <mjt@....msk.ru>
Cc: Ric Wheeler <rwheeler@...hat.com>, david@...g.hm,
Pavel Machek <pavel@....cz>, Theodore Tso <tytso@....edu>,
NeilBrown <neilb@...e.de>, Rob Landley <rob@...dley.net>,
Florian Weimer <fweimer@....de>,
Goswin von Brederlow <goswin-v-b@....de>,
kernel list <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
rdunlap@...otime.net, linux-doc@...r.kernel.org,
linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: raid is dangerous but that's secret (was Re: [patch] ext2/3:
document conditions when reliable operation is possible)
On Sun, Aug 30, 2009 at 06:44:04PM +0400, Michael Tokarev wrote:
>> If you lose power with the write caches enabled on that same 5 drive
>> RAID set, you could lose as much as 5 * 32MB of freshly written data on
>> a power loss (16-32MB write caches are common on s-ata disks these
>> days).
>
> This is fundamentally wrong. Many filesystems today use either barriers
> or flushes (if barriers are not supported), and the times when disk drives
> were lying to the OS that the cache got flushed are long gone.
While most common filesystems do have barrier support:
- it is not actually enabled for the two most common filesystems
- the support for write barriers and cache flushing tends to be buggy
all over our software stack.
>> For MD5 (and MD6), you really must run with the write cache disabled
>> until we get barriers to work for those configurations.
>
> I highly doubt barriers will ever be supported on anything but simple
> raid1, because it's impossible to guarantee ordering across multiple
> drives. Well, it *is* possible to have write barriers with journalled
> (and/or with battery-backed-cache) raid[456].
>
> Note that even if raid[456] does not support barriers, write cache
> flushes still works.
All currently working barrier implementations on Linux are built upon
queue drains and cache flushes, plus sometimes setting the FUA bit.
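
To make that sequence concrete, here is a toy model in Python. It is only
a sketch, not kernel code: ToyDisk, barrier_write and demo are hypothetical
names invented for this illustration, and the "disk" is just two
dictionaries standing in for the volatile write cache and the stable media.

```python
class ToyDisk:
    """Toy drive with a volatile write cache (illustration only)."""

    def __init__(self):
        self.cache = {}  # block -> data, lost on power failure
        self.media = {}  # block -> data, survives power failure

    def write(self, block, data, fua=False):
        if fua:
            # FUA: the write reaches stable media before it completes.
            self.cache.pop(block, None)
            self.media[block] = data
        else:
            # Ordinary write: completes once it sits in the cache.
            self.cache[block] = data

    def flush(self):
        # Cache flush: everything cached is written to stable media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever was only in the cache is gone.
        self.cache.clear()


def barrier_write(disk, pending, block, data, use_fua):
    """Emulate a barrier write using queue drain + flushes (+ optional FUA)."""
    # 1. Drain the queue: issue every request queued before the barrier.
    while pending:
        queued_block, queued_data = pending.pop(0)
        disk.write(queued_block, queued_data)
    # 2. Pre-flush: make the earlier writes stable before the barrier write.
    disk.flush()
    # 3. The barrier write itself, either with FUA or followed by a flush.
    if use_fua:
        disk.write(block, data, fua=True)
    else:
        disk.write(block, data)
        disk.flush()


def demo(use_barrier, use_fua=False):
    """Write two journal blocks and a commit block, then lose power."""
    disk = ToyDisk()
    pending = [(1, "journal-a"), (2, "journal-b")]
    if use_barrier:
        barrier_write(disk, pending, 3, "commit", use_fua)
    else:
        for queued_block, queued_data in pending:
            disk.write(queued_block, queued_data)
        disk.write(3, "commit")
    disk.power_loss()
    return disk.media
```

Calling demo(True) or demo(True, use_fua=True) leaves all three blocks on
the toy media after the simulated power loss, while demo(False) leaves
nothing. The model ignores reordering inside the cache and anything
RAID-specific; it only shows which steps the barrier is composed of.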