Message-ID: <47B98D8A.7090506@emc.com>
Date: Mon, 18 Feb 2008 08:52:10 -0500
From: Ric Wheeler <ric@....com>
To: Michael Tokarev <mjt@....msk.ru>
CC: device-mapper development <dm-devel@...hat.com>,
Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org
Subject: Re: [dm-devel] Re: [PATCH] Implement barrier support for single device
DM devices
Michael Tokarev wrote:
> Ric Wheeler wrote:
>> Alasdair G Kergon wrote:
>>> On Fri, Feb 15, 2008 at 03:20:10PM +0100, Andi Kleen wrote:
>>>> On Fri, Feb 15, 2008 at 04:07:54PM +0300, Michael Tokarev wrote:
>>>>> I wonder if it's worth the effort to try to implement this.
>>> My personal view (which seems to be in the minority) is that it's a
>>> waste of our development time *except* in the (rare?) cases similar to
>>> the ones Andi is talking about.
>> Using working barriers is important for normal users when you really
>> care about avoiding data loss and have normal drives in a box. We do
>> power-fail testing on boxes (with reiserfs and ext3) and can definitely
>> see a lot of file system corruption eliminated across power failures
>> when barriers are enabled properly.
>>
>> It is not unreasonable for some machines to disable barriers to get a
>> performance boost, but I would not do that when you are storing things
>> you really need back.
>
> The talk here is about something different - about supporting barriers
> on md/dm devices, i.e., on pseudo-devices which use multiple real devices
> as components (software RAID etc.). In this "world" it's nearly impossible
> to support barriers if there is more than one underlying component device;
> barriers only work when there's a single component. And the talk is about
> supporting barriers in only a "minority" of cases - mostly the simplest
> device-mapper case, NOT covering raid1 or other "fancy" configurations.
I understand that. Most of the time, dm or md devices are composed of
uniform components which will uniformly support (or not) the cache flush
commands used by barriers.
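As a purely illustrative sketch of the rule under discussion (this is
not the actual patch; the struct and field names below are invented for
the example), the single-device pass-through decision amounts to:

#include <stdbool.h>
#include <stdio.h>

/* Toy model only: "mapped_dev" and its fields are invented for this
 * sketch and are not the real dm-core structures. */
struct mapped_dev {
    const char *name;
    int num_underlying;   /* number of component block devices */
};

/* A barrier can be forwarded unchanged only when there is exactly one
 * underlying device; with two or more components there is no cheap way
 * to order the write across all of them at once. */
static bool can_pass_barrier(const struct mapped_dev *md)
{
    return md->num_underlying == 1;
}

int main(void)
{
    struct mapped_dev linear = { "linear0", 1 };
    struct mapped_dev mirror = { "mirror0", 2 };

    printf("%s: %s\n", linear.name,
           can_pass_barrier(&linear) ? "forward barrier"
                                     : "reject (-EOPNOTSUPP)");
    printf("%s: %s\n", mirror.name,
           can_pass_barrier(&mirror) ? "forward barrier"
                                     : "reject (-EOPNOTSUPP)");
    return 0;
}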
>
>> Of course, you don't need barriers when you either disable the write
>> cache on the drives or use a battery backed RAID array which gives you a
>> write cache that will survive power outages...
>
> Two things here.
>
> First, I still don't understand why in God's name barriers are "working"
> while regular cache flushes are not. Almost no consumer-grade hard drive
> supports write barriers, but they all support regular cache flushes, and
> the latter should be enough (while not the most speed-optimal) to ensure
> data safety. Why require disabling the write cache (as the XFS FAQ does)
> instead of going the flush-cache-when-appropriate (as opposed to
> write-barrier-when-appropriate) way?
Barriers come in different flavors, but can be composed of "cache"
flushes, which have been supported on every drive that I have seen
(S-ATA and ATA) for many years now. That is the flavor of barriers that
we test with S-ATA and ATA drives.
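For concreteness, the flush these barrier flavors reduce to is the ATA
FLUSH CACHE command. A minimal userspace sketch of issuing it by hand,
roughly what hdparm -F does, alongside the write-cache disable that the
XFS FAQ recommends (hdparm -W0); /dev/sda is an assumption here and this
needs root:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>

int main(void)
{
    int fd = open("/dev/sda", O_RDONLY);  /* device name assumed */
    if (fd < 0) { perror("open"); return 1; }

    /* FLUSH CACHE (0xE7): drain the drive's volatile write cache to
     * the platters.  This is the building block barriers are made of. */
    unsigned char flush[4] = { WIN_FLUSH_CACHE, 0, 0, 0 };
    if (ioctl(fd, HDIO_DRIVE_CMD, flush) < 0)
        perror("FLUSH CACHE");

    /* The XFS-FAQ alternative: SET FEATURES subcommand 0x82 disables
     * the write cache outright (0x02 re-enables it). */
    unsigned char wc_off[4] = { WIN_SETFEATURES, 0, 0x82, 0 };
    if (ioctl(fd, HDIO_DRIVE_CMD, wc_off) < 0)
        perror("SET FEATURES");

    close(fd);
    return 0;
}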
The issue is that without flushing/invalidating (or some other way of
controlling the behavior of your storage), the file system has no way to
make sure that all data is on persistent, non-volatile media.
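From the application side, this is what fsync() is supposed to buy you:
on a file system mounted with barriers enabled (e.g. barrier=1 for
ext3), a returned fsync() should mean the data has been pushed past the
drive's volatile cache. A minimal sketch, with a hypothetical path:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical file on a file system mounted with barriers on,
     * e.g. ext3 with -o barrier=1. */
    int fd = open("/mnt/data/journal.log",
                  O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const char *rec = "commit 42\n";
    if (write(fd, rec, strlen(rec)) != (ssize_t)strlen(rec)) {
        perror("write");
        return 1;
    }

    /* fsync() forces data and metadata out; with barriers enabled the
     * resulting journal commit ends in a drive cache flush, so a power
     * cut after this point should not lose the record. */
    if (fsync(fd) < 0) {
        perror("fsync");
        return 1;
    }
    close(fd);
    return 0;
}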
>
> And second, "surprisingly", battery-backed RAID write caches tend to fail
> too, sometimes... ;) Usually, such a battery is enough to keep the data
> in memory for several hours only (since many RAID controllers use regular
> RAM for their caches, which requires some power to keep its state) --
> I came across this issue the hard way, and realized that very few of the
> people around me who manage RAID systems even know about this problem -
> that the battery-backed cache only lasts for some time... For example,
> power fails in the evening, and by the next morning the batteries are
> already empty. Or, with better batteries, think about a weekend... ;)
> (I've seen some vendors now use flash-based backing stores for caches
> instead, which should ensure far better results here.)
>
> /mjt
>
That is why you need to get a good array, not just a simple controller ;-)
Most arrays do not use batteries to hold up the write cache; they use
the batteries to move any cached data to non-volatile media within the
window that the batteries provide.
You could certainly get this kind of behavior from the flash scheme you
describe above as well...
ric