Message-ID: <49D0AA4A.6020308@redhat.com>
Date: Mon, 30 Mar 2009 07:17:30 -0400
From: Ric Wheeler <rwheeler@...hat.com>
To: "Andreas T.Auer" <andreas.t.auer_lkml_73537@...us.ath.cx>
CC: Alan Cox <alan@...rguk.ukuu.org.uk>, Theodore Tso <tytso@....edu>,
Mark Lord <lkml@....ca>,
Stefan Richter <stefanr@...6.in-berlin.de>,
Jeff Garzik <jeff@...zik.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Matthew Garrett <mjg59@...f.ucam.org>,
Andrew Morton <akpm@...ux-foundation.org>,
David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29
Andreas T.Auer wrote:
> On 30.03.2009 11:05 Alan Cox wrote:
>
>>> It seems you still didn't get the point. ext3 data=ordered is not the
>>> problem. The problem is that the average developer doesn't expect the fs
>>> to _re-order_ stuff. This is how most common fs did work long before
>>>
>> No it isn't. Standard Unix file systems made no such guarantee and would
>> write out data out of order. The disk scheduler would then further
>> re-order things.
>>
> You surely know that better than I do: did file systems actually write
> "later" data long before "earlier" data? During a flush the data may be
> re-ordered, but was it also re-ordered outside of a flush?
>
People keep forgetting that storage (even your commodity SATA-class
drives) has a very large, volatile cache. The disk firmware can hold
writes in that cache as long as it wants, reorder them in whatever way
makes sense to it, and makes no explicit ordering promises.

This is where the write barrier code comes in: for file systems that
care about ordering for data, we use barrier ops to impose the required
ordering.

In a similar way, fsync() gives applications the power to impose their
own ordering.
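
As a rough illustration of an application imposing its own ordering, here
is a minimal sketch of the familiar write, fsync, rename sequence. It uses
Python's os module, which wraps the same system calls. The function name
durable_replace and the ".tmp" suffix are made up for this example, and
the sketch assumes a Linux-like system where a directory can be opened
and fsync'ed. Whether the data really reaches the platter still depends
on the drive honoring cache flushes, as described above.

```python
import os


def durable_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data`, ordering the data write before the rename.

    Hypothetical helper for illustration only; not taken from the thread.
    """
    tmp = path + ".tmp"  # assumed temporary name, same directory as `path`
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the new contents to stable storage
        # before the rename below makes them visible under `path`.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Atomically switch the name over to the fully written file.
    os.replace(tmp, path)

    # fsync the containing directory so the rename itself is persisted.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the first fsync(), nothing in this sequence tells the file system
that the data must reach the disk before the rename does.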

If we assume that we can "save" an fsync cost with ordered mode, we
have to keep in mind that the file system will still need to do the
expensive cache flushes in order to preserve its internal ordering.
>
>> If you think the "guarantees" from before ext3 are normal defaults you've
>> been writing junk code
>>
> I'm still on ReiserFS since it was considered stable in some SuSE 7.x.
> And I expected it to be fairly ordered, but as a network protocol
> programmer I didn't rely on the ordering of fs write-outs yet.
>
With reiserfs, you will have barriers on by default in SLES/openSUSE,
which will keep at least the fs metadata properly ordered.
ric