Message-ID: <20090329140005.GI15492@elf.ucw.cz>
Date: Sun, 29 Mar 2009 16:00:05 +0200
From: Pavel Machek <pavel@....cz>
To: Artem Bityutskiy <Artem.Bityutskiy@...ia.com>
Cc: Artem Bityutskiy <dedekind@...dex.ru>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: replace() system call needed (was Re: EXT4-ish "fixes" in UBIFS)
On Sun 2009-03-29 16:57:06, Artem Bityutskiy wrote:
> ext Pavel Machek wrote:
>> On Sun 2009-03-29 16:07:35, Artem Bityutskiy wrote:
>>> Pavel Machek wrote:
>>>> On Sun 2009-03-29 16:00:45, Artem Bityutskiy wrote:
>>>>> Pavel Machek wrote:
>>>>>>>>> 2. create/write/rename leads to empty files
>>>>>>>> ..but this should not be. If we want to make that explicit, we should
>>>>>>>> provide "replace()" operation; where replace is rename that makes sure
>>>>>>>> that source file is completely on media before committing the rename.
>>>>>>> Well, OK, we can fsync() before rename, we just need clean rules
>>>>>>> for this, so that all Linux FSes would follow them. Would be nice
>>>>>>> to have final agreement on all this stuff.
>>>>>> My proposal is
>>>>>>
>>>>>> rename() stays.
>>>>> It stays and:
>>>>>
>>>>> 1. does _not_ fsync
>>>> Does not fsync. If someone wants to make sure one of the files is on
>>>> the disk, he should use replace(). [On non-linux systems, replace()
>>>> should be implemented as fsync/rename in libc or something.]
>>> I would be happy with these rules. But the fact is, application
>>> people just refuse to add fsync before rename. They say that the
>>> FS has to do this. And they say that even Linus supports them,
>>
>> That's good. fsync before rename would be an ugly regression (on ext3 at
>> least). We should get them to use replace() syscall, not get them to
>> add fsyncs. [Of course, that means we need replace syscall first. :-)]
>
> I'd say it is better to fix ext3 then.
? I don't get this.
ext3's rename() is already equivalent to the proposed replace(). The
problem is that the renames in btrfs and ubifs are not.
So doing an extra fsync() on ext3 is actually a performance regression
-> we do not want applications to randomly add open-coded fsync()s.
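
For illustration only, here is a rough sketch of the libc-style fallback
mentioned above for systems without a replace() syscall: fsync the source,
then rename it over the target. The name replace_file() is made up for this
example and is not an existing libc or kernel interface; the sketch also
assumes fsync() is accepted on a read-only descriptor, which holds on Linux
but may not hold everywhere.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical userspace fallback for the proposed replace():
 * make sure the source file's data is on media, then rename it
 * over the destination. Returns 0 on success, -1 with errno set
 * on failure. Not an existing API; a sketch only.
 */
int replace_file(const char *src, const char *dst)
{
    int fd, saved;

    assert(src != NULL && dst != NULL);

    /* Assumption: fsync() works on an O_RDONLY descriptor (true on Linux). */
    fd = open(src, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Flush the source file's data before the rename is committed. */
    if (fsync(fd) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    if (close(fd) < 0)
        return -1;

    /* rename() atomically replaces dst with src within one filesystem. */
    return rename(src, dst);
}
```

On ext3 the fsync() in this fallback is exactly the extra cost I am
complaining about, which is why a real replace() should let the filesystem
decide how to order the data and the rename.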
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html