Message-Id: <1371764058.18527.140661246414097.671B4999@webmail.messagingengine.com>
Date: Thu, 20 Jun 2013 17:34:18 -0400
From: Ryan Lortie <desrt@...rt.ca>
To: linux-ext4@...r.kernel.org
Subject: ext4 file replace guarantees
hi,
I recently read the kernel documentation on the topic of guarantees
provided by ext4 when renaming-over-existing. I found this:
    (*) == default

    auto_da_alloc(*)    Many broken applications don't use fsync() when
    noauto_da_alloc     replacing existing files via patterns such as
                        fd = open("foo.new")/write(fd,..)/close(fd)/
                        rename("foo.new", "foo"), or worse yet,
                        fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
                        If auto_da_alloc is enabled, ext4 will detect
                        the replace-via-rename and replace-via-truncate
                        patterns and force that any delayed allocation
                        blocks are allocated such that at the next
                        journal commit, in the default data=ordered
                        mode, the data blocks of the new file are forced
                        to disk before the rename() operation is
                        committed.  This provides roughly the same level
                        of guarantees as ext3, and avoids the
                        "zero-length" problem that can happen when a
                        system crashes before the delayed allocation
                        blocks are forced to disk.
in https://www.kernel.org/doc/Documentation/filesystems/ext4.txt
which says to me "replace by rename is guaranteed safe in modern ext4,
under default mount options".
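For reference, the replace-via-rename pattern with the explicit fsync()
can be sketched in C roughly as below. This is a minimal sketch, not
GLib's actual code: replace_file() is a hypothetical helper, and the
".new" suffix simply follows the example in the documentation quoted
above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not a GLib API): replace `path` with `len` bytes
 * of `data` by writing "<path>.new" and renaming it over `path`.
 * Returns 0 on success, -1 on failure. */
static int replace_file(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.new", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until all is written. */
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }

    /* Push the new data to disk before the rename can be committed.
     * This is the step the auto_da_alloc text says "broken" applications
     * leave out. */
    if (fsync(fd) < 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* Atomically swap the new file into place. */
    if (rename(tmp, path) < 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

The question in this mail is whether the fsync() step can be dropped on
ext4 under default mount options.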
I understand that this was added after the "ext4 is eating my data"
panic in 2009.
Knowing that ext4 provides this guarantee caused me to modify GLib to
remove the fsync() that we used to do from g_file_set_contents(), if we
detect that we are on ext2/3/4:
https://git.gnome.org/browse/glib/commit/?id=9d0c17b50102267a5029b58b1f44efbad82d8f03
(we already skipped the fsync() on btrfs since this filesystem
guarantees that replace-by-rename is safe):
"""
What are the crash guarantees of overwrite-by-rename?
Overwriting an existing file using a rename is atomic. That means that
either the old content of the file is there or the new content. A
sequence like this:
"""
in
https://btrfs.wiki.kernel.org/index.php/FAQ#What_are_the_crash_guarantees_of_overwrite-by-rename.3F
We don't really care too much about ext2 (although it would be great if
there was a convenient API to detect the difference between
ext2/ext3/ext4 filesystems since they all share one magic number).
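To illustrate the detection problem: a check based on statfs() can only
see the shared magic number. The following is a minimal Linux-only
sketch; on_ext_family() and the EXT_FAMILY_MAGIC name are made up for
this example (the value 0xEF53 is the one the kernel headers use for
ext2, ext3 and ext4 alike).

```c
#include <sys/vfs.h>   /* statfs(), Linux-specific */

/* Superblock magic shared by ext2, ext3 and ext4; the name is ours. */
#define EXT_FAMILY_MAGIC 0xEF53

/* Hypothetical helper: returns 1 if `path` lives on an ext2/3/4
 * filesystem, 0 if it does not, and -1 if statfs() fails.
 * It cannot tell the three filesystems apart. */
static int on_ext_family(const char *path)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0)
        return -1;

    return sfs.f_type == EXT_FAMILY_MAGIC ? 1 : 0;
}
```

A result of 1 here says nothing about whether delayed allocation, and
therefore auto_da_alloc, is in play.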
Anyway... by mistake, this patch (removing fsync on ext4) got backported
into one of our stable releases and landed in Debian and the Fedora 19
beta, where many users started reporting data loss.
So what's the story here? Is this safe or not?
The _only_ thing that I can think of is that GLib also does an
fallocate() before writing the data. Does doing fallocate() before
write() void the rename-is-safe guarantees or is this just a filesystem
bug?
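The sequence in question, preallocating and then writing, looks roughly
like the sketch below. It is not GLib's code: prealloc_then_write() is a
hypothetical name, and posix_fallocate() is used here as a portable
stand-in for the fallocate() call mentioned above, with its result
ignored on the assumption that preallocation is only a best-effort
optimisation.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reserve space for `len` bytes on `fd`, then
 * write `data` from the current file offset.
 * Returns 0 on success, -1 if a write fails. */
static int prealloc_then_write(int fd, const void *data, size_t len)
{
    /* Best effort: a failure to preallocate is deliberately ignored. */
    if (len > 0)
        (void) posix_fallocate(fd, 0, (off_t)len);

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return 0;
}
```

In the failing case this is followed by close() and rename() with no
fsync() in between.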
In any case, we have reverted the patch for now to work around the
issue.
It would be great if I could find out some official word on what the
guaranteed behaviour of the filesystem is with respect to
replace-by-rename. Trying to dance around these issues is starting to
get a bit annoying...
Thanks in advance.