Message-ID: <20130622143053.GF4727@thunk.org>
Date: Sat, 22 Jun 2013 10:30:53 -0400
From: Theodore Ts'o <tytso@....edu>
To: "Sidorov, Andrei" <Andrei.Sidorov@...isi.com>
Cc: "Joseph D. Wagner" <joe@...ephdwagner.info>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
Ryan Lortie <desrt@...rt.ca>
Subject: Re: ext4 file replace guarantees
On Sat, Jun 22, 2013 at 02:01:39PM +0000, Sidorov, Andrei wrote:
>
> This doesn't work in power loss scenario.
> First of all majority of hdd's still have 512b sectors, so it is possible that
> hdd won't have a chance to write all 8 sectors.
> This doesn't work even with 4k drives because they are susceptible to spliced
> sector writes. Well, 512b are susceptible too, but 4k drives have wider
> window.
Torn writes can happen, yes, but they are relatively rare. Most file
systems don't protect against them, so if you're worried about that
sort of thing, you need to go beyond using fsync(). Even if you are
using a file system with metadata journalling, in the case of a torn
write, we'll detect the corrupt metadata, but at that point guarantees
about what files will be accessible are out the window. Fortunately,
this is not a common event.
There are techniques for protecting against torn writes, but they have
engineering tradeoffs, which you may or may not be willing to live
with. After all, if you're worried about these sorts of things,
hopefully you will have engineered your system to deal with other
events which are at a similar or higher level of probability ---
such as the hard drive developing bad sectors (which is generally how
most HDD's treat sectors that are incompletely written due to spliced
sector writes) or even dying catastrophically.
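
As one illustration of such a technique, here is a minimal sketch of
application-level record framing with a checksum, so that a reader can at
least detect a torn write. It is not what ext4 or any particular file
system does; the names pack_record, unpack_record and simulate_torn_write,
the 8-byte header layout, and the 512-byte sector size are all assumptions
made for the example. Detection alone does not recover the data; for that
you would also need a second copy (or a journal) to fall back on, which is
where the engineering tradeoffs come in.

```python
import struct
import zlib

# Hypothetical on-disk framing: 4-byte payload length, 4-byte CRC32, payload.
HEADER = struct.Struct("<II")


def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def unpack_record(blob: bytes):
    """Return the payload, or None if the record is truncated or corrupt."""
    if len(blob) < HEADER.size:
        return None
    length, crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload


def simulate_torn_write(old: bytes, new: bytes, sectors_written: int,
                        sector: int = 512) -> bytes:
    """Model power loss after only some sectors of `new` replaced `old`."""
    cut = sectors_written * sector
    return new[:cut] + old[cut:]


if __name__ == "__main__":
    old = pack_record(b"a" * 4000)
    new = pack_record(b"b" * 4000)
    torn = simulate_torn_write(old, new, sectors_written=3)
    print("intact record readable:", unpack_record(new) is not None)
    print("torn record readable:  ", unpack_record(torn) is not None)
```

A reader that gets None back knows the record cannot be trusted and can
fall back to an older copy, if the application kept one.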
For many of the use cases that Ryan and GNOME have been dealing with,
which are desktop apps where the precious data in question are things
like the high score board for games, or the window position of desktop
applications, this is probably beyond what they need to be concerned
with.
(And at the industrial data center scale, you may use very different
techniques --- such as computer-level or rack-level battery backups,
diesel generators, cloud file systems which send the data to multiple
different servers on multiple different racks, etc. And at that
scale, you might not even use a file system journal or send CACHE
FLUSH commands, because you've engineered the entire system against
failure, and you accept that the remaining risk, of multiple levels of power
backup failing, or of multiple HDD's all dying at the same time before the
cloud file system has a chance to re-replicate the data, is low
enough. Nothing is ever going to be 100% perfect; there's only a level
of data integrity which you are willing to pay for.)
- Ted