Message-ID: <20180418180903.GD9897@fieldses.org>
Date: Wed, 18 Apr 2018 14:09:03 -0400
From: bfields@...ldses.org (J. Bruce Fields)
To: Andres Freund <andres@...razel.de>
Cc: Andreas Dilger <adilger@...ger.ca>,
20180410184356.GD3563@...nk.org,
"Theodore Y. Ts'o" <tytso@....edu>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
Jeff Layton <jlayton@...hat.com>,
"Joshua D. Drake" <jd@...mandprompt.com>
Subject: Re: fsync() errors is unsafe and risks data loss
On Wed, Apr 11, 2018 at 07:17:52PM -0700, Andres Freund wrote:
> Hi,
>
> On 2018-04-11 15:52:44 -0600, Andreas Dilger wrote:
> > On Apr 10, 2018, at 4:07 PM, Andres Freund <andres@...razel.de> wrote:
> > > 2018-04-10 18:43:56 Ted wrote:
> > >> So for better or for worse, there has not been as much investment in
> > >> buffered I/O and data robustness in the face of exception handling of
> > >> storage devices.
> > >
> > > That's a bit of a cop out. It's not just databases that care. Even more
> > > basic tools like SCM, package managers and editors care whether they can
> > > get proper responses back from fsync that imply things actually were synced.
> >
> > Sure, but it is mostly PG that is doing (IMHO) crazy things like writing
> > to thousands(?) of files, closing the file descriptors, then expecting
> > fsync() on a newly-opened fd to return a historical error.
>
> It's not just postgres. dpkg (underlying apt, on debian derived distros)
> to take an example I just randomly guessed, does too:
> /* We want to guarantee the extracted files are on the disk, so that the
> * subsequent renames to the info database do not end up with old or zero
> * length files in case of a system crash. As neither dpkg-deb nor tar do
> * explicit fsync()s, we have to do them here.
> * XXX: This could be avoided by switching to an internal tar extractor. */
> dir_sync_contents(cidir);
>
> (a bunch of other places too)
>
> Especially on ext3, but also on newer filesystems, it's performance-wise
> entirely infeasible to fsync() every single file individually - the
> performance becomes entirely atrocious if you do that.
Is that still true if you're able to use some kind of parallelism?
(async io, or fsync from multiple processes?)
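
To make the question concrete, here is a rough sketch of the threaded
variant: a small pool of pthreads pulls paths off a shared list and calls
fsync() on each one. Everything in it (fsync_paths_parallel, sync_worker,
struct sync_job, MAX_SYNC_THREADS) is a hypothetical name made up for
illustration and is not taken from dpkg or postgres. It assumes Linux,
where fsync() on a read-only descriptor is accepted, and it says nothing
about whether a newly opened fd reports an earlier writeback error, which
is the separate problem discussed above.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_SYNC_THREADS 16   /* arbitrary cap chosen for this sketch */

/* Hypothetical shared work queue: one list of paths, claimed by index. */
struct sync_job {
    const char **paths;       /* files to flush */
    size_t count;             /* number of entries in paths */
    size_t next;              /* index of the next unclaimed path */
    size_t failures;          /* paths where open/fsync/close failed */
    pthread_mutex_t lock;     /* protects next and failures */
};

/* Each worker claims one path at a time until the list is exhausted. */
static void *sync_worker(void *arg)
{
    struct sync_job *job = arg;

    for (;;) {
        pthread_mutex_lock(&job->lock);
        size_t i = job->next++;
        pthread_mutex_unlock(&job->lock);
        if (i >= job->count)
            return NULL;

        int failed = 0;
        int fd = open(job->paths[i], O_RDONLY);
        if (fd < 0) {
            failed = 1;
        } else {
            if (fsync(fd) != 0)   /* blocks until this file is flushed */
                failed = 1;
            if (close(fd) != 0)
                failed = 1;
        }
        if (failed) {
            pthread_mutex_lock(&job->lock);
            job->failures++;
            pthread_mutex_unlock(&job->lock);
        }
    }
}

/* Flush every path using up to nthreads threads (the caller counts as one).
 * Returns the number of paths that could not be opened, synced or closed. */
long fsync_paths_parallel(const char **paths, size_t count, size_t nthreads)
{
    struct sync_job job = { .paths = paths, .count = count };
    pthread_t tids[MAX_SYNC_THREADS];
    size_t started = 0;

    assert(paths != NULL || count == 0);
    if (nthreads > MAX_SYNC_THREADS)
        nthreads = MAX_SYNC_THREADS;
    pthread_mutex_init(&job.lock, NULL);

    /* Start nthreads - 1 helpers; a failed pthread_create just means
     * fewer helpers, since the calling thread also drains the list. */
    for (size_t t = 1; t < nthreads; t++)
        if (pthread_create(&tids[started], NULL, sync_worker, &job) == 0)
            started++;

    sync_worker(&job);
    for (size_t t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    pthread_mutex_destroy(&job.lock);
    return (long)job.failures;
}
```

A caller would hand this the list of just-extracted files and treat any
nonzero return as a failure. Whether several fsync() calls in flight at
once actually beat a serial loop depends on the filesystem and the device,
which is exactly what I'm asking.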
--b.