Message-ID: <20190531192119.GB3066@mit.edu>
Date: Fri, 31 May 2019 15:21:19 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Amir Goldstein <amir73il@...il.com>
Cc: Jan Kara <jack@...e.cz>,
"Darrick J . Wong" <darrick.wong@...cle.com>,
Dave Chinner <david@...morbit.com>, Chris Mason <clm@...com>,
Al Viro <viro@...iv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-xfs <linux-xfs@...r.kernel.org>,
Ext4 <linux-ext4@...r.kernel.org>,
Linux Btrfs <linux-btrfs@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA
On Fri, May 31, 2019 at 08:22:06PM +0300, Amir Goldstein wrote:
> >
> > This is I think more precise:
> >
> > This guarantee can be achieved by calling fsync(2) before linking
> > the file, but there may be more performant ways to provide these
> > semantics. In particular, note that the use of the AT_ATOMIC_DATA
> > flag does *not* guarantee that the new link created by linkat(2)
> > will be persisted after a crash.
>
> OK. Just to be clear, mentioning hardlinks and st_nlink is not needed
> in your opinion?
Your previous text stated that it was undefined what would happen to
all hardlinks belonging to the file, and that would imply that if a
file had N hard links, some in the directory which we are modifying,
and some in other directories, that somehow any of them might not be
present after the crash. And that's not the case. Suppose the file
currently has hardlinks test1/foo, test1/quux, and test2/baz --- and
we've called syncfs(2) on the file system so everything is persisted,
and then linkat(2) is used to create a new hardlink, test1/bar.
After a crash, the existence of test1/foo, test1/quux, and test2/baz
is not in question. It's only unclear whether or not test1/bar
exists after the crash.
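
To make the scenario concrete, here is a rough user-space sketch in Python (purely illustrative, not part of the proposed patch). It approximates the data guarantee by calling fsync(2) before link(2), as the proposed man page text describes. The helper names link_after_fsync and demo are hypothetical, os.sync() stands in for syncfs(2), and the sketch assumes a Linux system where fsync on a read-only descriptor is accepted.

```python
import os
import tempfile


def link_after_fsync(src, dst):
    """Flush src to stable storage, then create the hard link dst."""
    fd = os.open(src, os.O_RDONLY)
    try:
        os.fsync(fd)   # the file's data is persisted before the link is made
    finally:
        os.close(fd)
    os.link(src, dst)  # the new name itself may still be lost in a crash


def demo():
    """Recreate the test1/foo, test1/quux, test2/baz example, then add test1/bar."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "test1"))
    os.mkdir(os.path.join(root, "test2"))
    foo = os.path.join(root, "test1", "foo")
    with open(foo, "w") as f:
        f.write("payload\n")
    os.link(foo, os.path.join(root, "test1", "quux"))
    os.link(foo, os.path.join(root, "test2", "baz"))
    os.sync()  # stand-in for syncfs(2): the three existing names are persisted
    link_after_fsync(foo, os.path.join(root, "test1", "bar"))
    return root, sorted(os.listdir(os.path.join(root, "test1")))
```

After demo() returns, only test1/bar is a name whose survival across a crash is uncertain; the other three were persisted by the earlier sync.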
As far as st_nlink is concerned, the presumption is that the file
system itself will be consistent after the crash. So if the hard link
has been persisted, then st_nlink will have been incremented; if it
has not, it won't have been.
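
The consistency expectation for st_nlink can be stated as a small check. This is only a sketch: the helper name nlink_matches_names is hypothetical, and it assumes the caller passes every name the file currently has.

```python
import os
import tempfile


def nlink_matches_names(paths):
    """True if all paths name one inode and st_nlink equals len(paths)."""
    stats = [os.stat(p) for p in paths]
    same_inode = len({(s.st_dev, s.st_ino) for s in stats}) == 1
    return same_inode and stats[0].st_nlink == len(paths)


def demo_nlink():
    """Check the link count before and after adding a second name."""
    d = tempfile.mkdtemp()
    foo = os.path.join(d, "foo")
    bar = os.path.join(d, "bar")
    open(foo, "w").close()
    before = nlink_matches_names([foo])      # one name, st_nlink == 1
    os.link(foo, bar)
    after = nlink_matches_names([foo, bar])  # two names, st_nlink == 2
    return before, after
```

On a consistent file system the check holds in either post-crash outcome: with the list [foo] if the link was lost, or with [foo, bar] if it was persisted.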
Finally, one thing makes it hard to state these sorts of things as
guarantees: sometimes the file system won't *know* whether or not it
can make them. For example, what should we do if the file system is
mounted with nobarrier? If the overall hardware design includes UPSes
or some other kind of battery backup, the guarantee may very well
exist. But the file system code can't know whether or not that is the
case. So my inclination is to allow the file system to accept the
flag even if the mount option nobarrier is in play --- but in that
case, the guarantee holds only if the rest of the system is designed
appropriately.
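
For illustration only, user space can at least see whether barriers were disabled at mount time, even though it cannot see whether the hardware makes that safe. The sketch below parses text in the /proc/self/mounts layout (device, mount point, type, options, ...). The helper name has_nobarrier is hypothetical, and treating both "nobarrier" and "barrier=0" as the relevant spellings is an assumption that may not hold for every file system or kernel version.

```python
def has_nobarrier(mounts_text, mount_point):
    """True if the last mount listed at mount_point has barriers disabled."""
    result = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        # Later entries for the same mount point shadow earlier ones.
        result = "nobarrier" in options or "barrier=0" in options
    return result


# Hypothetical sample in /proc/self/mounts format.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /data ext4 rw,nobarrier,relatime 0 0\n"
)
```

A True result says nothing about whether a battery-backed cache or UPS is present, which is exactly the part the file system cannot know.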
(For that matter, it used to be that there existed hard drives that
lied about whether they had a writeback cache, and/or made the CACHE
FLUSH a no-op so they could win the Winbench benchmarketing wars,
which was worth millions and millions of dollars in sales. So we can
only assume that the hardware isn't lying to us when we use words like
"guarantee".)
- Ted