Message-ID: <20170510114840.GF25137@quack2.suse.cz>
Date: Wed, 10 May 2017 13:48:40 +0200
From: Jan Kara <jack@...e.cz>
To: Jeff Layton <jlayton@...hat.com>
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-btrfs@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-cifs@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-mm@...ck.org, jfs-discussion@...ts.sourceforge.net,
linux-xfs@...r.kernel.org, cluster-devel@...hat.com,
linux-f2fs-devel@...ts.sourceforge.net,
v9fs-developer@...ts.sourceforge.net, linux-nilfs@...r.kernel.org,
linux-block@...r.kernel.org, dhowells@...hat.com,
akpm@...ux-foundation.org, hch@...radead.org,
ross.zwisler@...ux.intel.com, mawilcox@...rosoft.com,
jack@...e.com, viro@...iv.linux.org.uk, corbet@....net,
neilb@...e.de, clm@...com, tytso@....edu, axboe@...nel.dk,
josef@...icpanda.com, hubcap@...ibond.com, rpeterso@...hat.com,
bo.li.liu@...cle.com
Subject: Re: [PATCH v4 14/27] fs: new infrastructure for writeback error handling and reporting

On Tue 09-05-17 11:49:17, Jeff Layton wrote:
> Most filesystems currently use mapping_set_error and
> filemap_check_errors for setting and reporting/clearing writeback errors
> at the mapping level. filemap_check_errors is indirectly called from
> most of the filemap_fdatawait_* functions and from
> filemap_write_and_wait*. These functions are called from all sorts of
> contexts to wait on writeback to finish -- e.g. mostly in fsync, but
> also in truncate calls, getattr, etc.
>
> The non-fsync callers are problematic. We should be reporting writeback
> errors during fsync, but many places spread over the tree clear out
> errors before they can be properly reported, or report errors at
> nonsensical times.
>
> If I get -EIO on a stat() call, there is no reason for me to assume that
> it is because some previous writeback failed. The fact that it also
> clears out the error such that a subsequent fsync returns 0 is a bug,
> and a nasty one since that's potentially silent data corruption.
>
> This patch adds a small bit of new infrastructure for setting and
> reporting errors during address_space writeback. While the above was my
> original impetus for adding this, I think it's also the case that
> current fsync semantics are just problematic for userland. Most
> applications that call fsync do so to ensure that the data they wrote
> has hit the backing store.
>
> In the case where there are multiple writers to the file at the same
> time, this is really hard to determine. The first one to call fsync will
> see any stored error, and the rest get back 0. The processes with open
> fds may not be associated with one another in any way. They could even
> be in different containers, so ensuring coordination between all fsync
> callers is not really an option.
>
> One way to remedy this would be to track what file descriptor was used
> to dirty the file, but that's rather cumbersome and would likely be
> slow. However, there is a simpler way to improve the semantics here
> without incurring too much overhead.
>
> This set adds an errseq_t to struct address_space, and a corresponding
> one is added to struct file. Writeback errors are recorded in the
> mapping's errseq_t, and the one in struct file is used as the "since"
> value.
>
> This changes the semantics of the Linux fsync implementation such that
> applications can now use it to determine whether there were any
> writeback errors since fsync(fd) was last called (or since the file was
> opened in the case of fsync having never been called).
>
> Note that those writeback errors may have occurred when writing data
> that was dirtied via an entirely different fd, but that's the case now
> with the current mapping_set_error/filemap_check_errors infrastructure.
> This will at least prevent you from getting a false report of success.
>
> The new behavior is still consistent with the POSIX spec, and is more
> reliable for application developers. This patch just adds some basic
> infrastructure for doing this. Later patches will change the existing
> code to use this new infrastructure.
>
> Signed-off-by: Jeff Layton <jlayton@...hat.com>

Just one nit below. Otherwise the patch looks good to me. You can add:

Reviewed-by: Jan Kara <jack@...e.cz>

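
For readers following along, the difference between the old flag-based
reporting and the "since" semantics described in the changelog can be modeled
in userspace roughly as below. This is only a sketch under stated assumptions:
every name in it (old_mapping, model_mapping, model_file, model_fsync and so
on) is hypothetical, and it does not reproduce the errseq_t implementation from
this series, which packs its state into a single word and has to cope with
concurrent updates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Old scheme (simplified model): one sticky flag, cleared when reported. */
struct old_mapping {
	bool eio;
};

static int old_check_errors(struct old_mapping *m)
{
	if (!m->eio)
		return 0;
	m->eio = false;		/* whoever checks first consumes the error */
	return -EIO;
}

/* New scheme (simplified model): a counter plus the last error, never cleared. */
struct model_mapping {
	unsigned int seq;	/* bumped each time an error is recorded */
	int err;		/* most recent negative errno, 0 if none */
};

struct model_file {
	struct model_mapping *mapping;
	unsigned int since;	/* value of seq when this file last sampled it */
};

static void model_set_error(struct model_mapping *m, int err)
{
	m->err = err;
	m->seq++;
}

static void model_open(struct model_file *f, struct model_mapping *m)
{
	f->mapping = m;
	f->since = m->seq;	/* sample at open time */
}

static int model_fsync(struct model_file *f)
{
	if (f->since == f->mapping->seq)
		return 0;	/* nothing recorded since the last sample */
	f->since = f->mapping->seq;
	return f->mapping->err;
}

/* Returns 0 if the model behaves the way the changelog describes. */
int run_demo(void)
{
	struct old_mapping old = { .eio = true };
	struct model_mapping map = { 0 };
	struct model_file a, b, late;

	/* Old: the first checker sees -EIO, every later one sees 0. */
	assert(old_check_errors(&old) == -EIO);
	assert(old_check_errors(&old) == 0);

	/* New: each open file reports the error once, independently. */
	model_open(&a, &map);
	model_open(&b, &map);
	model_set_error(&map, -EIO);
	assert(model_fsync(&a) == -EIO);
	assert(model_fsync(&b) == -EIO);
	assert(model_fsync(&a) == 0);

	/* A file opened after the error has nothing to report. */
	model_open(&late, &map);
	assert(model_fsync(&late) == 0);
	return 0;
}
```

In this model the per-file "since" value is what lets two unrelated openers
each learn about the same failure, whereas the old flag is gone as soon as any
caller, fsync or not, checks it.
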
> diff --git a/fs/file_table.c b/fs/file_table.c
> index 954d510b765a..d6138b6411ff 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -168,6 +168,7 @@ struct file *alloc_file(const struct path *path, fmode_t mode,
>  	file->f_path = *path;
>  	file->f_inode = path->dentry->d_inode;
>  	file->f_mapping = path->dentry->d_inode->i_mapping;
> +	file->f_wb_err = filemap_sample_wb_error(file->f_mapping);

Why do you sample here when you also sample in do_dentry_open()? I didn't
find any alloc_file() callers that would possibly care about writeback
errors...

Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR