[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20160218001223.GJ19486@dastard>
Date: Thu, 18 Feb 2016 11:12:23 +1100
From: Dave Chinner <david@...morbit.com>
To: Ross Zwisler <ross.zwisler@...ux.intel.com>,
Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org,
"J. Bruce Fields" <bfields@...ldses.org>,
Theodore Ts'o <tytso@....edu>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Andrew Morton <akpm@...ux-foundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Jan Kara <jack@...e.com>,
Jeff Layton <jlayton@...chiereds.net>,
Jens Axboe <axboe@...nel.dk>,
Matthew Wilcox <willy@...ux.intel.com>,
linux-block@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-nvdimm@...ts.01.org, xfs@....sgi.com
Subject: Re: [PATCH v3 3/6] ext4: Online defrag not supported with DAX
On Wed, Feb 17, 2016 at 02:50:37PM -0700, Ross Zwisler wrote:
> On Tue, Feb 16, 2016 at 08:34:16PM -0700, Ross Zwisler wrote:
> > Online defrag operations for ext4 are hard coded to use the page cache.
> > See ext4_ioctl() -> ext4_move_extents() -> move_extent_per_page()
> >
> > When combined with DAX I/O, which circumvents the page cache, this can
> > result in data corruption. This was observed with xfstests ext4/307 and
> > ext4/308.
> >
> > Fix this by only allowing online defrag for non-DAX files.
>
> Jan,
>
> Thinking about this a bit more, it's probably the case that the data
> corruption I was observing was due to us skipping the writeback of the dirty
> page cache pages because S_DAX was set.
>
> I do think we have a problem with defrag because it is doing the extent
> swapping using the page cache, and we won't flush the dirty pages due to
> S_DAX being set.
>
> This patch is the quick and easy answer, and is perhaps appropriate for v4.5.
>
> Looking forward, though, what do you think the correct solution is? Making an
> extent swapper that doesn't use the page cache (as I believe XFS has? see
> xfs_swap_extents()),
XFS does the data copy in userspace using direct IO so we don't
care about whether DAX is enabled or not on either the source or
destination inode. i.e. xfs_swap_extents() is a pure
metadata operation, swapping the entire extent tree between two
inodes if the source data has not changed while the copy was in
progress.
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com
Powered by blists - more mailing lists