Message-ID: <20151030035533.GU19199@dastard>
Date: Fri, 30 Oct 2015 14:55:33 +1100
From: Dave Chinner <david@...morbit.com>
To: Ross Zwisler <ross.zwisler@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
"J. Bruce Fields" <bfields@...ldses.org>,
Theodore Ts'o <tytso@....edu>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Dan Williams <dan.j.williams@...el.com>,
Ingo Molnar <mingo@...hat.com>, Jan Kara <jack@...e.com>,
Jeff Layton <jlayton@...chiereds.net>,
Matthew Wilcox <willy@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-nvdimm@...ts.01.org, x86@...nel.org,
xfs@....sgi.com, Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <matthew.r.wilcox@...el.com>
Subject: Re: [RFC 00/11] DAX fsync/msync support
On Thu, Oct 29, 2015 at 02:12:04PM -0600, Ross Zwisler wrote:
> This patch series adds support for fsync/msync to DAX.
>
> Patches 1 through 8 add various utilities that the DAX code will eventually
> need, and the DAX code itself is added by patch 9. Patches 10 and 11 are
> filesystem changes that are needed after the DAX code is added, but these
> patches may change slightly as the filesystem fault handling for DAX is
> being modified ([1] and [2]).
>
> I've marked this series as RFC because I'm still testing, but I wanted to
> get this out there so people would see the direction I was going and
> hopefully comment on any big red flags sooner rather than later.
>
> I realize that we are getting pretty dang close to the v4.4 merge window,
> but I think that if we can get this reviewed and working it's a much better
> solution than the "big hammer" approach that blindly flushes entire PMEM
> namespaces [3].

We need the "big hammer" regardless of fsync. If REQ_FLUSH and
REQ_FUA don't do the right thing when it comes to ordering journal
writes against other IO operations, then the filesystems are not
crash safe. i.e. we need REQ_FLUSH/REQ_FUA to commit all outstanding
changes back to stable storage, just like they do for existing
storage....

> [1] http://oss.sgi.com/archives/xfs/2015-10/msg00523.html
> [2] http://marc.info/?l=linux-ext4&m=144550211312472&w=2
> [3] https://lists.01.org/pipermail/linux-nvdimm/2015-October/002614.html
>
> Ross Zwisler (11):
> pmem: add wb_cache_pmem() to the PMEM API
> mm: add pmd_mkclean()
> pmem: enable REQ_FLUSH handling
> dax: support dirty DAX entries in radix tree
> mm: add follow_pte_pmd()
> mm: add pgoff_mkclean()
> mm: add find_get_entries_tag()
> fs: add get_block() to struct inode_operations

I don't think this is the right thing to do - it propagates the use
of bufferheads as a mapping structure into places where we do not
want bufferheads. We've recently added a similar block mapping
interface to the export operations structure for PNFS and that uses
a "struct iomap", which is far better suited to being an inode
operation than this.

We have plans to move this to the inode operations for various
reasons, e.g. multipage write, adding interfaces that support proper
mapping of holes, etc:

https://www.redhat.com/archives/cluster-devel/2014-October/msg00167.html

So after many years of saying no to moving getblocks to the inode
operations, it seems like the wrong thing to do now, considering I
want to convert all the DAX code to use iomaps while only 2 or 3
filesystems are supported...
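
To make the difference concrete, here is a minimal userspace sketch of
an extent-style mapping call. It is only an illustration: the
sketch_* names, the struct layout and the toy extent table are all
hypothetical and are not the kernel's "struct iomap" definition. The
point it tries to show is that one call can describe a whole
multi-block range, and that holes come back as an explicit extent type
rather than as an unmapped per-block buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent types; not taken from the kernel. */
enum sketch_extent_type { SKETCH_HOLE, SKETCH_MAPPED, SKETCH_UNWRITTEN };

/* Hypothetical extent descriptor: one contiguous byte range of a file. */
struct sketch_iomap {
	uint64_t offset;	/* file offset of the range, in bytes */
	uint64_t length;	/* length of the range, in bytes */
	uint64_t addr;		/* device address in bytes; 0 for holes */
	enum sketch_extent_type type;
};

/* Toy file: extents sorted by offset, non-overlapping, gaps are holes. */
struct sketch_file {
	const struct sketch_iomap *extents;
	size_t nr_extents;
	uint64_t size;		/* file size in bytes */
};

/*
 * Describe the extent (or hole) that starts at pos, trimmed to at most
 * len bytes and to end of file. Returns false if pos is at or past EOF.
 */
static bool sketch_map_range(const struct sketch_file *f, uint64_t pos,
			     uint64_t len, struct sketch_iomap *out)
{
	assert(f != NULL && out != NULL);
	if (pos >= f->size || len == 0)
		return false;

	/* Clamp the end of the request to end of file. */
	uint64_t end = (len > f->size - pos) ? f->size : pos + len;

	/* Assume a hole until an extent covering pos is found. */
	out->offset = pos;
	out->addr = 0;
	out->type = SKETCH_HOLE;

	for (size_t i = 0; i < f->nr_extents; i++) {
		const struct sketch_iomap *e = &f->extents[i];
		uint64_t e_end = e->offset + e->length;

		if (pos >= e_end)
			continue;	/* extent lies entirely before pos */
		if (pos >= e->offset) {
			/* pos is inside this extent */
			out->addr = e->addr + (pos - e->offset);
			out->type = e->type;
			if (e_end < end)
				end = e_end;
		} else if (e->offset < end) {
			/* hole ends where the next extent begins */
			end = e->offset;
		}
		break;
	}
	out->length = end - pos;
	return true;
}
```

A caller walking a large range would loop, advancing pos by the
returned length each time, instead of asking about one block at a
time.
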
> dax: add support for fsync/sync
Why put the dax_flush_mapping() in do_writepages()? Why not call it
directly from the filesystem ->fsync() implementations where a
getblocks callback could also be provided?
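
A rough userspace sketch of the structure being suggested, purely as
an illustration: the filesystem's own fsync method calls the flush
helper and hands it the filesystem's block mapping callback. Every
name here (sketch_dax_flush, sketch_map_fn, sketch_fs_fsync and the
toy inode) is hypothetical; the real dax_flush_mapping() signature is
not given in this thread, and the "flush" below just records the
device address instead of writing back CPU caches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DIRTY 16

/*
 * Hypothetical mapping callback: translate a file offset to a device
 * address. Returns 0 on success, -1 if the offset is not mapped.
 */
typedef int (*sketch_map_fn)(void *fs_private, uint64_t pos,
			     uint64_t *dev_addr);

/* Toy inode: a list of dirty file offsets plus a log of what was flushed. */
struct sketch_inode {
	void *fs_private;
	uint64_t dirty[SKETCH_MAX_DIRTY];
	size_t nr_dirty;
	uint64_t flushed[SKETCH_MAX_DIRTY];	/* device addresses flushed */
	size_t nr_flushed;
};

/*
 * Stand-in for a DAX flush helper: for each dirty offset in
 * [start, end], ask the filesystem where it lives, "flush" it, and
 * drop it from the dirty list.
 */
static int sketch_dax_flush(struct sketch_inode *inode, uint64_t start,
			    uint64_t end, sketch_map_fn map)
{
	size_t i = 0;

	assert(inode != NULL && map != NULL);
	while (i < inode->nr_dirty) {
		uint64_t pos = inode->dirty[i];
		uint64_t addr;

		if (pos < start || pos > end) {
			i++;
			continue;
		}
		if (inode->nr_flushed == SKETCH_MAX_DIRTY)
			return -1;	/* toy log is full */
		if (map(inode->fs_private, pos, &addr) != 0)
			return -1;

		/* Placeholder for the actual cache writeback. */
		inode->flushed[inode->nr_flushed++] = addr;
		/* Entry is clean now: replace it with the last one. */
		inode->dirty[i] = inode->dirty[--inode->nr_dirty];
	}
	return 0;
}

/* Toy filesystem with a linear offset-to-device mapping. */
struct sketch_fs {
	uint64_t base;
};

static int sketch_fs_map(void *fs_private, uint64_t pos, uint64_t *dev_addr)
{
	const struct sketch_fs *fs = fs_private;

	*dev_addr = fs->base + pos;
	return 0;
}

/* The filesystem's fsync method calls the flush helper directly. */
static int sketch_fs_fsync(struct sketch_inode *inode, uint64_t start,
			   uint64_t end)
{
	int err = sketch_dax_flush(inode, start, end, sketch_fs_map);

	if (err)
		return err;
	/* A real filesystem would force its journal/metadata out here. */
	return 0;
}
```

With this shape the generic writeback path never needs to know how to
map blocks; only the filesystem's fsync path does.
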
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com