Message-ID: <20151101233632.GG10656@dastard>
Date:	Mon, 2 Nov 2015 10:36:32 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Theodore Ts'o <tytso@....edu>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Ingo Molnar <mingo@...hat.com>, Jan Kara <jack@...e.com>,
	Jeff Layton <jlayton@...chiereds.net>,
	Matthew Wilcox <willy@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-ext4@...r.kernel.org,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux MM <linux-mm@...ck.org>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	X86 ML <x86@...nel.org>, xfs@....sgi.com,
	Andrew Morton <akpm@...ux-foundation.org>,
	Matthew Wilcox <matthew.r.wilcox@...el.com>
Subject: Re: [RFC 00/11] DAX fsync/msync support

On Fri, Oct 30, 2015 at 12:51:40PM -0700, Dan Williams wrote:
> On Fri, Oct 30, 2015 at 12:43 PM, Ross Zwisler
> <ross.zwisler@...ux.intel.com> wrote:
> > On Fri, Oct 30, 2015 at 11:34:07AM -0700, Dan Williams wrote:
> >> This is great to have when the flush-the-world solution ends up
> >> killing performance.  However, there are a couple mitigating options
> >> for workloads that dirty small amounts and flush often that we need to
> >> collect data on:
> >>
> >> 1/ Using cache management and pcommit from userspace to skip calls to
> >> msync / fsync.  Although, this does not eliminate all calls to
> >> blkdev_issue_flush as the fs may invoke it for other reasons.  I
> >> suspect turning on REQ_FUA support eliminates a number of those
> >> invocations, and pmem already satisfies REQ_FUA semantics by default.
> >
> > Sure, I'll turn on REQ_FUA in addition to REQ_FLUSH - I agree that PMEM
> > already handles the requirements of REQ_FUA, but I didn't realize that it
> > might reduce the number of REQ_FLUSH bios we receive.
> 
> I'll let Dave chime in, but a lot of the flush requirements come from
> guaranteeing the state of the metadata, if metadata updates can be
> done with REQ_FUA then there is no subsequent need to flush.

No need for cache flushes in this case, but we still need the IO
scheduler to order such operations correctly.

> >> 2/ Turn off DAX and use the page cache.  As Dave mentions [1] we
> >> should enable this control on a per-inode basis.  I'm folding in this
> >> capability as a blkdev_ioctl for the next version of the raw block DAX
> >> support patch.
> >
> > Umm...I think you just said "the way to avoid this delay is to just not use
> > DAX".  :)  I don't think this is where we want to go - we are trying to make
> > DAX better, not abandon it.
> 
> That's a bit of an exaggeration.  Avoiding DAX where it is not
> necessary is not "abandoning DAX", it's using the right tool for the
> job.  Page cache is fine for many cases.

Think btrfs - any file that uses COW can't use DAX for writes.
Everything has to be buffered unless the nodatacow flag is set, in
which case DAX can be used. Similarly, on ext4 you can't use DAX if
you are using file encryption.

IOWs, we already know that we have to support mixed DAX/non-DAX
access within the same filesystem, so I'm with Dan here...
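
For illustration only, here is a sketch of the userspace side of that
point: an application issues the same mmap() + msync() sequence whether
the file ends up backed by the page cache or by DAX, and what the
kernel must do to honour it is the subject of this thread. The helper
name persist_via_mmap() is hypothetical and does not come from any
patch in this series.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy msg to the start of the file at path through
 * a shared mapping, then ask the kernel to make it durable with msync().
 * Returns 0 on success, -1 on any failure (including an empty msg, since
 * mmap() rejects zero-length mappings).
 */
int persist_via_mmap(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    if (len == 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by file blocks. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p, msg, len);

    /* MS_SYNC blocks until the kernel reports the range as written back. */
    int rc = msync(p, len, MS_SYNC);

    munmap(p, len);
    close(fd);
    return rc;
}
```

Calling persist_via_mmap("/tmp/example.bin", "hello") should leave a
5-byte file containing "hello". Nothing in the call sequence depends on
whether the filesystem chose DAX or buffered access for that inode.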

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com