lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 3 Sep 2015 10:57:25 +1000
From:	Dave Chinner <david@...morbit.com>
To:	"Kirill A. Shutemov" <kirill@...temov.name>
Cc:	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	linux-kernel@...r.kernel.org,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Andrew Morton <akpm@...l.org>, Christoph Hellwig <hch@....de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>, Hugh Dickins <hughd@...gle.com>,
	Ingo Molnar <mingo@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-nvdimm@...ts.01.org, Matthew Wilcox <willy@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org
Subject: Re: [PATCH] dax, pmem: add support for msync

On Wed, Sep 02, 2015 at 12:13:21PM +0300, Kirill A. Shutemov wrote:
> On Wed, Sep 02, 2015 at 08:49:22AM +1000, Dave Chinner wrote:
> > On Tue, Sep 01, 2015 at 01:08:04PM +0300, Kirill A. Shutemov wrote:
> > > On Tue, Sep 01, 2015 at 09:38:03AM +1000, Dave Chinner wrote:
> > > > On Mon, Aug 31, 2015 at 12:59:44PM -0600, Ross Zwisler wrote:
> > > > Even for DAX, msync has to call vfs_fsync_range() for the filesystem to commit
> > > > the backing store allocations to stable storage, so there's not
> > > > getting around the fact msync is the wrong place to be flushing
> > > > DAX mappings to persistent storage.
> > > 
> > > Why?
> > > IIUC, msync() doesn't have any requirements wrt metadata, right?
> > 
> > Of course it does. If the backing store allocation has not been
> > committed, then after a crash there will be a hole in file and
> > so it will read as zeroes regardless of what data was written and
> > flushed.
> 
> Any reason why backing store allocation cannot be committed on *_mkwrite?

Oh, I could change that if you want, it'll just be ridiculously
slow because it requires journal flushes on every page fault that
needs to change the filesytsem block map (i.e. every allocation and/or
every unwritten extent conversion).

Sycnhronous journalling requires flushing the log on every
transaction commit. That involves switching to a work queue, copying
the changes into a log buffer, issuing IO to flush the journal,
waiting for that to complete, etc. i.e.  synchronous journalling
incurs a minimum overhead of 4 context switches per page fault that
needs to allocate/convert backing store, along with all the CPU time
needed to process the journal commit.

> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3f78bceefe5a..f2e29a541e14 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1645,6 +1645,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
>                         vma->vm_ops = &dummy_ops;
>                 }
>  
> +               /*
> +                * Make sure that for VM_MIXEDMAP VMA has both
> +                * vm_ops->page_mkwrite and vm_ops->pfn_mkwrite or has none.
> +                */
> +               if ((vma->vm_ops->page_mkwrite || vma->vm_ops->pfn_mkwrite) &&
> +                               vma->vm_flags & VM_MIXEDMAP) {
> +                       VM_BUG_ON_VMA(!vma->vm_ops->page_mkwrite, vma);
> +                       VM_BUG_ON_VMA(!vma->vm_ops->pfn_mkwrite, vma);
> +               }

Doesn't really help developers that don't use CONFIG_DEBUG_VM. i.e
it's the FS developers that you need to warn, not VM developers -
in this case a "WARN_ON_ONCE" is probably more appropriate.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ