lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150902190401.GC32255@linux.intel.com>
Date:	Wed, 2 Sep 2015 13:04:01 -0600
From:	Ross Zwisler <ross.zwisler@...ux.intel.com>
To:	Boaz Harrosh <boaz@...xistor.com>
Cc:	Christoph Hellwig <hch@....de>, Dave Chinner <david@...morbit.com>,
	Andrew Morton <akpm@...l.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	linux-nvdimm@...ts.01.org, Peter Zijlstra <peterz@...radead.org>,
	x86@...nel.org, Hugh Dickins <hughd@...gle.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Ingo Molnar <mingo@...hat.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	"H. Peter Anvin" <hpa@...or.com>, linux-fsdevel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH] dax, pmem: add support for msync

On Tue, Sep 01, 2015 at 03:18:41PM +0300, Boaz Harrosh wrote:
> So the approach we took was a bit different to exactly solve these
> problem, and to also not over flush too much. here is what we did.
> 
> * At vm_operations_struct we also override the .close vector (say call it dax_vm_close)
> 
> * At dax_vm_close() on writable files call ->fsync(,vma->vm_start, vma->vm_end,)
>   (We have an inode flag if the file was actually dirtied, but even if not, that will
>    not be that bad, so a file was opened for write, mmapped, but actually never
>    modified. Not a lot of these, and the do nothing cl_flushing is very fast)
> 
> * At ->fsync() do the actual cl_flush for all cases but only iff
> 	if (mapping_mapped(inode->i_mapping) == 0)
> 		return 0;
> 
>   This is because data written not through mmap is already persistent and we
>   do not need the cl_flushing
> 
> Apps expect all these to work:
> 1. open mmap m-write msync ... close
> 2. open mmap m-write fsync ... close
> 3. open mmap m-write unmap ... fsync close
> 
> 4. open mmap m-write sync ...

So basically you made close have an implicit fsync?  What about the flow that
looks like this:

5. open mmap close m-write

This guy definitely needs an msync/fsync at the end to make sure that the
m-write becomes durable.  

Also, the CLOSE(2) man page specifically says that a flush does not occur at
close:
	A successful close does not guarantee that the data has been
	successfully  saved  to  disk,  as  the  kernel defers  writes.   It
	is not common for a filesystem to flush the buffers when the stream is
	closed.  If you need to be sure that the data is physically stored,
	use fsync(2).  (It will depend on the disk  hardware  at this point.)

I don't think that adding an implicit fsync to close is the right solution -
we just need to get msync and fsync correctly working.

> The first 3 are supported with above, because what happens is that at [3]
> the fsync actually happens on unmap and fsync is redundant in that case.
> 
> The only broken scenario is [3]. We do not have a list of "dax-dirty" inodes
> per sb to iterate on and call inode-sync on. This cause problems mostly in
> freeze because with actual [3] scenario the file will be eventually closed
> and persistent, but after the call to sync returns.
> 
> Its on my TODO to fix [3] based on instructions from Dave.
> The mmap call will put the inode on the list and the dax_vm_close will
> remove it. One of the regular dirty list should be used as suggested by
> Dave.

I believe in the above two paragraphs you meant [4], so the 

4. open mmap m-write sync ...

case needs to be fixed so that we can detect DAX-dirty inodes?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ