Date:	Mon, 8 Aug 2016 19:32:47 +0000
From:	"Kani, Toshimitsu" <toshi.kani@....com>
To:	Jan Kara <jack@...e.cz>, "Boylston, Brian" <brian.boylston@....com>
CC:	Dave Chinner <david@...morbit.com>,
	"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
	"xfs@....sgi.com" <xfs@....sgi.com>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: RE: Subtle races between DAX mmap fault and write path

> > Jan Kara wrote on 2016-08-08:
> > > On Fri 05-08-16 19:58:33, Boylston, Brian wrote:
> >
> > I used NVML 1.1 for the measurements.  In this version and with the
> > hardware that I used, the pmem_persist() flow is:
> >
> >   pmem_persist()
> >     pmem_flush()
> >       Func_flush() == flush_clflush
> >         CLFLUSH
> >     pmem_drain()
> >       Func_predrain_fence() == predrain_fence_empty
> >         no-op
> >
> > So, I don't think that pmem_persist() does anything to cause the filesystem
> > to flush metadata as it doesn't make any system calls?
> 
> Ah, you are right. I somehow misread what is in NVML sources. I agree with
> Christoph that _persist suffix is then misleading for the reasons he stated
> but that's irrelevant to the test you did.
> 
> So it indeed seems that in your test movnt + sfence is an order of
> magnitude faster than cached memcpy + clflush + sfence. I'm surprised, I
> have to say.

movnt stores are posted to a write-combining (WC) buffer, which is evicted to
memory asynchronously once each line is filled.

clflush, on the other hand, is serialized, so it has to evict synchronously,
line by line.  clflushopt, where supported by newer CPUs, should be a lot faster
because the flushes can execute concurrently instead of waiting line by line.
It would still be slower than an uncached copy, though.
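
For illustration only, here is a minimal user-space sketch of the two copy
flavors being compared.  It is not the NVML code: the names copy_movnt(),
copy_clflush() and CACHELINE are hypothetical, a 64-byte cache line is
assumed, and on builds without SSE2 both functions fall back to plain memcpy()
just so the sketch still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_clflush, _mm_sfence */
#endif

#define CACHELINE 64    /* assumed cache line size (hypothetical constant) */

/* movnt + sfence: non-temporal stores are posted to WC buffers and bypass
 * the cache; the sfence orders them before later stores. */
static void copy_movnt(void *dst, const void *src, size_t len)
{
    /* movntdq needs a 16-byte aligned destination. */
    assert((uintptr_t)dst % 16 == 0 && len % 16 == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < len / 16; i++)
        _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
    _mm_sfence();
#else
    memcpy(dst, src, len);  /* fallback: no non-temporal stores here */
#endif
}

/* cached memcpy + clflush + sfence: copy through the cache, then flush
 * every cache line that the destination range touches. */
static void copy_clflush(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
#if defined(__SSE2__)
    uintptr_t p = (uintptr_t)dst & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)dst + len;
    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);
    _mm_sfence();
#endif
}
```

Both functions leave the same bytes in dst; they differ only in how the data
reaches memory, which is where the timing difference in the test comes from.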

-Toshi 