Date:   Sun, 10 Jan 2021 23:40:42 +0000
From:   Al Viro <>
To:     Mikulas Patocka <>
Cc:     Andrew Morton <>,
        Dan Williams <>,
        Vishal Verma <>,
        Dave Jiang <>,
        Ira Weiny <>,
        Matthew Wilcox <>, Jan Kara <>,
        Steven Whitehouse <>,
        Eric Sandeen <>,
        Dave Chinner <>,
        Theodore Ts'o <>,
        Wang Jianchao <>,
        "Kani, Toshi" <>,
        "Norton, Scott J" <>,
        "Tadakamadla, Rajesh" <>
Subject: Re: [RFC v2] nvfs: a filesystem for persistent memory

On Sun, Jan 10, 2021 at 04:14:55PM -0500, Mikulas Patocka wrote:

> That's a good point. I split nvfs_rw_iter to separate functions 
> nvfs_read_iter and nvfs_write_iter - and inlined nvfs_rw_iter_locked into 
> both of them. It improved performance by 1.3%.
> > Not that it had been more useful on the write side, really,
> > but that's another story (nvfs_write_pages() handling of
> > copyin is... interesting).  Let's figure out what's going
> > on with the read overhead first...
> > 
> > lib/iov_iter.c primitives certainly could use massage for
> > better code generation, but let's find out how much of the
> > PITA is due to those and how much comes from you fighting
> > the damn thing instead of using it sanely...
> The results are:
> read:                                           6.744s
> read_iter:                                      7.417s
> read_iter - separate read and write path:       7.321s
> Al's read_iter:                                 7.182s
> Al's read_iter with _copy_to_iter:              7.181s

	* overhead of hardening stuff is noise here
	* switching to more straightforward ->read_iter() cuts
the overhead by about 1/3.

	Interesting...  I wonder how much of that is spent in
iterate_and_advance() glue inside copy_to_iter() here.  There's
certainly quite a bit of optimizations possible in those
primitives and your usecase makes a decent test for that...

	Could you profile that and see where it is spending
the time, at the instruction level?
