Date: Sun, 10 Jan 2021 23:40:42 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Dan Williams <dan.j.williams@...el.com>,
	Vishal Verma <vishal.l.verma@...el.com>,
	Dave Jiang <dave.jiang@...el.com>,
	Ira Weiny <ira.weiny@...el.com>,
	Matthew Wilcox <willy@...radead.org>,
	Jan Kara <jack@...e.cz>,
	Steven Whitehouse <swhiteho@...hat.com>,
	Eric Sandeen <esandeen@...hat.com>,
	Dave Chinner <dchinner@...hat.com>,
	Theodore Ts'o <tytso@....edu>,
	Wang Jianchao <jianchao.wan9@...il.com>,
	"Kani, Toshi" <toshi.kani@....com>,
	"Norton, Scott J" <scott.norton@....com>,
	"Tadakamadla, Rajesh" <rajesh.tadakamadla@....com>,
	linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org,
	linux-nvdimm@...ts.01.org
Subject: Re: [RFC v2] nvfs: a filesystem for persistent memory

On Sun, Jan 10, 2021 at 04:14:55PM -0500, Mikulas Patocka wrote:

> That's a good point. I split nvfs_rw_iter to separate functions
> nvfs_read_iter and nvfs_write_iter - and inlined nvfs_rw_iter_locked into
> both of them. It improved performance by 1.3%.
>
> > Not that it had been more useful on the write side, really,
> > but that's another story (nvfs_write_pages() handling of
> > copyin is... interesting). Let's figure out what's going
> > on with the read overhead first...
> >
> > lib/iov_iter.c primitives certainly could use massage for
> > better code generation, but let's find out how much of the
> > PITA is due to those and how much comes from you fighting
> > the damn thing instead of using it sanely...
>
> The results are:
>
> read:                                           6.744s
> read_iter:                                      7.417s
> read_iter - separate read and write path:       7.321s
> Al's read_iter:                                 7.182s
> Al's read_iter with _copy_to_iter:              7.181s

So
	* overhead of hardening stuff is noise here
	* switching to more straightforward ->read_iter() cuts
	  the overhead by about 1/3.

Interesting... I wonder how much of that is spent in
iterate_and_advance() glue inside copy_to_iter() here. There's
certainly quite a few optimizations possible in those primitives,
and your usecase makes a decent test for that...

Could you profile that and see where it is spending the time,
at the instruction level?
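
The arithmetic behind the two conclusions above can be checked with a short sketch. Python is used here purely for illustration; the dictionary keys and the helper name are labels made up for this example, and only the timings come from the quoted results. It assumes "overhead" means the extra time relative to the plain ->read() run.

```python
# Timings in seconds, copied from the quoted results.
# The keys are illustrative labels, not identifiers from nvfs.
timings = {
    "read": 6.744,
    "read_iter": 7.417,
    "read_iter_split": 7.321,        # separate read and write path
    "al_read_iter": 7.182,
    "al_read_iter_unchecked": 7.181, # with _copy_to_iter
}

baseline = timings["read"]


def overhead(name):
    """Extra time over the plain read path, in seconds."""
    return timings[name] - baseline


# Gain from splitting nvfs_rw_iter into read and write paths.
split_gain = (timings["read_iter"] - timings["read_iter_split"]) / timings["read_iter"]

# Share of the original read_iter overhead removed by the simpler ->read_iter().
reduction = (overhead("read_iter") - overhead("al_read_iter")) / overhead("read_iter")

# Cost attributable to the hardening checks in copy_to_iter().
hardening_cost = timings["al_read_iter"] - timings["al_read_iter_unchecked"]

print(f"split gain:        {split_gain:.1%}")
print(f"overhead before:   {overhead('read_iter'):.3f}s")
print(f"overhead after:    {overhead('al_read_iter'):.3f}s")
print(f"overhead removed:  {reduction:.1%}")
print(f"hardening cost:    {hardening_cost:.3f}s")
```

Under that assumption the overhead drops from 0.673s to 0.438s, roughly a third, while the hardening difference is 0.001s, which is within noise for a run of about seven seconds.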