Message-ID: <20210110234042.GX3579531@ZenIV.linux.org.uk>
Date: Sun, 10 Jan 2021 23:40:42 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Dan Williams <dan.j.williams@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Ira Weiny <ira.weiny@...el.com>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Steven Whitehouse <swhiteho@...hat.com>,
Eric Sandeen <esandeen@...hat.com>,
Dave Chinner <dchinner@...hat.com>,
Theodore Ts'o <tytso@....edu>,
Wang Jianchao <jianchao.wan9@...il.com>,
"Kani, Toshi" <toshi.kani@....com>,
"Norton, Scott J" <scott.norton@....com>,
"Tadakamadla, Rajesh" <rajesh.tadakamadla@....com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-nvdimm@...ts.01.org
Subject: Re: [RFC v2] nvfs: a filesystem for persistent memory

On Sun, Jan 10, 2021 at 04:14:55PM -0500, Mikulas Patocka wrote:
> That's a good point. I split nvfs_rw_iter to separate functions
> nvfs_read_iter and nvfs_write_iter - and inlined nvfs_rw_iter_locked into
> both of them. It improved performance by 1.3%.
>
> > Not that it had been more useful on the write side, really,
> > but that's another story (nvfs_write_pages() handling of
> > copyin is... interesting). Let's figure out what's going
> > on with the read overhead first...
> >
> > lib/iov_iter.c primitives certainly could use massage for
> > better code generation, but let's find out how much of the
> > PITA is due to those and how much comes from you fighting
> > the damn thing instead of using it sanely...
>
> The results are:
>
> read: 6.744s
> read_iter: 7.417s
> read_iter - separate read and write path: 7.321s
> Al's read_iter: 7.182s
> Al's read_iter with _copy_to_iter: 7.181s

So
	* overhead of hardening stuff is noise here
	* switching to a more straightforward ->read_iter() cuts
	  the overhead by about 1/3.

Interesting... I wonder how much of that is spent in
iterate_and_advance() glue inside copy_to_iter() here. There are
certainly quite a few optimizations possible in those
primitives and your use case makes a decent test for that...

Could you profile that and see where it is spending
the time, at the instruction level?