Message-ID: <20170802221359.GA20666@linux.intel.com>
Date: Wed, 2 Aug 2017 16:13:59 -0600
From: Ross Zwisler <ross.zwisler@...ux.intel.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Ross Zwisler <ross.zwisler@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, "karam . lee" <karam.lee@....com>,
Minchan Kim <minchan@...nel.org>,
Jerome Marchand <jmarchan@...hat.com>,
Nitin Gupta <ngupta@...are.org>, seungho1.park@....com,
Christoph Hellwig <hch@....de>,
Dan Williams <dan.j.williams@...el.com>,
Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.cz>,
Jens Axboe <axboe@...nel.dk>,
Vishal Verma <vishal.l.verma@...el.com>,
linux-nvdimm@...ts.01.org
Subject: Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt
On Fri, Jul 28, 2017 at 10:31:43AM -0700, Matthew Wilcox wrote:
> On Fri, Jul 28, 2017 at 10:56:01AM -0600, Ross Zwisler wrote:
> > Dan Williams and Christoph Hellwig have recently expressed doubt about
> > whether the rw_page() interface made sense for synchronous memory drivers
> > [1][2]. It's unclear whether this interface has any performance benefit
> > for these drivers, but as we continue to fix bugs it is clear that it does
> > have a maintenance burden. This series removes the rw_page()
> > implementations in brd, pmem and btt to relieve this burden.
>
> Why don't you measure whether it has performance benefits? I don't
> understand why zram would see performance benefits and not other drivers.
> If it's going to be removed, then the whole interface should be removed,
> not just have the implementations removed from some drivers.
Okay, I've run a bunch of performance tests with the PMEM and BTT entry
points for rw_page() in a swap workload, and in all cases I see an
improvement over the code path with rw_page() removed.  Here are the results
from my random lab box:

Average latency of swap_writepage()
+------+------------+---------+-------------+
|      | no rw_page | rw_page | Improvement |
+------+------------+---------+-------------+
| PMEM | 5.0 us     | 4.7 us  | 6%          |
+------+------------+---------+-------------+
| BTT  | 6.8 us     | 6.1 us  | 10%         |
+------+------------+---------+-------------+

Average latency of swap_readpage()
+------+------------+---------+-------------+
|      | no rw_page | rw_page | Improvement |
+------+------------+---------+-------------+
| PMEM | 3.3 us     | 2.9 us  | 12%         |
+------+------------+---------+-------------+
| BTT  | 3.7 us     | 3.4 us  | 8%          |
+------+------------+---------+-------------+

The workload was pmbench, a memory benchmark, run on a system where I had
severely restricted the amount of available memory with the 'mem' kernel
command line parameter.  The benchmark was configured to touch more memory
than the OS was allowed to have, so it spilled over into swap.
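
In case it's useful, the setup looked roughly like this.  The sizes and
pmbench arguments below are illustrative rather than the exact values from
my run, and I'm writing the pmbench flags from memory, so check your
build's help output for the exact spelling:

    # Boot with system memory severely restricted, e.g.:
    #   mem=4G
    #
    # Run pmbench with a map size larger than available RAM so that
    # accesses spill over into swap.  Here -m is the map size in MiB
    # and the trailing argument is the run time in seconds:
    pmbench -m 8192 600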

The PMEM or BTT device was set up as my swap device, and during the test I
gathered a few hundred thousand latency samples each for swap_writepage() and
swap_readpage().  The PMEM/BTT device was just memory reserved with the
memmap kernel command line parameter.
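
Concretely, the device setup was along these lines.  The memmap address,
sizes, and device names are examples, not the values from my box:

    # Reserve a legacy pmem region with the memmap kernel parameter,
    # e.g. 16 GiB of memory starting at physical address 16 GiB:
    #   memmap=16G!16G
    #
    # Raw PMEM case: swap directly on the resulting pmem device
    mkswap /dev/pmem0
    swapon /dev/pmem0

    # BTT case: reconfigure the namespace in sector mode, which layers
    # BTT on top, then swap on the resulting device (typically pmem0s)
    ndctl create-namespace -f -e namespace0.0 --mode=sector
    mkswap /dev/pmem0s
    swapon /dev/pmem0s

The latency samples themselves can be gathered in a number of ways;
function_graph tracing of the two entry points is one option:

    # One possible way to sample swap_writepage()/swap_readpage()
    # latencies (not necessarily how I instrumented my run):
    trace-cmd record -p function_graph -g swap_writepage \
                     -g swap_readpage sleep 60
    trace-cmd report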
Thanks, Matthew, for asking for performance data. It looks like removing this
code would have been a mistake.