Message-ID: <96438a42-9510-444b-f90e-ed4e12f356c9@oracle.com>
Date: Tue, 19 Dec 2017 11:42:35 -0800
From: Rao Shoaib <rao.shoaib@...cle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: linux-kernel@...r.kernel.org, paulmck@...ux.vnet.ibm.com,
brouer@...hat.com, linux-mm@...ck.org
Subject: Re: [PATCH] kfree_rcu() should use the new kfree_bulk() interface for
freeing rcu structures
On 12/19/2017 11:12 AM, Matthew Wilcox wrote:
> On Tue, Dec 19, 2017 at 09:52:27AM -0800, rao.shoaib@...cle.com wrote:
>> This patch updates kfree_rcu to use new bulk memory free functions as they
>> are more efficient. It also moves kfree_call_rcu() out of rcu related code to
>> mm/slab_common.c
>>
>> Signed-off-by: Rao Shoaib <rao.shoaib@...cle.com>
>> ---
>> include/linux/mm.h | 5 ++
>> kernel/rcu/tree.c | 14 ----
>> kernel/sysctl.c | 40 +++++++++++
>> mm/slab.h | 23 +++++++
>> mm/slab_common.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>> 5 files changed, 264 insertions(+), 16 deletions(-)
> You've added an awful lot of code. Do you have any performance measurements
> that shows this to be a win?
I did some micro-benchmarking while developing the code and did see
performance gains -- see attached.
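For context, the shape of the change is roughly the following. This is a
minimal sketch, not the actual patch: kfree_bulk() is the real slab
interface, but the batch structure and helper names here are illustrative.

/*
 * Sketch: pointers queued by kfree_rcu() are accumulated and, once a
 * grace period has elapsed, freed with a single kfree_bulk() call
 * instead of one kfree() per object.
 */
#define KFREE_BULK_MAX 16

struct kfree_rcu_batch {
        unsigned int nr;
        void *ptrs[KFREE_BULK_MAX];
};

static void kfree_rcu_batch_flush(struct kfree_rcu_batch *b)
{
        if (b->nr) {
                /* one pass over the slab allocator for all queued objects */
                kfree_bulk(b->nr, b->ptrs);
                b->nr = 0;
        }
}

The expected win is from amortizing the per-call overhead across many
objects freed together.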
I tried several networking benchmarks but was not able to get any
improvement. The reason is that these benchmarks do not exercise the
code being improved. So I looked through the kernel source for users of
kfree_rcu(). It turns out that the directory-deletion code calls
kfree_rcu() to free an entry's data structure when the entry is
deleted. Based on that, I created two benchmarks.
1) make_dirs -- This benchmark creates a multi-level directory structure
and then deletes it. The deletion phase is where we see the performance
gain, about 8.3%; creation time remains the same. (A minimal sketch of
this workload appears after the list.)
This benchmark was derived from the fdtree benchmark at
https://computing.llnl.gov/?set=code&page=sio_downloads ==>
https://github.com/llnl/fdtree
2) tsock -- I also noticed that a socket has an entry in a directory and
that when the socket is closed the directory entry is deleted. So I
wrote a simple benchmark that loops a million times, opening and closing
10 sockets per iteration. This shows an improvement of 7.6%. (A sketch
of this workload also appears below.)
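To make the workloads concrete, here is a minimal sketch of what
make_dirs does; the attached tarball is the actual benchmark, and the
depth/fanout constants here are illustrative only.

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#define FANOUT 10

static void build(const char *base, int depth)
{
        char path[4096];
        int i;

        if (depth == 0)
                return;
        for (i = 0; i < FANOUT; i++) {
                snprintf(path, sizeof(path), "%s/d%d", base, i);
                mkdir(path, 0755);
                build(path, depth - 1);
        }
}

static void destroy(const char *base, int depth)
{
        char path[4096];
        int i;

        if (depth == 0)
                return;
        for (i = 0; i < FANOUT; i++) {
                snprintf(path, sizeof(path), "%s/d%d", base, i);
                destroy(path, depth - 1);
                /* removing the entry is what hits the kfree_rcu() path */
                rmdir(path);
        }
}

int main(void)
{
        mkdir("fdtree_root", 0755);
        build("fdtree_root", 3);        /* time creation separately */
        destroy("fdtree_root", 3);      /* deletion shows the ~8.3% gain */
        rmdir("fdtree_root");
        return 0;
}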
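And a corresponding sketch of the tsock workload, again with
illustrative constants:

#include <sys/socket.h>
#include <unistd.h>

#define ITERS  1000000
#define NSOCKS 10

int main(void)
{
        int fds[NSOCKS];
        int i, j;

        for (i = 0; i < ITERS; i++) {
                for (j = 0; j < NSOCKS; j++)
                        fds[j] = socket(AF_INET, SOCK_STREAM, 0);
                for (j = 0; j < NSOCKS; j++)
                        if (fds[j] >= 0)
                                /* close() deletes the socket's directory
                                 * entry, exercising kfree_rcu() */
                                close(fds[j]);
        }
        return 0;
}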
I have attached the benchmarks and results. The "Unchanged" results are
for the stock kernel; the "Changed" results are for the modified kernel.
Shoaib
Download attachment "make_dirs.tar" of type "application/x-tar" (10240 bytes)
Download attachment "tsock.tar" of type "application/x-tar" (10240 bytes)