Message-Id: <c1e5a3d4-c5ac-d6ee-88ab-d9e2aa433b16@linux.vnet.ibm.com>
Date: Mon, 9 Oct 2017 13:07:36 +0530
From: Anshuman Khandual <khandual@...ux.vnet.ibm.com>
To: Aaron Lu <aaron.lu@...el.com>, linux-mm <linux-mm@...ck.org>,
lkml <linux-kernel@...r.kernel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Andi Kleen <ak@...ux.intel.com>,
Dave Hansen <dave.hansen@...el.com>,
Huang Ying <ying.huang@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Kemi Wang <kemi.wang@...el.com>
Subject: Re: [PATCH] page_alloc.c: inline __rmqueue()
On 10/09/2017 11:14 AM, Aaron Lu wrote:
> __rmqueue() is called by rmqueue_bulk() and rmqueue() under zone->lock
> and that lock can be heavily contended with memory intensive applications.
>
> Since __rmqueue() is a small function, inlining it can save us some time.
> With the will-it-scale/page_fault1/process benchmark, when using nr_cpu
> processes to stress the buddy allocator:
>
> On a 2 sockets Intel-Skylake machine:
> base %change head
> 77342 +6.3% 82203 will-it-scale.per_process_ops
>
> On a 4 sockets Intel-Skylake machine:
> base %change head
> 75746 +4.6% 79248 will-it-scale.per_process_ops
>
> This patch adds inline to __rmqueue().
>
> Signed-off-by: Aaron Lu <aaron.lu@...el.com>
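
For anyone reading the archive without the hunk handy: the patch body is not
quoted above, so the snippet below is only a sketch of the described change,
not the exact diff. The signature is an assumption based on mm/page_alloc.c
around that time; the change itself is just the added 'inline':

	/*
	 * Sketch only -- exact hunk not quoted in this thread. Adding
	 * 'inline' lets the compiler fold __rmqueue() into its callers,
	 * rmqueue() and rmqueue_bulk(), both of which invoke it with
	 * zone->lock held, shaving a call off the contended section.
	 */
	static inline
	struct page *__rmqueue(struct zone *zone, unsigned int order,
			       int migratetype)
	{
		/*
		 * ... body unchanged: __rmqueue_smallest() plus the CMA
		 * and migratetype fallback handling ...
		 */
	}
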
Ran it through the kernel bench and ebizzy micro-benchmarks. Results
were comparable with and without the patch. Maybe these are not the
appropriate tests for this inlining improvement; in any case, it does
not show any performance degradation either.
Reviewed-by: Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Tested-by: Anshuman Khandual <khandual@...ux.vnet.ibm.com>