Message-ID: <20131220155143.GA22595@localhost>
Date: Fri, 20 Dec 2013 23:51:43 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Mel Gorman <mgorman@...e.de>
Cc: Alex Shi <alex.shi@...aro.org>, Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
H Peter Anvin <hpa@...or.com>, Linux-X86 <x86@...nel.org>,
Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/4] Fix ebizzy performance regression due to X86 TLB
range flush v2
On Thu, Dec 19, 2013 at 02:34:50PM +0000, Mel Gorman wrote:
> On Wed, Dec 18, 2013 at 03:28:14PM +0800, Fengguang Wu wrote:
> > Hi Mel,
> >
> > I'd like to share some test numbers with your patches applied on top of v3.13-rc3.
> >
> > Basically there are
> >
> > 1) no big performance changes
> >
> > 76628486 -0.7% 76107841 TOTAL vm-scalability.throughput
> > 407038 +1.2% 412032 TOTAL hackbench.throughput
> > 50307 -1.5% 49549 TOTAL ebizzy.throughput
> >
>
> I'm assuming this was an ivybridge processor.
The test boxes brickland2 and lkp-ib03 are ivybridge; lkp-snb01 is sandybridge.
> How many threads were ebizzy tested with?
The below case has params string "400%-5-30", which means
nr_threads = 400% * nr_cpu = 4 * 48 = 192
iterations = 5
duration = 30
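For reference, the expansion above can be sketched as below; the variable names are illustrative, not the actual LKP job scripts:

```shell
# Hypothetical sketch of how a "400%-5-30" params string expands
# (nr_cpu=48 matches lkp-ib03; all names here are illustrative):
params="400%-5-30"
nr_cpu=48
IFS=- read pct iterations duration <<< "$params"
nr_threads=$(( ${pct%\%} * nr_cpu / 100 ))   # 400% of 48 CPUs
echo "$nr_threads $iterations $duration"     # prints: 192 5 30
```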
v3.13-rc3 eabb1f89905a0c809d13
--------------- -------------------------
50307 ~ 1% -1.5% 49549 ~ 0% lkp-ib03/micro/ebizzy/400%-5-30
50307 -1.5% 49549 TOTAL ebizzy.throughput
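(The "-1.5%" column is just the relative delta of the two per-kernel averages; a quick sketch using the row above:

```shell
# Change percent as shown in the comparison tables, computed from the
# ebizzy row above (50307 on v3.13-rc3, 49549 on the patched HEAD):
base=50307
head=49549
awk -v b=$base -v h=$head 'BEGIN { printf "%+.1f%%\n", (h - b) / b * 100 }'
# prints: -1.5%
```
)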
> The memory ranges used by the vm scalability benchmarks are
> probably too large to be affected by the series but I'm guessing.
Do you mean these lines?
3345155 ~ 0% -0.3% 3335172 ~ 0% brickland2/micro/vm-scalability/16G-shm-pread-rand-mt
33249939 ~ 0% +3.3% 34336155 ~ 1% brickland2/micro/vm-scalability/1T-shm-pread-seq
The two cases run 128 threads/processes, each concurrently accessing a 64GB
shm file randomly/sequentially. Sorry, the 16G/1T prefixes are somewhat misleading.
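A rough, scaled-down sketch of what the shm-pread-seq case does (an assumption
based on the description above, not the actual vm-scalability sources; the
real runs use a 64GB file and 128 readers):

```shell
# Scaled-down illustration: N readers sequentially reading one shm file.
# Real test: 64GB file, 128 readers.
f=/dev/shm/vmscale-demo
dd if=/dev/zero of=$f bs=1M count=1 2>/dev/null      # real test: 64GB
for i in $(seq 4); do                                # real test: 128
	dd if=$f of=/dev/null bs=64k 2>/dev/null &   # concurrent sequential reads
done
wait
rm -f $f
echo done
```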
> I doubt hackbench is doing any flushes and the 1.2% is noise.
Here are the proc-vmstat.nr_tlb_remote_flush numbers for hackbench:
513 ~ 3% +4.3e+16% 2.192e+17 ~85% lkp-nex05/micro/hackbench/800%-process-pipe
603 ~ 3% +7.7e+16% 4.669e+17 ~13% lkp-nex05/micro/hackbench/800%-process-socket
6124 ~17% +5.7e+15% 3.474e+17 ~26% lkp-nex05/micro/hackbench/800%-threads-pipe
7565 ~49% +5.5e+15% 4.128e+17 ~68% lkp-nex05/micro/hackbench/800%-threads-socket
21252 ~ 6% +1.3e+15% 2.728e+17 ~39% lkp-snb01/micro/hackbench/1600%-threads-pipe
24516 ~16% +8.3e+14% 2.034e+17 ~53% lkp-snb01/micro/hackbench/1600%-threads-socket
I tried rebuilding the kernels with distclean and this time got the
hackbench changes below. I'll queue the hackbench test on all our test
boxes to get a more complete evaluation.
v3.13-rc3 eabb1f89905a0c809d13
--------------- -------------------------
232925 ~ 0% -8.4% 213339 ~ 5% lkp-snb01/micro/hackbench/1600%-process-pipe
232925 -8.4% 213339 TOTAL hackbench.throughput
This time, the ebizzy params are refreshed and the test case is
exercised on all our test machines. The results that changed are:
v3.13-rc3 eabb1f89905a0c809d13
--------------- -------------------------
873 ~ 0% +0.7% 879 ~ 0% lkp-a03/micro/ebizzy/200%-100-10
873 ~ 0% +0.7% 879 ~ 0% lkp-a04/micro/ebizzy/200%-100-10
873 ~ 0% +0.8% 880 ~ 0% lkp-a06/micro/ebizzy/200%-100-10
49242 ~ 0% -1.2% 48650 ~ 0% lkp-ib03/micro/ebizzy/200%-100-10
26176 ~ 0% -1.6% 25760 ~ 0% lkp-sbx04/micro/ebizzy/200%-100-10
2738 ~ 0% +0.2% 2744 ~ 0% lkp-t410/micro/ebizzy/200%-100-10
80776 -1.2% 79793 TOTAL ebizzy.throughput
The full change set is attached.
> > 2) huge proc-vmstat.nr_tlb_* increases
> >
> > 99986527 +3e+14% 2.988e+20 TOTAL proc-vmstat.nr_tlb_local_flush_one
> > 3.812e+08 +2.2e+13% 8.393e+19 TOTAL proc-vmstat.nr_tlb_remote_flush_received
> > 3.301e+08 +2.2e+13% 7.241e+19 TOTAL proc-vmstat.nr_tlb_remote_flush
> > 5990864 +1.2e+15% 7.032e+19 TOTAL proc-vmstat.nr_tlb_local_flush_all
> >
>
> The accounting changes can be mostly explained by "x86: mm: Clean up
> inconsistencies when flushing TLB ranges". flush_all was simply not
> being counted before so I would claim that the old figure was simply
> wrong and did not reflect reality.
>
> Alterations on when range versus global flushes would affect the other
> counters but arguably it's now behaving as originally intended by the tlb
> flush shift.
OK.
> > Here are the detailed numbers. eabb1f89905a0c809d13 is the HEAD commit
> > with 4 patches applied. The "~ N%" notations are the stddev percent.
> > The "[+-] N%" notations are the increase/decrease percent. The
> > brickland2, lkp-snb01, lkp-ib03 etc. are testbox names.
> >
>
> Are positive numbers always better?
Not necessarily. A positive change merely means the absolute number
for hackbench.throughput, ebizzy.throughput, etc. increased in the
new kernel. But yes, for the above stats it happens to be "the higher,
the better".
> If so, most of these figures look good to me and support the series
> being merged. Please speak up if that is in error.
Agreed, except that I'll need to re-evaluate the hackbench test case.
> I do see a few major regressions like this
>
> > 324497 ~ 0% -100.0% 0 ~ 0% brickland2/micro/vm-scalability/16G-truncate
>
> but I have no idea what the test is doing and whether something happened
> that the test broke that time or if it's something to be really
> concerned about.
This test case simply creates sparse files, populates them with zeros,
then deletes them in parallel. Here $mem is the physical memory size
(128G) and $nr_cpu is 120.
# create one sparse file per CPU, then read each back to fault in the
# zero pages
for i in `seq $nr_cpu`
do
	create_sparse_file $SPARSE_FILE-$i $((mem / nr_cpu))
	cp $SPARSE_FILE-$i /dev/null
done

# delete all the files in parallel
for i in `seq $nr_cpu`
do
	rm $SPARSE_FILE-$i &
done
Thanks,
Fengguang
View attachment "eabb1f89905a0c809d13ec27795ced089c107eb8" of type "text/plain" (74167 bytes)