Message-ID: <20090813083524.GC21389@elte.hu>
Date: Thu, 13 Aug 2009 10:35:24 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Balbir Singh <balbir@...ux.vnet.ibm.com>
Cc: Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Andrew Morton <akpm@...ux-foundation.org>,
"lizf@...fujitsu.com" <lizf@...fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"menage@...gle.com" <menage@...gle.com>, xemul@...nvz.org,
prarit@...hat.com, andi.kleen@...el.com,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [UPDATED][PATCH][mmotm] Help Root Memory Cgroup Resource
Counters Scale Better (v5)
* Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
> Without Patch
>
> Performance counter stats for '/home/balbir/parallel_pagefault':
>
> 5826093739340 cycles # 809.989 M/sec
> 408883496292 instructions # 0.070 IPC
> 7057079452 cache-references # 0.981 M/sec
> 3036086243 cache-misses # 0.422 M/sec
> With this patch applied
>
> Performance counter stats for '/home/balbir/parallel_pagefault':
>
> 5957054385619 cycles # 828.333 M/sec
> 1058117350365 instructions # 0.178 IPC
> 9161776218 cache-references # 1.274 M/sec
> 1920494280 cache-misses # 0.267 M/sec
Nice how the instruction count and the IPC value increased, and the
cache-miss count decreased.
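To make the comparison concrete, here is a small sketch that recomputes
the derived ratios from the raw counts quoted above. The counter values
are copied from the two runs; the dictionary layout and the helper names
(ipc, miss_ratio) are only illustrative and are not part of perf.

```python
# Raw counts copied from the quoted 'perf stat' output.
without_patch = {
    "cycles": 5826093739340,
    "instructions": 408883496292,
    "cache-references": 7057079452,
    "cache-misses": 3036086243,
}
with_patch = {
    "cycles": 5957054385619,
    "instructions": 1058117350365,
    "cache-references": 9161776218,
    "cache-misses": 1920494280,
}


def ipc(stats):
    """Instructions retired per cycle."""
    return stats["instructions"] / stats["cycles"]


def miss_ratio(stats):
    """Fraction of cache references that missed."""
    return stats["cache-misses"] / stats["cache-references"]


for label, stats in (("without patch", without_patch), ("with patch", with_patch)):
    print(f"{label:14s} IPC={ipc(stats):.3f} miss-ratio={miss_ratio(stats):.3f}")

# How much more work was retired in roughly the same number of cycles.
speedup = with_patch["instructions"] / without_patch["instructions"]
print(f"instructions: {speedup:.2f}x")
```

The IPC values agree with the 0.070 and 0.178 that perf printed, and the
instruction count grows by roughly 2.6x while the cycle count stays
nearly flat.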
Btw., a 'perf stat' suggestion: you can also make use of built-in
error bars via repeating parallel_pagefault N times:

  aldebaran:~> perf stat --repeat 3 /bin/ls

   Performance counter stats for '/bin/ls' (3 runs):

         1.108886  task-clock-msecs   #      0.875 CPUs    ( +-   4.316% )
                0  context-switches   #      0.000 M/sec   ( +-   0.000% )
                0  CPU-migrations     #      0.000 M/sec   ( +-   0.000% )
              254  page-faults        #      0.229 M/sec   ( +-   0.000% )
          3461896  cycles             #   3121.958 M/sec   ( +-   3.508% )
          3044445  instructions       #      0.879 IPC     ( +-   0.134% )
            21213  cache-references   #     19.130 M/sec   ( +-   1.612% )
             2610  cache-misses       #      2.354 M/sec   ( +-  39.640% )

      0.001267355  seconds time elapsed   ( +-   4.762% )

that way even small changes in metrics can be identified as positive
effects of a patch, if the improvement is beyond the error
percentage that perf reports.
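As a rough illustration of that rule of thumb, the sketch below treats a
change as real only if it is larger than the noise of both measurements
added together. This is a deliberately conservative simplification, not
something perf does for you; the function name and the "after" values
are hypothetical, only the "before" values and noise percentages come
from the /bin/ls run.

```python
def change_exceeds_noise(before, after, noise_before_pct, noise_after_pct):
    """Return True if the relative change is larger than the combined noise.

    The noise arguments are the '+-' percentages perf prints next to each
    counter. Adding them is a conservative assumption, not a formal test.
    """
    change_pct = abs(after - before) / before * 100.0
    return change_pct > noise_before_pct + noise_after_pct


# 'instructions': 3044445 at +-0.134%. The 'after' value is made up.
print(change_exceeds_noise(3044445, 3000000, 0.134, 0.134))

# 'cache-misses': 2610 at +-39.640%. The 'after' value is made up.
print(change_exceeds_noise(2610, 2000, 39.640, 39.640))
```

With these inputs the ~1.5% drop in instructions counts as a real change,
while the ~23% drop in cache-misses is still inside the noise.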
For example, in the /bin/ls numbers I cited above, the 'instructions'
value can be trusted up to 99.8% (with ~0.13% noise), while say
the cache-misses value cannot really be trusted, as it has 40% of
noise. (Increasing the repeat count will drive down the noise level
- at the cost of longer measurement time.)
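To get a feel for that trade-off, here is a back-of-the-envelope sketch.
It assumes the runs are independent and that the reported noise shrinks
roughly as 1/sqrt(N) with the repeat count N, which is the usual
behaviour for the error of a mean but is an assumption here, not
something perf guarantees. The function names are hypothetical, and an
estimate based on only 3 runs is itself quite uncertain.

```python
import math


def projected_noise(noise_pct, runs, new_runs):
    """Estimate the noise percentage after new_runs, given noise_pct at runs.

    Assumes independent runs, so noise scales as 1/sqrt(number of runs).
    """
    return noise_pct * math.sqrt(runs / new_runs)


def runs_needed(noise_pct, runs, target_pct):
    """Estimate the repeat count needed to bring the noise down to target_pct."""
    return math.ceil(runs * (noise_pct / target_pct) ** 2)


# cache-misses from the /bin/ls example: +-39.640% over 3 runs.
print(f"{projected_noise(39.640, 3, 48):.2f}% at 48 runs")
print(f"{runs_needed(39.640, 3, 5.0)} runs for ~5% noise")
```

Under these assumptions, going from 3 to 48 runs would cut the
cache-misses noise to about a quarter of its current value.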
For your patch the improvement is so drastic that this isn't needed -
but the error estimations can be quite useful for more borderline
improvements. (They are also useful for finding and proving small
performance regressions.)
Ingo