Message-ID: <20090813084316.GI5087@balbir.in.ibm.com>
Date: Thu, 13 Aug 2009 14:13:16 +0530
From: Balbir Singh <balbir@...ux.vnet.ibm.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
Andrew Morton <akpm@...ux-foundation.org>,
"lizf@...fujitsu.com" <lizf@...fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"menage@...gle.com" <menage@...gle.com>, xemul@...nvz.org,
prarit@...hat.com, andi.kleen@...el.com,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [UPDATED][PATCH][mmotm] Help Root Memory Cgroup Resource
Counters Scale Better (v5)
* Ingo Molnar <mingo@...e.hu> [2009-08-13 10:35:24]:
>
> * Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
>
> > Without Patch
> >
> > Performance counter stats for '/home/balbir/parallel_pagefault':
> >
> >  5826093739340  cycles                   #    809.989 M/sec
> >   408883496292  instructions             #      0.070 IPC
> >     7057079452  cache-references         #      0.981 M/sec
> >     3036086243  cache-misses             #      0.422 M/sec
>
> > With this patch applied
> >
> > Performance counter stats for '/home/balbir/parallel_pagefault':
> >
> >  5957054385619  cycles                   #    828.333 M/sec
> >  1058117350365  instructions             #      0.178 IPC
> >     9161776218  cache-references         #      1.274 M/sec
> >     1920494280  cache-misses             #      0.267 M/sec
>
> Nice how the instruction count and the IPC value increased, and the
> cache-miss count decreased.
>
> Btw., a 'perf stat' suggestion: you can also make use of built-in
> error bars via repeating parallel_pagefault N times:
>
> aldebaran:~> perf stat --repeat 3 /bin/ls
>
> Performance counter stats for '/bin/ls' (3 runs):
>
>      1.108886  task-clock-msecs         #      0.875 CPUs    ( +-   4.316% )
>             0  context-switches         #      0.000 M/sec   ( +-   0.000% )
>             0  CPU-migrations           #      0.000 M/sec   ( +-   0.000% )
>           254  page-faults              #      0.229 M/sec   ( +-   0.000% )
>       3461896  cycles                   #   3121.958 M/sec   ( +-   3.508% )
>       3044445  instructions             #      0.879 IPC     ( +-   0.134% )
>         21213  cache-references         #     19.130 M/sec   ( +-   1.612% )
>          2610  cache-misses             #      2.354 M/sec   ( +-  39.640% )
>
> 0.001267355 seconds time elapsed ( +- 4.762% )
>
> that way even small changes in metrics can be identified as positive
> effects of a patch, if the improvement is beyond the error
> percentage that perf reports.
>
> For example, in the /bin/ls numbers I cited above, the 'instructions'
> value can be trusted up to 99.8% (with ~0.13% noise), while say
> the cache-misses value cannot really be trusted, as it has 40%
> noise. (Increasing the repeat count will drive down the noise level
> - at the cost of longer measurement time.)
>
> For your patch the improvement is so drastic that this isn't needed -
> but the error estimations can be quite useful for more borderline
> improvements. (And they are also useful in finding and proving small
> performance regressions.)
Thanks for the tip, let me try and use the repeats feature. BTW, nice
work on the perf counters, I was pleasantly surprised to see a
wonderful tool in the kernel with a good set of options and detailed
analysis capabilities.
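
For anyone following the thread, here is a minimal Python sketch of the
arithmetic discussed above. It recomputes the IPC figures from the cycle
and instruction counts quoted in this message, and checks whether a change
in a metric is larger than the noise that 'perf stat --repeat' reports.
The function names ipc and beyond_noise are made up for illustration, and
adding the two noise percentages together is a conservative assumption of
this sketch, not something perf does itself. The noise values passed in
the last two calls reuse the /bin/ls percentages purely as sample inputs;
the parallel_pagefault runs above were not repeated, so no noise figure
exists for them.

```python
def ipc(instructions, cycles):
    """Instructions per cycle, as shown in the 'IPC' column of perf stat."""
    return instructions / cycles


def beyond_noise(before, after, noise_pct_before, noise_pct_after):
    """Return True if the relative change between two measurements exceeds
    the combined noise percentages of the two runs.

    Assumption of this sketch: the two '+-' percentages are simply added,
    which is a conservative bound rather than a statistical test.
    """
    change_pct = abs(after - before) / before * 100.0
    return change_pct > noise_pct_before + noise_pct_after


# Counts quoted in this thread for parallel_pagefault.
ipc_without = ipc(408883496292, 5826093739340)
ipc_with = ipc(1058117350365, 5957054385619)
print(f"IPC without patch: {ipc_without:.3f}")
print(f"IPC with patch:    {ipc_with:.3f}")

# Sample inputs only: a 1% change against ~0.134% noise per run is
# detectable, a 10% change against ~39.640% noise per run is not.
print(beyond_noise(100.0, 101.0, 0.134, 0.134))
print(beyond_noise(100.0, 110.0, 39.640, 39.640))
```

With a higher --repeat count the reported percentages shrink, so smaller
changes would pass the same check.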
--
Balbir