linux-kernel - Re: [UPDATED][PATCH][mmotm] Help Root Memory Cgroup Resource Counters Scale Better (v5)


Message-ID: <20090813083524.GC21389@elte.hu>
Date:	Thu, 13 Aug 2009 10:35:24 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Balbir Singh <balbir@...ux.vnet.ibm.com>
Cc:	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"lizf@...fujitsu.com" <lizf@...fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	"menage@...gle.com" <menage@...gle.com>, xemul@...nvz.org,
	prarit@...hat.com, andi.kleen@...el.com,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [UPDATED][PATCH][mmotm] Help Root Memory Cgroup Resource
	Counters Scale Better (v5)


* Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:

> Without Patch
> 
>  Performance counter stats for '/home/balbir/parallel_pagefault':
> 
>   5826093739340  cycles                   #    809.989 M/sec
>    408883496292  instructions             #      0.070 IPC
>      7057079452  cache-references         #      0.981 M/sec
>      3036086243  cache-misses             #      0.422 M/sec

> With this patch applied
> 
>  Performance counter stats for '/home/balbir/parallel_pagefault':
> 
>   5957054385619  cycles                   #    828.333 M/sec
>   1058117350365  instructions             #      0.178 IPC
>      9161776218  cache-references         #      1.274 M/sec
>      1920494280  cache-misses             #      0.267 M/sec

Nice how the instruction count and the IPC value increased, and the 
cache-miss count decreased.
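
As a rough cross-check of the quoted numbers, the derived figures can be 
recomputed from the raw counters. This is only a sketch: the dictionary 
and function names below are made up for illustration, and the counter 
values are copied from the two quoted 'perf stat' runs.

```python
# Counter values copied from the quoted 'perf stat' output.
# All names here (without_patch, with_patch, ipc, ...) are illustrative only.
without_patch = {
    "cycles": 5826093739340,
    "instructions": 408883496292,
    "cache-references": 7057079452,
    "cache-misses": 3036086243,
}
with_patch = {
    "cycles": 5957054385619,
    "instructions": 1058117350365,
    "cache-references": 9161776218,
    "cache-misses": 1920494280,
}


def ipc(stats):
    """Instructions per cycle, the ratio perf prints in the IPC column."""
    return stats["instructions"] / stats["cycles"]


def relative_change(before, after):
    """Fractional change from 'before' to 'after' (negative means a drop)."""
    return (after - before) / before


if __name__ == "__main__":
    print("IPC without patch: %.3f" % ipc(without_patch))
    print("IPC with patch:    %.3f" % ipc(with_patch))
    for event in ("instructions", "cache-misses"):
        change = relative_change(without_patch[event], with_patch[event])
        print("%-14s change: %+.1f%%" % (event, change * 100.0))
```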

Btw., a 'perf stat' suggestion: you can also make use of built-in 
error bars via repeating parallel_pagefault N times:

  aldebaran:~> perf stat --repeat 3 /bin/ls

 Performance counter stats for '/bin/ls' (3 runs):

       1.108886  task-clock-msecs         #      0.875 CPUs    ( +-   4.316% )
              0  context-switches         #      0.000 M/sec   ( +-   0.000% )
              0  CPU-migrations           #      0.000 M/sec   ( +-   0.000% )
            254  page-faults              #      0.229 M/sec   ( +-   0.000% )
        3461896  cycles                   #   3121.958 M/sec   ( +-   3.508% )
        3044445  instructions             #      0.879 IPC     ( +-   0.134% )
          21213  cache-references         #     19.130 M/sec   ( +-   1.612% )
           2610  cache-misses             #      2.354 M/sec   ( +-  39.640% )

    0.001267355  seconds time elapsed   ( +-   4.762% )

that way even small changes in metrics can be identified as positive 
effects of a patch, if the improvement is beyond the error 
percentage that perf reports.

For example, in the /bin/ls numbers I cited above, the 'instructions' 
value can be trusted up to 99.8% (with ~0.13% noise), while, say, 
the cache-misses value cannot really be trusted, as it has about 40% 
noise. (Increasing the repeat count will drive down the noise level 
- at the cost of longer measurement time.)
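
The "beyond the error percentage" rule can be sketched as a small check. 
The criterion below (the change must exceed the sum of both runs' noise 
percentages) is a deliberately conservative assumption of mine, not 
something perf computes, and the function name is hypothetical. The 
baseline values and noise figures come from the /bin/ls run; the 1% 
improvement is a made-up example, not a measurement.

```python
def exceeds_noise(before, after, noise_before_pct, noise_after_pct):
    """Return True if the change between two measurements is larger than
    the combined noise perf reported for them.

    Assumption: summing the two '+-' percentages is used as a conservative
    threshold; this is an illustrative rule, not perf's own statistics.
    """
    change_pct = abs(after - before) / before * 100.0
    return change_pct > noise_before_pct + noise_after_pct


if __name__ == "__main__":
    # Baselines and noise levels from the /bin/ls example; the "after"
    # values are a hypothetical 1% improvement with the same noise.
    instructions, instructions_noise = 3044445, 0.134
    cache_misses, cache_misses_noise = 2610, 39.640

    print("instructions:",
          exceeds_noise(instructions, instructions * 0.99,
                        instructions_noise, instructions_noise))
    print("cache-misses:",
          exceeds_noise(cache_misses, cache_misses * 0.99,
                        cache_misses_noise, cache_misses_noise))
```

With these inputs a 1% change in 'instructions' stands out from the 
noise, while the same change in 'cache-misses' does not.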

For your patch the improvement is so drastic that this isn't needed - 
but the error estimations can be quite useful for more borderline 
improvements. (and they are also useful in finding and proving small 
performance regressions)

	Ingo