Message-ID: <20090821112915.GA24647@elte.hu>
Date:	Fri, 21 Aug 2009 13:29:15 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	Anton Blanchard <anton@...ba.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	balbir@...ux.vnet.ibm.com,
	Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
	schwidefsky@...ibm.com, balajirrao@...il.com,
	dhaval@...ux.vnet.ibm.com, tglx@...utronix.de,
	akpm@...ux-foundation.org
Subject: Re: [PATCH] better align percpu counter (Was Re: [tip:sched/core]
	sched: cpuacct: Use bigger percpu counter batch values for stats
	counters


* KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:

> On Thu, 20 Aug 2009 12:04:03 +0200
> Ingo Molnar <mingo@...e.hu> wrote:
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:
> > > with your program
> > > before patch.
> > > cpuacct off : 414000-416000 ctsw per sec.
> > > cpuacct on  : 401000-404000 ctsw per sec.
> > > 
> > > after patch
> > > cpuacct on  : 412000-413000 ctsw per sec.
> > > 
> > > Maybe I should check cache-misses later ;)
> > 
> > Btw., in latest upstream you can do that via:
> > 
> >   cd tools/perf/
> >   make -j install
> > 
> >   perf stat --repeat 5 -- taskset -c 1 ./context_switch
> > 
> 
> Tried it (on an 8-cpu/2-socket host). It seems cache-misses 
> decrease. But IPC ..?

All the numbers have gone down - about the same number of cycles, but 
fewer instructions executed and fewer cache-misses. That's good.

The Instructions Per Cycle metric got worse because cycles stayed 
roughly constant while the instruction count dropped, so the ratio 
fell. One thing to note is that you have triggered counter 
over-commit (the 'scaled from' messages) - more counters are in use 
than the hardware has slots for, so we round-robin schedule them and 
extrapolate the counts.
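
( Purely to illustrate the arithmetic - this is not perf's actual 
  code and the numbers are made up: a multiplexed counter's raw value 
  gets extrapolated by the fraction of time it was actually live on 
  the PMU, and IPC is just instructions over cycles, so it drops when 
  instructions fall while cycles stay put: )

/* illustrative sketch only - not perf's source code */
#include <stdio.h>

/* extrapolate a raw count by the fraction of time the counter was
 * actually live ('scaled from 60%' => running/enabled = 0.6) */
static double scaled_count(double raw, double time_enabled,
			   double time_running)
{
	return raw * time_enabled / time_running;
}

int main(void)
{
	double cycles       = 1.00e9;	/* roughly constant    */
	double insns_before = 0.90e9;
	double insns_after  = 0.80e9;	/* fewer instructions  */

	/* same cycles, fewer instructions => lower IPC */
	printf("IPC before: %.2f\n", insns_before / cycles);
	printf("IPC after:  %.2f\n", insns_after / cycles);

	/* a counter live for 60% of the run gets scaled up to a
	 * full-run estimate */
	printf("estimated count: %.0f\n", scaled_count(0.48e9, 1.0, 0.6));
	return 0;
}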

If you want to get to the bottom of that and get the most precise 
result, try something like:

    perf stat --repeat 5 -a -e \
        cycles,instructions,L1-dcache-load-misses,L1-dcache-store-misses \
        -- ./ctxt_sw.sh

( this is almost the same as the command line you used, but without 
  the 'cache-misses' counter. Your CPU should be able to 
  simultaneously activate all these counters and they should count 
  100% of the events. )
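
( And since the ./context_switch program and the ctxt_sw.sh wrapper 
  themselves aren't in this thread: a common way to generate this 
  kind of load is a two-process pipe ping-pong. The stand-in below is 
  only a sketch along those lines, not the benchmark that was 
  actually run: )

/* Minimal, illustrative context-switch ping-pong - not the actual
 * ./context_switch program used in this thread.  Two processes bounce
 * a byte over a pair of pipes; with both tasks pinned to one CPU
 * (e.g. taskset -c 1) each round trip forces two context switches. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
	int ping[2], pong[2];
	char c = 0;
	long i, iterations = 1000000;

	if (pipe(ping) || pipe(pong)) {
		perror("pipe");
		exit(1);
	}

	if (fork() == 0) {
		/* child: echo every byte back until the pipe is closed */
		close(ping[1]);
		close(pong[0]);
		while (read(ping[0], &c, 1) == 1)
			write(pong[1], &c, 1);
		exit(0);
	}

	/* parent: drive the ping-pong */
	close(ping[0]);
	close(pong[1]);
	for (i = 0; i < iterations; i++) {
		if (write(ping[1], &c, 1) != 1 || read(pong[0], &c, 1) != 1)
			break;
	}
	close(ping[1]);		/* child sees EOF and exits */
	wait(NULL);
	return 0;
}

Pinned to a single CPU, the ~1 million round trips above generate 
roughly the kind of ctsw/sec load being measured in this thread.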

	Ingo