linux-kernel - Re: [generalized cache events] Re: [PATCH 1/1] perf tools: Add missing user space support for config1/config2


Message-ID: <20110422165007.GA18401@vps.sharma-home.net>
Date:	Fri, 22 Apr 2011 09:50:07 -0700
From:	arun@...rma-home.net
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Stephane Eranian <eranian@...gle.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	linux-kernel@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Lin Ming <ming.m.lin@...el.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, eranian@...il.com,
	Arun Sharma <asharma@...com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [generalized cache events] Re: [PATCH 1/1] perf tools: Add
 missing user space support for config1/config2

On Fri, Apr 22, 2011 at 12:52:11PM +0200, Ingo Molnar wrote:
> 
> Using the generalized cache events i can run:
> 
>  $ perf stat --repeat 10 -e cycles:u -e instructions:u -e l1-dcache-loads:u -e l1-dcache-load-misses:u ./array
> 
>  Performance counter stats for './array' (10 runs):
> 
>          6,719,130 cycles:u                   ( +-   0.662% )
>          5,084,792 instructions:u           #      0.757 IPC     ( +-   0.000% )
>          1,037,032 l1-dcache-loads:u          ( +-   0.009% )
>          1,003,604 l1-dcache-load-misses:u    ( +-   0.003% )
> 
>         0.003802098  seconds time elapsed   ( +-  13.395% )
> 
> I consider that this is 'bad', because for almost every dcache-load there's a 
> dcache-miss - a 99% L1 cache miss rate!

One could argue that all you need is cycles and instructions. If there is an
expensive load, you'll see that the load instruction takes many cycles and
you can infer that it's a cache miss.
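
For reference, the derived ratios in the quoted output can be recomputed from
the raw counts. The snippet below is only a sketch: the counter values are
copied from the perf stat output above, and the helper name `ratio` is made up
for illustration. With these numbers the IPC matches the 0.757 that perf
prints, and the L1 dcache load miss rate works out to roughly 97%.

```python
# Counter values copied from the quoted `perf stat` output (user-space only).
cycles = 6_719_130
instructions = 5_084_792
l1d_loads = 1_037_032
l1d_load_misses = 1_003_604


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 if the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Instructions retired per cycle, the figure perf prints as "IPC".
ipc = ratio(instructions, cycles)

# Fraction of L1 dcache loads that missed.
l1d_miss_ratio = ratio(l1d_load_misses, l1d_loads)

print(f"IPC:                {ipc:.3f}")
print(f"L1D load miss rate: {l1d_miss_ratio:.1%}")
```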

Questions app developers typically ask me:

* If I fix all of my top 5 L3 misses, how much faster will my app go?
* Am I bottlenecked on memory bandwidth?
* I have 4 L3 misses every 1000 instructions and 15 branch mispredicts per
  1000 instructions. Which one should I focus on?
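
A back-of-the-envelope way to approach the last question is to weight each
event rate by a penalty in cycles. The sketch below does that with the rates
from the question. The penalty values (200 cycles per L3 miss, 15 cycles per
branch mispredict) are assumptions picked purely for illustration, not
measurements of any particular CPU, and the `EventRate` class is hypothetical.
The result is an upper bound at best, since out-of-order execution and
overlapping misses hide part of the cost.

```python
from dataclasses import dataclass


@dataclass
class EventRate:
    name: str
    per_kilo_instructions: float   # events per 1000 instructions
    assumed_penalty_cycles: float  # ASSUMPTION: illustrative, not measured

    def stall_cycles_per_kilo_instructions(self) -> float:
        """Naive cost estimate: event rate times per-event penalty."""
        return self.per_kilo_instructions * self.assumed_penalty_cycles


# Rates from the question; penalties are hypothetical placeholders.
events = [
    EventRate("L3 miss", per_kilo_instructions=4, assumed_penalty_cycles=200),
    EventRate("branch mispredict", per_kilo_instructions=15, assumed_penalty_cycles=15),
]

# Rank events by estimated stall cycles, most expensive first.
ranked = sorted(
    events,
    key=EventRate.stall_cycles_per_kilo_instructions,
    reverse=True,
)

for event in ranked:
    cost = event.stall_cycles_per_kilo_instructions()
    print(f"{event.name}: ~{cost:.0f} stall cycles per 1000 instructions")
```

Under these assumed penalties the L3 misses dominate even though they are the
rarer event. Different penalty values can flip the ranking, so the estimate is
only as good as the per-event costs fed into it.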

It's hard to answer some of these without access to all events.
While your approach of having generic events for commonly used counters
might be useful for some use cases, I don't see why exposing all
vendor-defined events is harmful.

A clear statement on the last point would be helpful.

 -Arun