linux-kernel - Re: [RFCv2 0/8] perf tool: Add new event group management

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120526192311.GM27374@one.firstfloor.org>
Date:	Sat, 26 May 2012 21:23:11 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Jiri Olsa <jolsa@...hat.com>
Cc:	Andi Kleen <andi@...stfloor.org>, acme@...hat.com,
	a.p.zijlstra@...llo.nl, mingo@...e.hu, paulus@...ba.org,
	cjashfor@...ux.vnet.ibm.com, fweisbec@...il.com,
	linux-kernel@...r.kernel.org, tglx@...utronix.de
Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sat, May 26, 2012 at 02:38:58PM +0200, Jiri Olsa wrote:
> The startup patches just got in recently
> http://marc.info/?l=linux-kernel&m=133758460912306&w=2
> 
> so I'll continue on this shortly.. 

Great.

> If you have some ideas on this or real world examples,

Any of the proposed syntaxes looked fine for me. The important
part is that it works in some form.

> that would really help.. so far, here's the latest discussion:
> http://marc.info/?t=133357436900005&r=1&w=2

For example you want to measure sandy bridge frontend contention in a 
more useful way than the dubious event in standard perf.

The formula for this is 

N = 4*CPU_CLK_UNHALTED.THREAD           (4 execution slots) 
Percent_FE_bound = 100*(IDQ_UOPS_NOT_DELIVERED.CORE / N)

Translated into perf this is 

-e r53003c -e r53019c

and some glue to compute the formula:

#!/usr/bin/python
import sys

cyc, e1 = sys.stdin.readline().split(",")
uops, e2 = sys.stdin.readline().split(",")

N = 4 * float(cyc) 
P_FE = 100.0 * (float(uops) / N)
print "percent frontend bound: %.2f" % (P_FE)

perf stat -x, -e r53003c -e r53019c /bin/ls 2>log
./frontend.py < log
percent frontend bound: 41.53

My /bin/ls is 42% frontend bound.

Now you see we always have to measure the CPU_CLK_UNHALTED and 
IDQ_UOPS_NOT_DELIVERED.CORE together. Otherwise there is no useful output
from the formula.

The problem happens when we want to measure other things too. You tend
to quickly run out of 4 counters per CPU thread, so have to multiplex.
And that is where the groups are needed. Without the groups we have
to do multiple runs, instead of one that measures this all time sliced.

This is pretty common with all kinds of measurements.

-Andi

-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/