Message-Id: <1437233094-12844-1-git-send-email-andi@firstfloor.org>
Date: Sat, 18 Jul 2015 08:24:45 -0700
From: Andi Kleen <andi@...stfloor.org>
To: acme@...nel.org
Cc: jolsa@...nel.org, linux-kernel@...r.kernel.org, namhyung@...nel.org
Subject: Cycles annotation support for perf tools v3
[v2: Addressed review comments. Fixed display problems and
correctly compute IPC now. See patches for detailed changes.]
[v3: Merged with Arnaldo's current perf/core and added Acked-by tags.]
[Note: the respective kernel patches to report cycles are in
peterz's perf/core queue, but so far not in tip. The patchkit
can still be tested with the "fake cycles" debug patch added at
the end.]
The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record (LBR).
This gives fine-grained cost information for code and also makes it
possible to compute fine-grained IPC.
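As a rough illustration of the arithmetic (not the code in the patches), the sketch below derives basic blocks from consecutive branch entries and divides the instruction count of each block by its cycle count. The `BranchEntry` layout, the oldest-first ordering, and the function names are assumptions made for this example only. The instruction addresses are assumed to come from disassembly, since the branch record itself only reports cycles.

```python
from dataclasses import dataclass


@dataclass
class BranchEntry:
    """One timed branch record (hypothetical layout, for illustration)."""
    from_addr: int  # address of the branch instruction
    to_addr: int    # branch target
    cycles: int     # cycles elapsed since the previous recorded branch


def blocks_from_lbr(entries):
    """Yield (start, end, cycles) for blocks between consecutive branches.

    entries is assumed to be ordered oldest first. The block executed
    before branch i starts at the target of branch i-1 and ends at the
    source of branch i; its cost is the cycle count reported for branch i.
    """
    for prev, cur in zip(entries, entries[1:]):
        if prev.to_addr <= cur.from_addr:
            yield prev.to_addr, cur.from_addr, cur.cycles


def block_ipc(insn_addrs, block_start, block_end, cycles):
    """Return instructions per cycle for [block_start, block_end] inclusive.

    insn_addrs holds the instruction start addresses from disassembly.
    Returns None when no cycle information is available.
    """
    if cycles <= 0:
        return None
    n_insns = sum(1 for a in insn_addrs if block_start <= a <= block_end)
    return n_insns / cycles


if __name__ == "__main__":
    # Made-up addresses and cycle counts.
    insns = [0x10, 0x14, 0x18, 0x1C, 0x20]
    lbr = [BranchEntry(0x100, 0x10, 5), BranchEntry(0x1C, 0x200, 2)]
    for start, end, cycles in blocks_from_lbr(lbr):
        print(hex(start), hex(end), block_ipc(insns, start, end, cycles))
```

With the made-up data above, the single block spans four instructions and costs two cycles, giving an IPC of 2.0.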
Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools3
This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D
Example output for annotate (with made-up numbers).
The second column is the IPC and the third is the average cycles for the basic block.
│ static int hex(char ch) ▒
│ { ▒
0.12 │ push %rbp ◆
0.12 │ mov %rsp,%rbp ▒
0.12 │ sub $0x20,%rsp ▒
0.12 │ mov %edi,%eax ▒
0.12 │ mov %al,-0x14(%rbp) ▒
0.12 │ mov %fs:0x28,%rax ▒
0.12 │ mov %rax,-0x8(%rbp) ▒
0.12 │ xor %eax,%eax ▒
│ if ((ch >= '0') && (ch <= '9')) ▒
0.12 │ cmpb $0x2f,-0x14(%rbp) ▒
66.67 0.12 123 │ ↓ jle 31 ▒
0.12 │ cmpb $0x39,-0x14(%rbp) ▒
0.12 123 │ ↓ jg 31 ▒
│ return ch - '0'; ▒
22.22 0.12 │ movsbl -0x14(%rbp),%eax ▒
0.12 │ sub $0x30,%eax ▒
0.12 123 │ ↓ jmp 60 ▒
│ if ((ch >= 'a') && (ch <= 'f')) ▒
0.06 │31: cmpb $0x60,-0x14(%rbp) ▒
0.06 123 │ ↓ jle 46 ▒
0.06 │ cmpb $0x66,-0x14(%rbp) ▒
0.06 │ ↓ jg 46 ▒
│ return ch - 'a' + 10; ▒
0.06 │ movsbl -0x14(%rbp),%eax
Example output for branch view (again with fake data):
Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles  ◆
  30.08%  tcall    tcall                 [.] f1                 [.] f2                 123                 ▒
  27.44%  tcall    tcall                 [.] f2                 [.] f1                 123                 ▒
  15.60%  tcall    tcall                 [.] main               [.] f1                 123                 ▒
  12.96%  tcall    tcall                 [.] f1                 [.] main               123                 ▒
  12.86%  tcall    tcall                 [.] main               [.] main               123                 ▒
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123
IPC computation has a few limitations (see the comments in the respective patches);
in particular, it punts on overlapping basic blocks.
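To make the overlap case concrete, here is one possible way to detect overlapping blocks and leave them out of an IPC calculation. This is a sketch for illustration only; the patches may handle overlaps differently (see their comments), and the `(start, end, cycles)` tuple layout and function names are assumptions.

```python
def overlaps(a, b):
    """True if inclusive address ranges a=(start, end) and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def non_overlapping(blocks):
    """Return the (start, end, cycles) blocks that overlap no other block.

    Identical (start, end) ranges are treated as repeated samples of the
    same block, not as an overlap. The pairwise check is quadratic, which
    is fine for a sketch but not for large sample sets.
    """
    ranges = {(s, e) for s, e, _ in blocks}
    bad = {r for r in ranges for o in ranges if r != o and overlaps(r, o)}
    return [b for b in blocks if (b[0], b[1]) not in bad]


if __name__ == "__main__":
    # Made-up data: the first two blocks end at the same branch but start
    # at different targets, so they overlap and are both dropped.
    samples = [
        (0x10, 0x1C, 4),
        (0x14, 0x1C, 3),
        (0x30, 0x3C, 5),
        (0x30, 0x3C, 6),
    ]
    for start, end, cycles in non_overlapping(samples):
        print(hex(start), hex(end), cycles)
```

Overlaps like the first two samples arise when the same branch is reached from different jump targets, so the instruction count for the shared tail cannot be attributed to a single cycle count.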
The cycles annotation only works in the interactive annotate view. It currently
does not work in the scripted perf annotate, as that is missing a lot of the
infrastructure needed for per-instruction state.
It would be nice to add column headers to annotate.
So far no support in --branch-history or in perf script.