Message-Id: <1432749114-904-1-git-send-email-andi@firstfloor.org>
Date:	Wed, 27 May 2015 10:51:43 -0700
From:	Andi Kleen <andi@...stfloor.org>
To:	acme@...nel.org
Cc:	jolsa@...hat.com, namhyung@...nel.org, eranian@...gle.com,
	linux-kernel@...r.kernel.org
Subject: Cycles annotation support for perf tools v2

[v2: Addressed review comments. Fixed display problems and 
correctly compute IPC now. See patches for detailed changes.]

The upcoming Skylake CPU has a new timed branch stack feature
that reports cycle counts for individual branches in the
last branch record.

This makes it possible to get fine-grained cost information for code
and to compute fine-grained IPC.
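
As a rough illustration of the IPC arithmetic (a sketch, not the code in
the patches): assume each basic block accumulates the cycle counts reported
for it and the number of times it was sampled, and that the instruction
count comes from annotation. IPC is then the instruction count divided by
the average cycles. The names block_stats, block_avg_cycles and block_ipc
are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-basic-block accumulator; not a perf tools structure. */
struct block_stats {
	uint64_t cycles_sum;	/* sum of cycle counts over all samples of the block */
	uint64_t samples;	/* number of times the block was sampled */
	unsigned int insns;	/* instructions in the block, counted from annotation */
};

/* Average cycles per sampled execution of the block, 0 if never sampled. */
static double block_avg_cycles(const struct block_stats *b)
{
	if (b->samples == 0)
		return 0.0;
	return (double)b->cycles_sum / (double)b->samples;
}

/* IPC = instructions in the block / average cycles for the block. */
static double block_ipc(const struct block_stats *b)
{
	double avg = block_avg_cycles(b);

	if (avg == 0.0)
		return 0.0;
	return (double)b->insns / avg;
}
```

For example, a block of 15 instructions sampled twice with 246 cycles in
total averages 123 cycles, for an IPC of about 0.12.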

Available from
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skl-tools2

This patchkit adds support for this in the perf tools:
- Basic support for the cycles field like other branch fields
- Show cycles in the standard branch sort view (no IPC here,
  as IPC needs the instruction counts from annotation)
- Annotate cycles and IPC in the assembler annotate view
- Add branch support to top, so we can do live annotation.
- Misc support, like dumping it in perf report -D

The kernel support has been posted separately. I included a test patch
to generate fake data for testing on existing systems.

Example output for annotate (with made-up numbers):

The second column is the IPC and the third is the average cycles for the basic block.

                   │    static int hex(char ch)
                   │    {
        0.12       │      push   %rbp
        0.12       │      mov    %rsp,%rbp
        0.12       │      sub    $0x20,%rsp
        0.12       │      mov    %edi,%eax
        0.12       │      mov    %al,-0x14(%rbp)
        0.12       │      mov    %fs:0x28,%rax
        0.12       │      mov    %rax,-0x8(%rbp)
        0.12       │      xor    %eax,%eax
                   │            if ((ch >= '0') && (ch <= '9'))
        0.12       │      cmpb   $0x2f,-0x14(%rbp)
 66.67  0.12   123 │    ↓ jle    31
        0.12       │      cmpb   $0x39,-0x14(%rbp)
        0.12   123 │    ↓ jg     31
                   │                    return ch - '0';
 22.22  0.12       │      movsbl -0x14(%rbp),%eax
        0.12       │      sub    $0x30,%eax
        0.12   123 │    ↓ jmp    60
                   │            if ((ch >= 'a') && (ch <= 'f'))
        0.06       │31:   cmpb   $0x60,-0x14(%rbp)
        0.06   123 │    ↓ jle    46
        0.06       │      cmpb   $0x66,-0x14(%rbp)
        0.06       │    ↓ jg     46
                   │                    return ch - 'a' + 10;
        0.06       │      movsbl -0x14(%rbp),%eax

Example output for branch view (again with fake data):

Overhead  Command  Source Shared Object  Source Symbol                               Target Symbol                               Basic Block Cycles
  30.08%  tcall    tcall                 [.] f1                                      [.] f2                                      123
  27.44%  tcall    tcall                 [.] f2                                      [.] f1                                      123
  15.60%  tcall    tcall                 [.] main                                    [.] f1                                      123
  12.96%  tcall    tcall                 [.] f1                                      [.] main                                    123
  12.86%  tcall    tcall                 [.] main                                    [.] main                                    123
   0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt                       [k] hrtimer_interrupt                       123

IPC computation has a few limitations (see the comments in the respective patches).
In particular, it punts on overlapping basic blocks.
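
To picture what overlapping means here (a sketch under assumptions, not
the logic in the patches): if a basic block is taken to run from the target
of one branch to the next taken branch, then two jumps into different points
of the same straight-line code produce two blocks that share a tail. The
struct basic_block and blocks_overlap names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical basic block: runs from a branch target to the next taken branch. */
struct basic_block {
	uint64_t start;	/* address of the first instruction (previous branch target) */
	uint64_t end;	/* address of the terminating branch instruction */
};

/* True if the two inclusive address ranges share at least one address. */
static bool blocks_overlap(const struct basic_block *a,
			   const struct basic_block *b)
{
	return a->start <= b->end && b->start <= a->end;
}
```

When two blocks overlap like this, the instructions in the shared range
belong to both, so a single per-instruction IPC value is ambiguous.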

The cycles annotation only works in the interactive annotate view. It does not
currently work in the scripted perf annotate, as that is missing a lot of the
infrastructure needed for per-instruction state.

It would be nice to add column headers to annotate.

So far there is no support in --branch-history or in perf script.
