linux-kernel - Re: Shift by one instruction in the perf annotate output

Message-ID: <20120127102741.GA31782@elte.hu>
Date:	Fri, 27 Jan 2012 11:27:41 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Lénaïc Huard <lenaic@...ard.fr.eu.org>,
	Paul Mackerras <paulus@...ba.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	linux-kernel@...r.kernel.org
Subject: Re: Shift by one instruction in the perf annotate output


* Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:

> > I am running Linux and perf 3.2 but I remember that previous 
> > versions suffered from the same issue.
> > 
> > I don’t know if it could be specific to my cpu:
> > processor       : 0
> > vendor_id       : GenuineIntel
> > cpu family      : 6
> > model           : 15
> > model name      : Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz 
> 
> And sadly it's the best you'll get on your machine, most Intel 
> chips after that (including the core2 shrink, but excluding 
> the latest core i7 SNB) can do better using a feature called 
> PEBS.

Which can be activated on those CPUs using the '-e cycles:pp' 
option (the first 'p' stands for 'precise', the second 'p' for 
'very precise' ;-).
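
As a rough illustration of how the modifier composes, here is a minimal
Python sketch. The helper names precise_event and record_argv are
hypothetical (they are not part of perf), and the sketch only covers the
two levels mentioned above; whether a given level actually works depends
on the CPU and kernel:

```python
def precise_event(event: str, level: int) -> str:
    """Append one 'p' per requested precision level to a perf event name.

    Hypothetical helper: level 0 means no modifier, 1 -> ':p', 2 -> ':pp'.
    """
    if level not in (0, 1, 2):
        raise ValueError("this sketch only handles levels 0, 1 and 2")
    return event if level == 0 else f"{event}:{'p' * level}"


def record_argv(event: str, level: int, seconds: int) -> list[str]:
    """Build an argv for a system-wide 'perf record' run of the given length.

    The list can be handed to subprocess.run() on a machine with perf installed.
    """
    return ["perf", "record", "-e", precise_event(event, level),
            "-a", "--", "sleep", str(seconds)]


if __name__ == "__main__":
    # Print the command line rather than running it, so this works anywhere.
    print(" ".join(record_argv("cycles", 2, 10)))
```

After recording, 'perf annotate do_raw_spin_lock' would show per-instruction
output of the kind pasted below.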

In that case some rather non-obvious perf magic is activated (we 
use PEBS for precise samples and use the LBR hardware to rewind 
the IP), due to which annotation output looks like this:

         :        ffffffff810a6f51 <do_raw_spin_lock>:
    1.77 :        ffffffff810a6f51:       mov    $0x10000,%eax
   44.95 :        ffffffff810a6f56:       lock xadd %eax,(%rdi)
    1.25 :        ffffffff810a6f5a:       mov    %eax,%edx
    0.29 :        ffffffff810a6f5c:       shr    $0x10,%edx
    1.21 :        ffffffff810a6f5f:       cmp    %dx,%ax
    0.01 :        ffffffff810a6f62:       je     ffffffff810a6f6b <do_raw_spin_lock+0x1a>
   29.81 :        ffffffff810a6f64:       pause
   16.45 :        ffffffff810a6f66:       mov    (%rdi),%ax
    4.27 :        ffffffff810a6f69:       jmp    ffffffff810a6f5f <do_raw_spin_lock+0xe>
    0.00 :        ffffffff810a6f6b:       retq

the entries are both precise and show up in the right place.
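
To check at a glance that the samples land on the expected instructions,
the annotate text can be ranked by percentage. A minimal Python sketch,
assuming the "percent : address: instruction" line layout shown above;
parse_annotate and hottest are hypothetical helper names, and SAMPLE is
just a few lines copied from that output:

```python
import re

# percent : hex address: instruction text
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s*:\s*([0-9a-f]+):\s*(.+?)\s*$")

SAMPLE = """\
         :        ffffffff810a6f51 <do_raw_spin_lock>:
    1.77 :        ffffffff810a6f51:       mov    $0x10000,%eax
   44.95 :        ffffffff810a6f56:       lock xadd %eax,(%rdi)
   29.81 :        ffffffff810a6f64:       pause
   16.45 :        ffffffff810a6f66:       mov    (%rdi),%ax
"""


def parse_annotate(text):
    """Return (percent, address, instruction) tuples for each sampled line."""
    rows = []
    for line in text.splitlines():
        # Drop trailing padding, including the terminal filler character.
        m = LINE_RE.match(line.rstrip(" \u2592"))
        if m:  # the symbol header line has no percentage and is skipped
            insn = " ".join(m.group(3).split())  # collapse column padding
            rows.append((float(m.group(1)), int(m.group(2), 16), insn))
    return rows


def hottest(rows, n=3):
    """Return the n rows with the highest sample percentage."""
    return sorted(rows, key=lambda r: r[0], reverse=True)[:n]


if __name__ == "__main__":
    for pct, addr, insn in hottest(parse_annotate(SAMPLE)):
        print(f"{pct:6.2f}  {addr:x}  {insn}")
```

With the sample above, the top entries are the 'lock xadd' and the
'pause'/'mov' spin loop, which is where a contended spinlock is expected
to spend its time.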

On Core2 CPUs there's PEBS, so 'p' will work, but there's no LBR, 
so the IP rewinding does not work.

Thanks,

	Ingo