Message-ID: <5588FCCE.8090403@intel.com>
Date:	Tue, 23 Jun 2015 09:29:34 +0300
From:	Adrian Hunter <adrian.hunter@...el.com>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>
CC:	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
	Jiri Olsa <jolsa@...hat.com>
Subject: Re: [PATCH V6 08/17] perf tools: Add Intel PT support

On 23/06/15 02:00, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jun 22, 2015 at 11:26:34PM +0300, Adrian Hunter escreveu:
>> On 22/06/2015 9:24 p.m., Arnaldo Carvalho de Melo wrote:
>>> Em Fri, Jun 19, 2015 at 04:41:56PM -0300, Arnaldo Carvalho de Melo escreveu:
>>>> Em Fri, Jun 19, 2015 at 10:33:43PM +0300, Adrian Hunter escreveu:
>>>>> On 19/06/2015 7:04 p.m., Arnaldo Carvalho de Melo wrote:
>>>>>> Em Fri, May 29, 2015 at 04:33:36PM +0300, Adrian Hunter escreveu:
>>>>>>> Add support for Intel Processor Trace.
>>>>
>>>>>>> Intel PT support fits within the new auxtrace infrastructure.
>>>>>>> Recording is supported by identifying the Intel PT PMU,
>>>>>>> parsing options and setting up events.  Decoding is supported
>>>>>>> by queuing up trace data by cpu or thread and then decoding
>>>>>>> synchronously delivering synthesized event samples into the
>>>>>>> session processing for tools to consume.
>>>>
>>>>>> So, at this point what commands should I use to test this? I expected to
>>>>>> be able to have some command here, in this changeset log, telling me
>>>>>> that what has been applied so far + this "Add Intel PT support", can be
>>>>>> used in such and such a fashion, obtaining this and that output.
>>>>
>>>>>> Now I'll go back and look at the cover letter to see what I can do at
>>>>>> this point and with access to a Broadwell class machine.
>>>>
>>>>> Actually you need the next patch "perf tools: Take Intel PT into use" to do anything.
>>>>
>>>> Yeah, saw that, the title of this patch fooled me into thinking that
>>>> Intel PT support was added :-)
>>>>
>>>> Anyway, stopping for a moment to push stuff ready to Ingo, will get back
>>>> to this after that.
>>>
>>> So, got back to it, added that "take it into use" patch and now trying
>>> to follow that documentation:
>>>
>>> [root@...f4 ~]# perf evlist
>>> intel_pt//u
>>> sched:sched_switch
>>> dummy:u
>>> [root@...f4 ~]# perf report
>>> [root@...f4 ~]#  perf record -e intel_pt//u -a sleep 10
>>> [ perf record: Woken up 1 times to write data ]
>>> [ perf record: Captured and wrote 0.379 MB perf.data ]
>>> [root@...f4 ~]#
>>> [root@...f4 ~]#
>>> [root@...f4 ~]# perf report
>>> [root@...f4 ~]# perf evlist
>>> intel_pt//u
>>> sched:sched_switch
>>> dummy:u
>>> [root@...f4 ~]# uname -r
>>> 4.1.0-rc8
>>> [root@...f4 ~]#
>>>
>>> I am not getting any "intel_pt//u" event, ideas?
>>
>> Events are synthesized by the decoder.  You should see 'instructions:u' events.
>>
>> What does perf report --stdio give?
> 
> Well, away from a Broadwell machine now, applied a few patches more and
> I'm now trying BTS, on this Ivy Bridge notebook (MacBook Air):
> 
> [    0.000000] DMI: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, BIOS MBA51.88Z.00EF.B02.1211271028 11/27/2012
> 
> [    0.116644] perf_event_intel: PMU erratum BJ122, BV98, HSD29 worked around, HT is on
> 
> [    0.061626] TSC deadline timer enabled
> [    0.061630] smpboot: CPU0: Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz (fam: 06, model: 3a, stepping: 09)
> [    0.061661] Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, full-width counters, Intel PMU driver.
> [    0.061685] ... version:                3
> [    0.061686] ... bit width:              48
> [    0.061687] ... generic registers:      4
> [    0.061688] ... value mask:             0000ffffffffffff
> [    0.061690] ... max period:             0000ffffffffffff
> [    0.061691] ... fixed-purpose events:   3
> [    0.061692] ... event mask:             000000070000000f
> [    0.062587] x86: Booting SMP configuration:
> [    0.062589] .... node  #0, CPUs:      #1
> [    0.074078] microcode: CPU1 microcode updated early to revision 0x1b, date = 2014-05-29
> [    0.076715] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
> [    0.076825]  #2 #3
> [    0.104559] x86: Booted up 1 node, 4 CPUs
> [    0.104563] smpboot: Total of 4 processors activated (19953.49 BogoMIPS)
> 
> [root@zoo ~]# perf record -e intel_bts//u --per-thread  sleep 5
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.531 MB perf.data ]
> [root@zoo ~]# perf evlist
> intel_bts//u
> dummy:u
> [root@zoo ~]#
> 
> [root@zoo ~]# perf report --stdio | head -40
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 0  of event 'intel_bts//u'
> # Event count (approx.): 0
> #
> # Overhead  Command  Shared Object  Symbol
> # ........  .......  .............  ......
> #
> 
> 
> # Samples: 0  of event 'dummy:u'
> # Event count (approx.): 0
> #
> # Overhead  Command  Shared Object  Symbol
> # ........  .......  .............  ......
> #
> 
> 
> # Samples: 22K of event 'branches:u'
> # Event count (approx.): 22548
> #
> # Overhead  Command  Shared Object     Symbol                
> # ........  .......  ................  ......................
> #
>      9.82%  sleep    [unknown]         [.] 0x00007f8e86c09061
>      8.28%  sleep    [unknown]         [.] 0x00007f8e86ea4e7e
>      6.97%  sleep    [unknown]         [.] 0x00007f8e86c09086
>      5.88%  sleep    [unknown]         [.] 0x00007f8e86e95726
>      5.28%  sleep    [unknown]         [.] 0x00007f8e86e9730d
>      4.69%  sleep    [unknown]         [.] 0x00007f8e86b06bc1
>      4.48%  sleep    [unknown]         [.] 0x00007f8e86c090c7
>      4.01%  sleep    [unknown]         [.] 0x00007f8e86c09027
>      4.01%  sleep    [unknown]         [.] 0x00007f8e86c0904c
>      2.84%  sleep    [unknown]         [.] 0x00007f8e86c0908c
>      2.74%  sleep    [unknown]         [.] 0x00007f8e86c0909d
>      2.74%  sleep    [unknown]         [.] 0x00007f8e86c09037
>      1.20%  sleep    [unknown]         [.] 0x00007f8e86af9c68
> 
> -----------------------------------------------------------------
> 
> Ok, so it synthesized the branches:u events. They don't appear in the
> 'perf evlist' output, which is the command I use to see what kinds of events
> a given perf.data file contains. We should probably add a note there that
> branches:u will be synthesized at 'report' time. Trying with 'perf script':
> 
> [root@zoo ~]# perf script | head -10
>   :8676  8676  1 branches:u: ffffffff81799b47 [unknown] ([unknown]) => 7f8e86e8bcf0 [unknown] ([unknown])
>   :8676  8676  1 branches:u: ffffffff81799b47 [unknown] ([unknown]) => 7f8e86e8bcf0 [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8bcf3 [unknown] ([unknown]) => 7f8e86e8f980 [unknown] ([unknown])
>   :8676  8676  1 branches:u: ffffffff81799b47 [unknown] ([unknown]) => 7f8e86e8f9a6 [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8fa11 [unknown] ([unknown]) => 7f8e86e8fa2f [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8fa33 [unknown] ([unknown]) => 7f8e86e8fa18 [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8fa33 [unknown] ([unknown]) => 7f8e86e8fa18 [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8fa3f [unknown] ([unknown]) => 7f8e86e8fc18 [unknown] ([unknown])
>   :8676  8676  1 branches:u:     7f8e86e8fc20 [unknown] ([unknown]) => 7f8e86e8fc40 [unknown] ([unknown])
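
The '# Samples:' header lines in the 'perf report --stdio' output further up are one way to see which events actually carry samples, including synthesized ones that 'perf evlist' does not list. A rough Python sketch, assuming only that header format; the helper name events_in_report is made up for illustration and is not part of perf:

```python
import re

# Matches header lines such as "# Samples: 22K of event 'branches:u'".
SAMPLES_RE = re.compile(r"^#\s*Samples:\s*(\S+)\s+of event '([^']+)'")

def events_in_report(text):
    """Return [(event name, sample count as printed)] from 'perf report --stdio' text."""
    found = []
    for line in text.splitlines():
        m = SAMPLES_RE.match(line)
        if m:
            found.append((m.group(2), m.group(1)))
    return found

# Header lines copied from the report output quoted in this thread.
REPORT = """\
# Samples: 0  of event 'intel_bts//u'
# Event count (approx.): 0
# Samples: 0  of event 'dummy:u'
# Event count (approx.): 0
# Samples: 22K of event 'branches:u'
# Event count (approx.): 22548
"""

for name, count in events_in_report(REPORT):
    print(f"{name}: {count}")
```

In practice the text would come from piping 'perf report --stdio' into the script rather than from a pasted string.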
> 
> The synthesized records look sane:
> 
> [root@zoo ~]# perf report -D | grep PERF_RECORD_ | tail -15
> 0x36f8 [0x78]: PERF_RECORD_MMAP -1/0: [0xffffffffa0933000(0x5000) @ 0]: x /lib/modules/4.1.0-rc5+/kernel/net/netfilter/xt_CHECKSUM.ko
> 0x3770 [0x68]: PERF_RECORD_MMAP -1/0: [0xffffffffa0938000(0x18000) @ 0]: x /lib/modules/4.1.0-rc5+/kernel/fs/fuse/fuse.ko
> 0x37d8 [0x68]: PERF_RECORD_MMAP -1/0: [0xffffffffa0950000(0x5f6affff) @ 0]: x /lib/modules/4.1.0-rc5+/kernel/crypto/ccm.ko
> 0x3840 [0x20]: PERF_RECORD_ITRACE_START pid: 8676 tid: 8676
> 0x3860 [0x28]: PERF_RECORD_COMM exec: sleep:8676/8676
> 0x3888 [0x68]: PERF_RECORD_MMAP2 8676/8676: [0x400000(0x6000) @ 0 fd:01 525758 1481335351]: r-xp /usr/bin/sleep
> 0x38f0 [0x70]: PERF_RECORD_MMAP2 8676/8676: [0x7f8e86e8b000(0x224000) @ 0 fd:01 534148 868528687]: r-xp /usr/lib64/ld-2.20.so
> 0x3960 [0x60]: PERF_RECORD_MMAP2 8676/8676: [0x7ffdc21d4000(0x2000) @ 0x7ffdc21d4000 00:00 0 0]: ---p [vdso]
> 0x39c0 [0x70]: PERF_RECORD_MMAP2 8676/8676: [0x7f8e86ace000(0x3bd000) @ 0 fd:01 531160 868528694]: r-xp /usr/lib64/libc-2.20.so
> 0x3a30 [0x30]: PERF_RECORD_AUX offset: 0 size: 0x80688 flags: 0 []
> 0x3a60 [0x30]: PERF_RECORD_EXIT(8676:8676):(8676:8676)
> 0x3a90 [0x30]: PERF_RECORD_AUX offset: 0x80688 size: 0x3b70 flags: 0 []
> 0x3ac0 [0x30]: PERF_RECORD_EXIT(8676:8676):(8676:8676)
> 0x3af0 [0x30]: PERF_RECORD_AUXTRACE size: 0x841f8  offset: 0  ref: 0x531e5c6921c  idx: 0  tid: 8676  cpu: -1
> 0x87d18 [0x8]: PERF_RECORD_FINISHED_ROUNDAggregated stats: (excludes AUX area (e.g. instruction trace) decoded / synthesized events)
> [root@zoo ~]# 
> 
> One of those samples, checked against the libc-2.20.so map:
> 
> >>> (0x7f8e86e8fa3f - 0x7f8e86ace000) < 0x3bd000
> False
> 
> It lies after the libc-2.20.so map, so trying the ld-2.20.so map:
> 
> >>> (0x7f8e86e8fa3f - 0x7f8e86e8b000) < 0x224000
> True
> 
> Ok, so a /usr/lib64/ld-2.20.so sample, and:
> 
> [root@zoo ~]# rpm -q glibc-debuginfo
> glibc-debuginfo-2.20-8.fc21.x86_64
> 
> But then, it didn't even resolve the DSO, which it should, as I did manually :-/
> 
> Will continue investigating... Perhaps this is fixed in another patch? What I
> have test-merged so far is in my tmp.perf/pt branch.

I tried the same commands with perf tools from that branch (tmp.perf/pt) and
it seemed to work fine.

One reason for not getting symbols is compiling perf tools without ELF support.
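
For reference, the manual address check quoted above can be written as a small lookup over the mapped regions. This is only a sketch: the region list is copied from the PERF_RECORD_MMAP2 lines of the 'perf report -D' output (the r-xp file mappings only), and the helper name find_map is hypothetical, not how perf itself resolves DSOs.

```python
# Executable file mappings copied from the PERF_RECORD_MMAP2 lines in this thread.
# Each entry is (start address, length in bytes, path).
MAPS = [
    (0x400000, 0x6000, "/usr/bin/sleep"),
    (0x7f8e86e8b000, 0x224000, "/usr/lib64/ld-2.20.so"),
    (0x7f8e86ace000, 0x3bd000, "/usr/lib64/libc-2.20.so"),
]

def find_map(addr, maps=MAPS):
    """Return the path whose [start, start + length) range holds addr, else None."""
    for start, length, path in maps:
        if start <= addr < start + length:
            return path
    return None

# The branch address that was checked by hand in the quoted text.
print(find_map(0x7f8e86e8fa3f))
```

An address that falls in none of the listed regions, such as the kernel address ffffffff81799b47 seen in the 'perf script' output, comes back as None here because only the user-space mappings are in the list.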

> 
> I will still update the cset comments and possibly do some other changes to
> preserve bisectability or some other fix so that we get more test output
> inserted in the changesets.
