linux-kernel - Re: [PATCH RFC V9 2/3] perf,tools: per-event callgraph support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150807154938.GE3325@kernel.org>
Date:	Fri, 7 Aug 2015 12:49:38 -0300
From:	Arnaldo Carvalho de Melo <acme@...nel.org>
To:	Kan Liang <kan.liang@...el.com>
Cc:	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC V9 2/3] perf,tools: per-event callgraph support

Em Fri, Aug 07, 2015 at 12:38:43PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Aug 06, 2015 at 03:44:53PM -0400, kan.liang@...el.com escreveu:
> > From: Kan Liang <kan.liang@...el.com>
> > Here are some examples and test results.
> > 
> > 1. Comparing the elapsed time and perf.data size from "kernbench -M -H".
> > 
> >  The test command for FULL callgraph and time support.
> >    "perf record -e
> >    '{cpu/cpu-cycles,period=100000/,cpu/instructions,period=20000/p}'
> >    --call-graph fp --time"

Some other oddity I noticed while testing:

For:

[root@zoo ~]# perf report --header-only
# ========
# captured on: Fri Aug  7 12:39:13 2015
# hostname : zoo
# os release : 4.2.0-rc5+
# perf version : 4.2.rc4.g1cd951
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 8080692 kB
# cmdline : /home/acme/bin/perf record -e {cpu/cpu-cycles,call-graph=fp,time,period=10000/pp,cpu/instructions,call-graph=no,time=0,period=20000/p} -a 
# event : name = cpu/cpu-cycles,call-graph=fp,time,period=10000/pp, , id = { 38537, 38538, 38539, 38540 }, type = 4, size = 112, config = 0x3c, { sample_perio
# event : name = cpu/instructions,call-graph=no,time=0,period=20000/p, , id = { 38541, 38542, 38543, 38544 }, type = 4, size = 112, config = 0xc0, { sample_pe
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, power = 7, uncore_imc = 8, tracepoint = 2, uncore_cbox_0 = 9, uncore_cbox_1 = 10, breakpoint = 5
# group: {cpu/cpu-cycles,call-graph=fp,time,period=10000/pp,cpu/instructions,call-graph=no,time=0,period=20000/p}
# ========
#

I get:

Samples: 2K of event 'cpu/instructions,call-graph=no,time=0,period=20000/p', Event count (approx.): 46956518
  Children      Self  Command          Shared Object Symbol                  ◆
-   67.56%     0.00%  qemu-system-x86  [unknown]     [.] 0xad5e258d4c544155  ▒
     0xad5e258d4c544155                                                      ▒
-   67.56%     0.00%  qemu-system-x86  libc-2.20.so  [.] __libc_start_main   ▒
     __libc_start_main                                                       ▒
     0xad5e258d4c544155                                                      ▒
-   67.56%     0.00%  qemu-system-x86  perf          [.] main                ▒
     main                                                                    ▒
     __libc_start_main                                                       ▒
     0xad5e258d4c544155                                                      ▒
-   67.56%     0.00%  qemu-system-x86  perf          [.] run_builtin         ▒
     run_builtin                                                             ▒
     main                                                                    ▒
     __libc_start_main                                                       ▒

This is in the 'perf report' TUI, why, for an event with 'callgraph=no', we
get callchains? How come?

If I try it with the --stdio, well, its there as well:

# Samples: 2K of event 'cpu/instructions,call-graph=no,time=0,period=20000/p'
# Event count (approx.): 46956518
#
# Children      Self  Command          Shared Object               Symbol                                        
# ........  ........  ...............  ..........................  ..............................................
#
    67.56%     0.00%  qemu-system-x86  [unknown]                   [.] 0xad5e258d4c544155                        
            |
            ---0xad5e258d4c544155

    67.56%     0.00%  qemu-system-x86  libc-2.20.so                [.] __libc_start_main                         
            |
            ---__libc_start_main
               0xad5e258d4c544155

    67.56%     0.00%  qemu-system-x86  perf                        [.] main                                      
            |
            ---main
               __libc_start_main
               0xad5e258d4c544155

    67.56%     0.00%  qemu-system-x86  perf                        [.] run_builtin                               
            |
            ---run_builtin
               main
               __libc_start_main
               0xad5e258d4c544155

Can you take a look? What I have, i.e. the patches from Jiri and Kan
moving stuff out of the way of the python binding and this patch is at
the tmp.perf/core branch in my git tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

- Arnaldo

 
> Jiri, while testing this I noticed that the message for EINVAL when
> using the cpu// syntax (per-event settings) is cryptic:
> 
>   [root@zoo ~]# perf record -e 'cpu/cpu-cycles,call-graph=fp,time,period=100000/p' ls
>   Error:
>   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/cpu-cycles,call-graph=fp,time,period=100000/p).
>   /bin/dmesg may provide additional information.
>   No CONFIG_PERF_EVENTS=y kernel support configured?
> 
> Whereas if we use -F, it is much, much clearer, telling the user exactly
> what is failing and what needs to be done to make it work:
> 
>   [root@zoo ~]# perf record -F 100000 -e cpu/cpu-cycles/ usleep 1
>   Maximum frequency rate (25000) reached.
>   Please use -F freq option with lower value or consider
>   tweaking /proc/sys/kernel/perf_event_max_sample_rate.
>   [root@zoo ~]# 
> 
> Hope this is something easy to wire up, given your event parsing kung foo
> skillz...
> 
> ;-)
> 
> - Arnaldo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/