Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07701953495@SHSMSX103.ccr.corp.intel.com>
Date:	Wed, 30 Sep 2015 21:09:39 +0000
From:	"Liang, Kan" <kan.liang@...el.com>
To:	Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
CC:	Andi Kleen <andi@...stfloor.org>,
	Ulrich Drepper <drepper@...il.com>,
	Will Deacon <will.deacon@....com>,
	Stephane Eranian <eranian@...gle.com>,
	Don Zickus <dzickus@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	"Namhyung Kim" <namhyung@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: RE: [PATCHv2 00/45] perf stat: Add scripting support


> hi,
> sending another version of stat scripting.
> 
> v2 changes:
>   - rebased on Arnaldo's latest perf/core
>   - patches 1 to 11 already merged in
>   - added --per-core/--per-socket/-A options for perf stat report
>     command to allow custom aggregation in stat report, please
>     check new examples below
>   - a couple of changelog changes
> 
> The initial attempt defined its own formula language and allowed triggering a
> user's script at the end of the stat command:
>   http://marc.info/?l=linux-kernel&m=136742146322273&w=2
> 
> This patchset abandons the idea of a new formula language and instead adds
> support to:
>   - store stat data into perf.data file
>   - add python support to process stat events
> 
> Basically it allows storing stat data into perf.data and post-processing it
> with python scripts in a similar way to what we do for sampling data.
> 
> The stat data are stored in new stat, stat-round, stat-config user events.
>   stat        - stored for each read syscall of the counter
>   stat round  - stored for each interval or end of the command invocation
>   stat config - stores all the config information needed to process data
>                 so the report tool can produce the same output as record
> 
> The python script can now define 'stat__<eventname>_<modifier>' functions to
> receive stat event data and a 'stat__interval' function to receive stat-round data.
> 
> See CPI script example in scripts/python/stat-cpi.py.
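
A minimal sketch of what such a script can look like (not the actual
stat-cpi.py; it assumes the per-event handlers are called as
stat__<event>_<modifier>(cpu, thread, time, val, ena, run) and
stat__interval(time) with time in nanoseconds, as in the stat-cpi.py
example referenced above):

    # Hypothetical minimal stat script; handler signatures assumed from
    # the stat-cpi.py example.
    counts = {}   # (event, cpu, thread) -> last value read

    def stat__cycles_u(cpu, thread, time, val, ena, run):
        counts[("cycles", cpu, thread)] = val

    def stat__instructions_u(cpu, thread, time, val, ena, run):
        counts[("instructions", cpu, thread)] = val

    def stat__interval(time):
        # called for each interval and at the end of the run
        for (event, cpu, thread), val in counts.items():
            print("%15f: cpu %d, thread %d, %s = %d"
                  % (time / 1e9, cpu, thread, event, val))

Such a script would be run with e.g. perf script -s my-stat.py (a
hypothetical file name) over data from perf stat record with cycles:u
and instructions:u events.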
> 
> Also available in:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/stat_script
> 
> thanks,
> jirka
> 
> 
> Examples:
> 
> - To record data for command stat workload:
> 
>   $ perf stat record kill
>   ...
> 
>    Performance counter stats for 'kill':
> 
>             0.372007      task-clock (msec)         #    0.613 CPUs utilized
>                    3      context-switches          #    0.008 M/sec
>                    0      cpu-migrations            #    0.000 K/sec
>                   62      page-faults               #    0.167 M/sec
>            1,129,973      cycles                    #    3.038 GHz
>      <not supported>      stalled-cycles-frontend
>      <not supported>      stalled-cycles-backend
>              813,313      instructions              #    0.72  insns per cycle
>              166,161      branches                  #  446.661 M/sec
>                8,747      branch-misses             #    5.26% of all branches
> 
>          0.000607287 seconds time elapsed
> 

The default output file for perf stat record is perf.data.
It's easy to mix it up with the data file from perf record.
How about using perf.data.stat instead?


> - To report perf stat data:
> 
>   $ perf stat report
> 
>    Performance counter stats for '/home/jolsa/bin/perf stat record kill':
> 
>             0.372007      task-clock (msec)         #      inf CPUs utilized
>                    3      context-switches          #    0.008 M/sec
>                    0      cpu-migrations            #    0.000 K/sec
>                   62      page-faults               #    0.167 M/sec
>            1,129,973      cycles                    #    3.038 GHz
>      <not supported>      stalled-cycles-frontend
>      <not supported>      stalled-cycles-backend
>              813,313      instructions              #    0.72  insns per cycle
>              166,161      branches                  #  446.661 M/sec
>                8,747      branch-misses             #    5.26% of all branches
> 
>          0.000000000 seconds time elapsed
> 
> - To store system-wide period stat data:
> 
>   $ perf stat -e cycles:u,instructions:u -a -I 1000 record
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u
>   ^C     3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u
> 
> - To report perf stat data:
>

Could we support perf report as well?
If I run perf report with the data file, there are some warnings.
We know whether the data file is from perf stat or perf record, so it
should not be hard to handle the warnings.
Also, it would be better if all the new record types (CPU/THREAD_MAP,
STAT_CONFIG, STAT, etc.) could be dumped by perf report -D.
They show as unhandled now.

>   $ perf stat report
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u
>        3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u
> 
> - To run stat-cpi.py script over perf.data:
> 
>   $ perf script -s scripts/python/stat-cpi.py
>          1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
>          2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
>          3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
>          3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)
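
For reference, the cpi values above are simply cycles divided by
instructions for each interval, e.g. 462311482 / 590037440 is about
0.783529 on the first line.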
> 
> - To pipe data from stat to stat-cpi script:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py
>          1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
>          2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
>          3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
>          4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
>          5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
>          6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
>          7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)
> 
> - Raw script stat data output:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
>   CPU   THREAD             VAL             ENA             RUN            TIME EVENT
>     0       -1        12302059      1000811347      1000810712      1000198821 cycles:u
>     0       -1         2565362      1000823218      1000823218      1000198821 instructions:u
>     0       -1        14453353      1000812704      1000812704      2000382283 cycles:u
>     0       -1         4600932      1000799342      1000799342      2000382283 instructions:u
>     0       -1        15245106      1000774425      1000774425      3000538255 cycles:u
>     0       -1         2624324      1000769310      1000769310      3000538255 instructions:u
> 
> - To display different aggregation in report:
> 


This one doesn't work well with uncore events.

sudo ./perf stat -e uncore_imc_1/cas_count_read/ -a --per-socket record -- sleep 5
 Performance counter stats for 'system wide':

S0        1               0.87 MiB  uncore_imc_1/cas_count_read/
S1        1               0.41 MiB  uncore_imc_1/cas_count_read/

       5.000910939 seconds time elapsed

sudo ./perf stat report --per-socket

 Performance counter stats for '/home/lk/group_read/test/perf/tools/perf/perf stat -e uncore_imc_1/cas_count_read/ -a --per-socket record -- sleep 5':

S0       36             20,973      uncore_imc_1/cas_count_read/
S1       28      <not counted>      uncore_imc_1/cas_count_read/

       5.000910939 seconds time elapsed

>   $ perf stat -e cycles -a -I 1000 record sleep 3
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles
>        2.000443651        609,975,307      cycles
>        3.000569616        668,479,597      cycles
>        3.000735323          1,155,816      cycles
> 
>   $ perf stat report
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles
>        2.000443651        609,975,307      cycles
>        3.000569616        668,479,597      cycles
>        3.000735323          1,155,816      cycles
> 
>   $ perf stat report --per-core
>   #           time core         cpus             counts unit events
>        1.000223609 S0-C0           2        327,612,412      cycles
>        1.000223609 S0-C1           2        375,815,205      cycles
>        2.000443651 S0-C0           2        287,462,177      cycles
>        2.000443651 S0-C1           2        322,513,130      cycles
>        3.000569616 S0-C0           2        271,571,908      cycles
>        3.000569616 S0-C1           2        396,907,689      cycles
>        3.000735323 S0-C0           2            694,977      cycles
>        3.000735323 S0-C1           2            460,839      cycles
> 
>   $ perf stat report --per-socket
>   #           time socket cpus             counts unit events
>        1.000223609 S0        4        703,427,617      cycles
>        2.000443651 S0        4        609,975,307      cycles
>        3.000569616 S0        4        668,479,597      cycles
>        3.000735323 S0        4          1,155,816      cycles
> 
>   $ perf stat report -A
>   #           time CPU                counts unit events
>        1.000223609 CPU0           205,431,505      cycles
>        1.000223609 CPU1           122,180,907      cycles
>        1.000223609 CPU2           176,649,682      cycles
>        1.000223609 CPU3           199,165,523      cycles
>        2.000443651 CPU0           148,447,922      cycles
>        2.000443651 CPU1           139,014,255      cycles
>        2.000443651 CPU2           204,436,559      cycles
>        2.000443651 CPU3           118,076,571      cycles
>        3.000569616 CPU0           149,788,954      cycles
>        3.000569616 CPU1           121,782,954      cycles
>        3.000569616 CPU2           247,277,700      cycles
>        3.000569616 CPU3           149,629,989      cycles
>        3.000735323 CPU0               269,675      cycles
>        3.000735323 CPU1               425,302      cycles
>        3.000735323 CPU2               364,169      cycles
>        3.000735323 CPU3                96,670      cycles
> 
> 
> Cc: Andi Kleen <andi@...stfloor.org>
> Cc: Ulrich Drepper <drepper@...il.com>
> Cc: Will Deacon <will.deacon@....com>
> Cc: Stephane Eranian <eranian@...gle.com>
> Cc: Don Zickus <dzickus@...hat.com>
> ---
> Jiri Olsa (45):
>       perf tools: Add thread_map event
>       perf tools: Add thread_map event sythesize function
>       perf tools: Add thread_map__new_event function
>       perf tools: Add cpu_map event
>       perf tools: Add cpu_map event synthesize function
>       perf tools: Add cpu_map__new_event function
>       perf tools: Add stat config event
>       perf tools: Add stat config event synthesize function
>       perf tools: Add stat config event read function
>       perf tools: Add stat event
>       perf tools: Add stat event synthesize function
>       perf tools: Add stat event read function
>       perf tools: Add stat round event
>       perf tools: Add stat round event synthesize function
>       perf tools: Introduce stat feature
>       perf tools: Move id_offset out of struct perf_evsel union
>       perf stat: Rename perf_stat struct into perf_stat_evsel
>       perf stat: Add AGGR_UNSET mode
>       perf stat record: Add record command
>       perf stat record: Initialize record features
>       perf stat record: Synthesize stat record data
>       perf stat record: Store events IDs in perf data file
>       perf stat record: Add pipe support for record command
>       perf stat record: Write stat events on record
>       perf stat record: Write stat round events on record
>       perf stat record: Do not allow record with multiple runs mode
>       perf tools: Add cpu_map__empty_new interface
>       perf tools: Make cpu_map__build_map global
>       perf tools: Add data arg to cpu_map__build_map callback
>       perf stat report: Cache aggregated map entries in extra cpumap
>       perf stat report: Add report command
>       perf stat report: Process cpu/threads maps
>       perf stat report: Process stat config event
>       perf stat report: Add support to initialize aggr_map from file
>       perf stat report: Process stat and stat round events
>       perf stat report: Move csv_sep initialization before report command
>       perf stat report: Allow to override aggr_mode
>       perf script: Check output fields only for samples
>       perf script: Process cpu/threads maps
>       perf script: Process stat config event
>       perf script: Add process_stat/process_stat_interval scripting interface
>       perf script: Add stat default handlers
>       perf script: Display stat events by default
>       perf script: Add python support for stat events
>       perf script: Add stat-cpi.py script
> 
>  tools/perf/Documentation/perf-stat.txt                 |  34 ++++++
>  tools/perf/builtin-record.c                            |   2 +
>  tools/perf/builtin-script.c                            | 144 +++++++++++++++++++++++-
>  tools/perf/builtin-stat.c                              | 584 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  tools/perf/scripts/python/stat-cpi.py                  |  74 +++++++++++++
>  tools/perf/tests/Build                                 |   2 +
>  tools/perf/tests/builtin-test.c                        |  21 ++++
>  tools/perf/tests/cpumap.c                              |  39 +++++++
>  tools/perf/tests/stat.c                                | 111 +++++++++++++++++++
>  tools/perf/tests/tests.h                               |   6 +
>  tools/perf/tests/thread-map.c                          |  43 +++++++
>  tools/perf/tests/topology.c                            |   4 +-
>  tools/perf/util/cpumap.c                               |  61 ++++++++--
>  tools/perf/util/cpumap.h                               |  11 +-
>  tools/perf/util/event.c                                | 172 ++++++++++++++++++++++++++++
>  tools/perf/util/event.h                                | 100 ++++++++++++++++-
>  tools/perf/util/evlist.c                               |   6 +-
>  tools/perf/util/evlist.h                               |   3 +
>  tools/perf/util/evsel.h                                |   2 +-
>  tools/perf/util/header.c                               |  14 +++
>  tools/perf/util/header.h                               |   1 +
>  tools/perf/util/scripting-engines/trace-event-python.c | 114 ++++++++++++++++++-
>  tools/perf/util/session.c                              | 123 +++++++++++++++++++++
>  tools/perf/util/stat.c                                 |  36 +++++-
>  tools/perf/util/stat.h                                 |   9 +-
>  tools/perf/util/thread_map.c                           |  27 +++++
>  tools/perf/util/thread_map.h                           |   3 +
>  tools/perf/util/tool.h                                 |   7 +-
>  tools/perf/util/trace-event.h                          |   4 +
>  29 files changed, 1708 insertions(+), 49 deletions(-)
>  create mode 100644 tools/perf/scripts/python/stat-cpi.py
>  create mode 100644 tools/perf/tests/cpumap.c
>  create mode 100644 tools/perf/tests/stat.c
