lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F0770196C933@SHSMSX103.ccr.corp.intel.com>
Date:	Thu, 8 Oct 2015 18:37:43 +0000
From:	"Liang, Kan" <kan.liang@...el.com>
To:	Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
CC:	Andi Kleen <andi@...stfloor.org>,
	Ulrich Drepper <drepper@...il.com>,
	"Will Deacon" <will.deacon@....com>,
	Stephane Eranian <eranian@...gle.com>,
	"Don Zickus" <dzickus@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	"David Ahern" <dsahern@...il.com>, Ingo Molnar <mingo@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: RE: [PATCHv3 00/56] perf stat: Add scripting support


Hi Jirka

I'm doing some test with the perf/stat_script tree.
Here are the issues I found so far.

1 test case failed to build:

  CC       tests/stat.o
tests/stat.c: In function âtest__synthesize_statâ:
tests/stat.c:82:3: error: missing initializer for field âenaâ of
âstruct <anonymous>â [-Werror=missing-field-initializers]
   .ena = 200,
   ^
In file included from tests/stat.c:5:0:
/home/lk/group_read/test/perf/tools/perf/util/counts.h:10:8:
note: âenaâ declared here
    u64 ena;
        ^
tests/stat.c:83:3: error: missing initializer for field ârunâ of
âstruct <anonymous>â [-Werror=missing-field-initializers]
   .run = 300,
   ^
In file included from tests/stat.c:5:0:
/home/lk/group_read/test/perf/tools/perf/util/counts.h:11:8:
note: ârunâ declared here
    u64 run;
        ^
cc1: all warnings being treated as errors
mv: cannot stat âtests/.stat.o.tmpâ: No such file or directory
make[3]: *** [tests/stat.o] Error 1
make[2]: *** [tests] Error 2
make[1]: *** [perf-in.o] Error 2
make: *** [all] Error 2


2. it still doesn't work well with uncore event.

Here is an example:
sudo ./perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 100
--per-socket record -- sleep 1
#           time socket cpus             counts unit events
     0.100231142 S0        1               0.04 MiB  uncore_imc_1/cas_count_read/
     0.100231142 S1        1               0.08 MiB  uncore_imc_1/cas_count_read/
     0.200583556 S0        1               0.02 MiB  uncore_imc_1/cas_count_read/
     0.200583556 S1        1               0.02 MiB  uncore_imc_1/cas_count_read/

sudo ./perf stat report --per-socket
#           time socket cpus             counts unit events
     0.100231142 S0       36               0.12 MiB  uncore_imc_1/cas_count_read/
     0.100231142 S1       28      <not counted> 
MiB  uncore_imc_1/cas_count_read/
     0.200583556 S0       36               0.04 MiB  uncore_imc_1/cas_count_read/
     0.200583556 S1       28      <not counted> 
MiB  uncore_imc_1/cas_count_read/

The results are not consistent. 

I checked the perf report -D result. It looks the wrong CPU id is stored.
In my system, the uncore event should be monitored in CPU 0 and CPU 19
But it shows CPU0 and CPU1.

0x520 [0x30]: PERF_RECORD_STAT
... id 22385, cpu 0, thread 0

0x550 [0x30]: PERF_RECORD_STAT
... id 22386, cpu 1, thread 0


3. I think you have already handled new record type for perf stat record.
But the "unhandled!" message still show in the perf report -D.      

0x598 [0x30]: PERF_RECORD_STAT
... id 22385, cpu 0, thread 0
... value 1752, enabled 501473097, running 501473097
: unhandled!      <-------


4. Core dump when applying --per-core -a for stat record.

sudo ./perf stat -e 'cycles' --per-core -a record sleep 1

** Error in `./perf': corrupted double-linked list: 0x00000000036eb560 **
======= Backtrace: =========
/lib64/libc.so.6[0x3002e7bbe7]
/lib64/libc.so.6[0x3002e7d2b5]
./perf(perf_env__exit+0xfd)[0x484d1d]
./perf[0x47c45f]
./perf(main+0x610)[0x421ac0]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45]
./perf[0x421bd9]


5.  perf stat record may not monitor on specific CPU/whole-system.
However, perf stat report can be set to aggregate per core/socket.
If so, there is problem to show the data.
I think we need some warning here.

For example,

sudo ./perf stat -e 'cycles' record sleep 1

 Performance counter stats for 'sleep 1':

           858,928      cycles

       1.001120735 seconds time elapsed

sudo ./perf stat report -A

 Performance counter stats for '/home/lk/group_read/test/perf/
tools/perf/perf stat -e cycles record sleep 1':

CPU-1              858,928      cycles

       1.001120735 seconds time elapsed



Thanks,
Kan

> hi,
> sending another version of stat scripting.
> 
> v3 changes:
>   - added attr update event to handle unit,scale,name for event
>     it fixed the uncore_imc_1/cas_count_read/ record/report
>   - perf report -D now displays stat related events
>   - some minor and changelog fixes
> 
> v2 changes:
>   - rebased to latest Arnaldo's perf/core
>   - patches 1 to 11 already merged in
>   - added --per-core/--per-socket/-A options for perf stat report
>     command to allow custom aggregation in stat report, please
>     check new examples below
>   - couple changelogs changes
> 
> The initial attempt defined its own formula lang and allowed triggering
> user's script on the end of the stat command:
>   http://marc.info/?l=linux-kernel&m=136742146322273&w=2
> 
> This patchset abandons the idea of new formula language and rather adds
> support to:
>   - store stat data into perf.data file
>   - add python support to process stat events
> 
> Basically it allows to store stat data into perf.data and post process it with
> python scripts in a similar way we do for sampling data.
> 
> The stat data are stored in new stat, stat-round, stat-config user events.
>   stat        - stored for each read syscall of the counter
>   stat round  - stored for each interval or end of the command invocation
>   stat config - stores all the config information needed to process data
>                 so report tool could restore the same output as record
> 
> The python script can now define 'stat__<eventname>_<modifier>'
> functions to get stat events data and 'stat__interval' to get stat-round data.
> 
> See CPI script example in scripts/python/stat-cpi.py.
> 
> Also available in:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/stat_script
> 
> thanks,
> jirka
> 
> 
> Examples:
> 
> - To record data for command stat workload:
> 
>   $ perf stat record kill
>   ...
> 
>    Performance counter stats for 'kill':
> 
>             0.372007      task-clock (msec)         #    0.613 CPUs utilized
>                    3      context-switches          #    0.008 M/sec
>                    0      cpu-migrations            #    0.000 K/sec
>                   62      page-faults               #    0.167 M/sec
>            1,129,973      cycles                    #    3.038 GHz
>      <not supported>      stalled-cycles-frontend
>      <not supported>      stalled-cycles-backend
>              813,313      instructions              #    0.72  insns per cycle
>              166,161      branches                  #  446.661 M/sec
>                8,747      branch-misses             #    5.26% of all branches
> 
>          0.000607287 seconds time elapsed
> 
> - To report perf stat data:
> 
>   $ perf stat report
> 
>    Performance counter stats for '/home/jolsa/bin/perf stat record kill':
> 
>             0.372007      task-clock (msec)         #      inf CPUs utilized
>                    3      context-switches          #    0.008 M/sec
>                    0      cpu-migrations            #    0.000 K/sec
>                   62      page-faults               #    0.167 M/sec
>            1,129,973      cycles                    #    3.038 GHz
>      <not supported>      stalled-cycles-frontend
>      <not supported>      stalled-cycles-backend
>              813,313      instructions              #    0.72  insns per cycle
>              166,161      branches                  #  446.661 M/sec
>                8,747      branch-misses             #    5.26% of all branches
> 
>          0.000000000 seconds time elapsed
> 
> - To store system-wide period stat data:
> 
>   $ perf stat -e cycles:u,instructions:u -a -I 1000 record
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u
>   ^C     3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u
> 
> - To report perf stat data:
> 
>   $ perf stat report
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u
>        3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u
> 
> - To run stat-cpi.py script over perf.data:
> 
>   $ perf script -s scripts/python/stat-cpi.py
>          1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
>          2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
>          3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
>          3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)
> 
> - To pipe data from stat to stat-cpi script:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s
> scripts/python/stat-cpi.py
>          1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
>          2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
>          3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
>          4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
>          5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
>          6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
>          7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)
> 
> - Raw script stat data output:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-
> pager script
>   CPU   THREAD             VAL             ENA             RUN            TIME EVENT
>     0       -1        12302059      1000811347      1000810712      1000198821 cycles:u
>     0       -1         2565362      1000823218      1000823218      1000198821
> instructions:u
>     0       -1        14453353      1000812704      1000812704      2000382283 cycles:u
>     0       -1         4600932      1000799342      1000799342      2000382283
> instructions:u
>     0       -1        15245106      1000774425      1000774425      3000538255 cycles:u
>     0       -1         2624324      1000769310      1000769310      3000538255
> instructions:u
> 
> - To display different aggregation in report:
> 
>   $ perf stat -e cycles -a -I 1000 record sleep 3
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles
>        2.000443651        609,975,307      cycles
>        3.000569616        668,479,597      cycles
>        3.000735323          1,155,816      cycles
> 
>   $ perf stat report
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles
>        2.000443651        609,975,307      cycles
>        3.000569616        668,479,597      cycles
>        3.000735323          1,155,816      cycles
> 
>   $ perf stat report --per-core
>   #           time core         cpus             counts unit events
>        1.000223609 S0-C0           2        327,612,412      cycles
>        1.000223609 S0-C1           2        375,815,205      cycles
>        2.000443651 S0-C0           2        287,462,177      cycles
>        2.000443651 S0-C1           2        322,513,130      cycles
>        3.000569616 S0-C0           2        271,571,908      cycles
>        3.000569616 S0-C1           2        396,907,689      cycles
>        3.000735323 S0-C0           2            694,977      cycles
>        3.000735323 S0-C1           2            460,839      cycles
> 
>   $ perf stat report --per-socket
>   #           time socket cpus             counts unit events
>        1.000223609 S0        4        703,427,617      cycles
>        2.000443651 S0        4        609,975,307      cycles
>        3.000569616 S0        4        668,479,597      cycles
>        3.000735323 S0        4          1,155,816      cycles
> 
>   $ perf stat report -A
>   #           time CPU                counts unit events
>        1.000223609 CPU0           205,431,505      cycles
>        1.000223609 CPU1           122,180,907      cycles
>        1.000223609 CPU2           176,649,682      cycles
>        1.000223609 CPU3           199,165,523      cycles
>        2.000443651 CPU0           148,447,922      cycles
>        2.000443651 CPU1           139,014,255      cycles
>        2.000443651 CPU2           204,436,559      cycles
>        2.000443651 CPU3           118,076,571      cycles
>        3.000569616 CPU0           149,788,954      cycles
>        3.000569616 CPU1           121,782,954      cycles
>        3.000569616 CPU2           247,277,700      cycles
>        3.000569616 CPU3           149,629,989      cycles
>        3.000735323 CPU0               269,675      cycles
>        3.000735323 CPU1               425,302      cycles
>        3.000735323 CPU2               364,169      cycles
>        3.000735323 CPU3                96,670      cycles
> 
> 
> Cc: Andi Kleen <andi@...stfloor.org>
> Cc: Ulrich Drepper <drepper@...il.com>
> Cc: Will Deacon <will.deacon@....com>
> Cc: Stephane Eranian <eranian@...gle.com>
> Cc: Don Zickus <dzickus@...hat.com>
> ---
> Jiri Olsa (56):
>       perf tools: Add thread_map event
>       perf tools: Add thread_map event sythesize function
>       perf tools: Add thread_map__new_event function
>       perf tools: Add perf_event__fprintf_thread_map function
>       perf tools: Add cpu_map event
>       perf tools: Add cpu_map event synthesize function
>       perf tools: Add cpu_map__new_event function
>       perf tools: Add perf_event__fprintf_cpu_map function
>       perf tools: Add stat config event
>       perf tools: Add stat config event synthesize function
>       perf tools: Add stat config event read function
>       perf tools: Add stat event
>       perf tools: Add stat event synthesize function
>       perf tools: Add stat event read function
>       perf tools: Add stat round event
>       perf tools: Add stat round event synthesize function
>       perf tools: Add stat events fprintf functions
>       perf tools: Add attr_update event
>       perf tools: Add attr_update event unit type
>       perf tools: Add attr_update event scale type
>       perf tools: Add attr_update event name type
>       perf tools: Add perf_event__fprintf_attr_update function
>       perf report: Display newly added events in raw dump
>       perf tools: Introduce stat feature
>       perf tools: Move id_offset out of struct perf_evsel union
>       perf stat: Rename perf_stat struct into perf_stat_evsel
>       perf stat: Add AGGR_UNSET mode
>       perf stat record: Add record command
>       perf stat record: Initialize record features
>       perf stat record: Synthesize stat record data
>       perf stat record: Store events IDs in perf data file
>       perf stat record: Add pipe support for record command
>       perf stat record: Write stat events on record
>       perf stat record: Write stat round events on record
>       perf stat record: Do not allow record with multiple runs mode
>       perf stat record: Synthesize attr update events
>       perf tools: Add cpu_map__empty_new interface
>       perf tools: Make cpu_map__build_map global
>       perf tools: Add data arg to cpu_map__build_map callback
>       perf stat report: Cache aggregated map entries in extra cpumap
>       perf stat report: Add report command
>       perf stat report: Process cpu/threads maps
>       perf stat report: Process stat config event
>       perf stat report: Add support to initialize aggr_map from file
>       perf stat report: Process stat and stat round events
>       perf stat report: Process attr update events
>       perf stat report: Move csv_sep initialization before report command
>       perf stat report: Allow to override aggr_mode
>       perf script: Check output fields only for samples
>       perf script: Process cpu/threads maps
>       perf script: Process stat config event
>       perf script: Add process_stat/process_stat_interval scripting interface
>       perf script: Add stat default handlers
>       perf script: Display stat events by default
>       perf script: Add python support for stat events
>       perf script: Add stat-cpi.py script
> 
>  tools/perf/Documentation/perf-stat.txt             |  34 ++
>  tools/perf/builtin-record.c                        |   2 +
>  tools/perf/builtin-script.c                        | 144 ++++-
>  tools/perf/builtin-stat.c                          | 636 ++++++++++++++++++++-
>  tools/perf/scripts/python/stat-cpi.py              |  74 +++
>  tools/perf/tests/Build                             |   3 +
>  tools/perf/tests/attr_update.c                     |  82 +++
>  tools/perf/tests/builtin-test.c                    |  25 +
>  tools/perf/tests/cpumap.c                          |  39 ++
>  tools/perf/tests/stat.c                            | 111 ++++
>  tools/perf/tests/tests.h                           |   7 +
>  tools/perf/tests/thread-map.c                      |  43 ++
>  tools/perf/tests/topology.c                        |   4 +-
>  tools/perf/util/cpumap.c                           |  61 +-
>  tools/perf/util/cpumap.h                           |  11 +-
>  tools/perf/util/event.c                            | 205 +++++++
>  tools/perf/util/event.h                            | 122 +++-
>  tools/perf/util/evlist.c                           |   6 +-
>  tools/perf/util/evlist.h                           |   3 +
>  tools/perf/util/evsel.h                            |   2 +-
>  tools/perf/util/header.c                           | 146 +++++
>  tools/perf/util/header.h                           |  14 +
>  .../util/scripting-engines/trace-event-python.c    | 114 +++-
>  tools/perf/util/session.c                          | 163 ++++++
>  tools/perf/util/stat.c                             |  75 ++-
>  tools/perf/util/stat.h                             |  13 +-
>  tools/perf/util/thread_map.c                       |  27 +
>  tools/perf/util/thread_map.h                       |   3 +
>  tools/perf/util/tool.h                             |   8 +-
>  tools/perf/util/trace-event.h                      |   4 +
>  30 files changed, 2132 insertions(+), 49 deletions(-)  create mode 100644
> tools/perf/scripts/python/stat-cpi.py
>  create mode 100644 tools/perf/tests/attr_update.c  create mode 100644
> tools/perf/tests/cpumap.c  create mode 100644 tools/perf/tests/stat.c
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ