Message-ID: <20160106141802.GA10415@kernel.org>
Date:	Wed, 6 Jan 2016 11:18:02 -0300
From:	Arnaldo Carvalho de Melo <acme@...nel.org>
To:	Jiri Olsa <jolsa@...nel.org>
Cc:	Andi Kleen <andi@...stfloor.org>,
	Ulrich Drepper <drepper@...il.com>,
	Will Deacon <will.deacon@....com>,
	Stephane Eranian <eranian@...gle.com>,
	Don Zickus <dzickus@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Liang, Kan" <kan.liang@...el.com>
Subject: Re: [PATCHv9 0/3] perf stat: Add scripting support

On Wed, Jan 06, 2016 at 11:49:54AM +0100, Jiri Olsa wrote:
> hi,
> sending another version of stat scripting.
 
> v9 changes:
>   - rebased on top of accepted patches
>   - described the CPI metric in the changelog [Arnaldo]
>   - fixed cpu conversion [Arnaldo]

Thanks, applied. Testing the cpi script with this endless pipeline:

perf stat record -I 1000 -a | perf script -s ~acme/git/linux/tools/perf/scripts/python/stat-cpi.py
     187.151796: cpu -1, thread -1 -> cpi 0.797917 (2568467461/3218963700)
     188.151979: cpu -1, thread -1 -> cpi 0.734628 (2714373371/3694892981)
     189.152212: cpu -1, thread -1 -> cpi 0.753332 (2958819204/3927644236)
     190.152975: cpu -1, thread -1 -> cpi 1.587754 (202360895/127451009)
     191.153486: cpu -1, thread -1 -> cpi 1.558219 (290557309/186467579)

in one monitor while trying various workloads in another, to see how it reacts.
After a while those raw numbers at the end of each line become just noise; I
think they should be shown only under 'stat-cpi -v' (I think we can pass args
to the 'perf script' scripts, right?).
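
I haven't checked how those args actually reach the script, but assuming they
end up in sys.argv, something like this in stat-cpi.py should be enough
(untested sketch; the '-v' handling and the print_cpi() helper are made up
here, they are not in the current script):

  import sys

  # Assumption: extra words given to 'perf script' after the script name
  # end up in sys.argv; a literal '-v' there keeps the raw
  # (cycles/instructions) numbers in the output.
  verbose = '-v' in sys.argv[1:]

  def print_cpi(time, cpu, thread, cpi, cyc, ins):
      line = "%15f: cpu %d, thread %d -> cpi %f" % (time, cpu, thread, cpi)
      if verbose:
          line += " (%d/%d)" % (cyc, ins)
      print(line)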

Also, could the 'cpu', 'thread' and '-> cpi' parts be turned into headers, like
'perf stat' does?

Anyway, you called it an 'example script', so we can improve it on top of what
I pushed to perf/core, i.e. this patchkit, unchanged. Thanks!

- Arnaldo
 
> v8 changes:
>   - check for stat callbacks properly [Namhyung]
>   - used '#!/usr/bin/env python' for stat-cpi.py [Namhyung]
>   - used tuple_set_u64 for storing u64 into python tuple [Namhyung]
> 
> v7 changes:
>   - perf stat record/report patches already taken,
>     posting the rest of the scripting support
>   - rebased to latest Arnaldo's perf/core
> 
> v6 changes:
>   - several patches from v4 already taken
>   - perf stat record now accepts the 'record' keyword
>     anywhere within the stat options
>   - moved the STAT feature check earlier into the record
>     patches, so commands processing perf.data recognize
>     stat data and skip the sample_type check
>   - rebased on Arnaldo's perf/stat
>   - added Tested-by: Kan Liang <kan.liang@...el.com>
> 
> v5 changes:
>   - several patches from v4 already taken
>   - using u16 for cpu number in cpu_map_event
>   - renamed PERF_RECORD_HEADER_ATTR_UPDATE to PERF_RECORD_EVENT_UPDATE
>   - moved the low-hanging-fruit patches to the start of the patchset
>   - patchset tested by Kan Liang, thanks!
> 
> v4 changes:
>   - added an attr update event for the event's cpumask
>   - forbade aggregation on task workloads
>   - some minor reorders and changelog fixes
> 
> v3 changes:
>   - added an attr update event to handle unit, scale and name for an event;
>     this fixed uncore_imc_1/cas_count_read/ record/report
>   - perf report -D now displays stat related events
>   - some minor fixes and changelog fixes
> 
> v2 changes:
>   - rebased to latest Arnaldo's perf/core
>   - patches 1 to 11 already merged in
>   - added --per-core/--per-socket/-A options to the perf stat report
>     command to allow custom aggregation in stat report; please
>     check the new examples below
>   - a couple of changelog changes
> 
> The initial attempt defined its own formula language and allowed
> triggering a user's script at the end of the stat command:
>   http://marc.info/?l=linux-kernel&m=136742146322273&w=2
> 
> This patchset abandons the idea of a new formula language
> and instead adds support to:
>   - store stat data into perf.data file
>   - add python support to process stat events
> 
> Basically it allows storing stat data into perf.data and
> post-processing it with python scripts, in a similar way to
> what we do for sampling data.
> 
> The stat data are stored in new stat, stat-round and stat-config user events.
>   stat        - stored for each read syscall of the counter
>   stat-round  - stored for each interval or at the end of the command invocation
>   stat-config - stores all the config information needed to process the data,
>                 so the report tool can restore the same output as record
> 
> The python script can now define 'stat__<eventname>_<modifier>' functions
> to receive stat event data and 'stat__interval' to receive stat-round data.
> 
> See CPI script example in scripts/python/stat-cpi.py.
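
For reference, a minimal shape such a script can take, following the
stat__<eventname>_<modifier> naming above and the (cpu, thread, time, val,
ena, run) argument order the example script seems to use (untested sketch;
the counts dict and store() helper are made up here, and time is assumed to
be in nanoseconds, as the printed values suggest):

  counts = {}

  def store(event, cpu, thread, val):
      counts[(event, cpu, thread)] = val

  # Called for each read of the cycles:u counter.
  def stat__cycles_u(cpu, thread, time, val, ena, run):
      store("cycles", cpu, thread, val)

  # Called for each read of the instructions:u counter.
  def stat__instructions_u(cpu, thread, time, val, ena, run):
      store("instructions", cpu, thread, val)

  # Called at the end of each interval (stat-round).
  def stat__interval(time):
      for (event, cpu, thread), cyc in counts.items():
          if event != "cycles":
              continue
          ins = counts.get(("instructions", cpu, thread), 0)
          if ins:
              print("%15f: cpu %d, thread %d -> cpi %f" %
                    (time / 1000000000.0, cpu, thread, float(cyc) / ins))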
> 
> Also available in:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/stat_script
> 
> thanks,
> jirka
> 
> Examples:
> 
> - To record stat data for a command workload:
> 
>   $ perf stat record kill
>   ...
> 
>    Performance counter stats for 'kill':
> 
>             0.372007      task-clock (msec)         #    0.613 CPUs utilized          
>                    3      context-switches          #    0.008 M/sec                  
>                    0      cpu-migrations            #    0.000 K/sec                  
>                   62      page-faults               #    0.167 M/sec                  
>            1,129,973      cycles                    #    3.038 GHz                    
>      <not supported>      stalled-cycles-frontend  
>      <not supported>      stalled-cycles-backend   
>              813,313      instructions              #    0.72  insns per cycle        
>              166,161      branches                  #  446.661 M/sec                  
>                8,747      branch-misses             #    5.26% of all branches        
> 
>          0.000607287 seconds time elapsed
> 
> - To report perf stat data:
> 
>   $ perf stat report
> 
>    Performance counter stats for '/home/jolsa/bin/perf stat record kill':
> 
>             0.372007      task-clock (msec)         #      inf CPUs utilized          
>                    3      context-switches          #    0.008 M/sec                  
>                    0      cpu-migrations            #    0.000 K/sec                  
>                   62      page-faults               #    0.167 M/sec                  
>            1,129,973      cycles                    #    3.038 GHz                    
>      <not supported>      stalled-cycles-frontend  
>      <not supported>      stalled-cycles-backend   
>              813,313      instructions              #    0.72  insns per cycle        
>              166,161      branches                  #  446.661 M/sec                  
>                8,747      branch-misses             #    5.26% of all branches        
> 
>          0.000000000 seconds time elapsed
> 
> - To store system-wide periodic stat data:
> 
>   $ perf stat -e cycles:u,instructions:u -a -I 1000 record
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u           
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u           
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u           
>   ^C     3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u           
> 
> - To report perf stat data:
> 
>   $ perf stat report
>   #           time             counts unit events
>        1.000265471        462,311,482      cycles:u                   (100.00%)
>        1.000265471        590,037,440      instructions:u           
>        2.000483453        722,532,336      cycles:u                   (100.00%)
>        2.000483453        848,678,197      instructions:u           
>        3.000759876         75,990,880      cycles:u                   (100.00%)
>        3.000759876         86,187,813      instructions:u           
>        3.213960893         85,329,533      cycles:u                   (100.00%)
>        3.213960893        135,954,296      instructions:u           
> 
> - To run stat-cpi.py script over perf.data:
> 
>   $ perf script -s scripts/python/stat-cpi.py 
>          1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
>          2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
>          3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
>          3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)
> 
> - To pipe data from stat to stat-cpi script:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py 
>          1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
>          2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
>          3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
>          4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
>          5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
>          6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
>          7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)
> 
> - Raw script stat data output:
> 
>   $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
>   CPU   THREAD             VAL             ENA             RUN            TIME EVENT
>     0       -1        12302059      1000811347      1000810712      1000198821 cycles:u
>     0       -1         2565362      1000823218      1000823218      1000198821 instructions:u
>     0       -1        14453353      1000812704      1000812704      2000382283 cycles:u
>     0       -1         4600932      1000799342      1000799342      2000382283 instructions:u
>     0       -1        15245106      1000774425      1000774425      3000538255 cycles:u
>     0       -1         2624324      1000769310      1000769310      3000538255 instructions:u
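
That is the raw material the cpi script works from: for the first interval
above it would compute cpi = VAL(cycles) / VAL(instructions) =
12302059 / 2565362, i.e. roughly 4.80 (this was a different run from the
piped example earlier, hence the different numbers).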
> 
> - To display different aggregation in report:
> 
>   $ perf stat -e cycles -a -I 1000 record sleep 3 
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles                   
>        2.000443651        609,975,307      cycles                   
>        3.000569616        668,479,597      cycles                   
>        3.000735323          1,155,816      cycles                 
> 
>   $ perf stat report
>   #           time             counts unit events
>        1.000223609        703,427,617      cycles                   
>        2.000443651        609,975,307      cycles                   
>        3.000569616        668,479,597      cycles                   
>        3.000735323          1,155,816      cycles                   
> 
>   $ perf stat report --per-core
>   #           time core         cpus             counts unit events
>        1.000223609 S0-C0           2        327,612,412      cycles                   
>        1.000223609 S0-C1           2        375,815,205      cycles                   
>        2.000443651 S0-C0           2        287,462,177      cycles                   
>        2.000443651 S0-C1           2        322,513,130      cycles                   
>        3.000569616 S0-C0           2        271,571,908      cycles                   
>        3.000569616 S0-C1           2        396,907,689      cycles                   
>        3.000735323 S0-C0           2            694,977      cycles                   
>        3.000735323 S0-C1           2            460,839      cycles                   
> 
>   $ perf stat report --per-socket
>   #           time socket cpus             counts unit events
>        1.000223609 S0        4        703,427,617      cycles                   
>        2.000443651 S0        4        609,975,307      cycles                   
>        3.000569616 S0        4        668,479,597      cycles                   
>        3.000735323 S0        4          1,155,816      cycles                   
> 
>   $ perf stat report -A
>   #           time CPU                counts unit events
>        1.000223609 CPU0           205,431,505      cycles                   
>        1.000223609 CPU1           122,180,907      cycles                   
>        1.000223609 CPU2           176,649,682      cycles                   
>        1.000223609 CPU3           199,165,523      cycles                   
>        2.000443651 CPU0           148,447,922      cycles                   
>        2.000443651 CPU1           139,014,255      cycles                   
>        2.000443651 CPU2           204,436,559      cycles                   
>        2.000443651 CPU3           118,076,571      cycles                   
>        3.000569616 CPU0           149,788,954      cycles                   
>        3.000569616 CPU1           121,782,954      cycles                   
>        3.000569616 CPU2           247,277,700      cycles                   
>        3.000569616 CPU3           149,629,989      cycles                   
>        3.000735323 CPU0               269,675      cycles                   
>        3.000735323 CPU1               425,302      cycles                   
>        3.000735323 CPU2               364,169      cycles                   
>        3.000735323 CPU3                96,670      cycles                   
> 
> 
> Cc: Andi Kleen <andi@...stfloor.org>
> Cc: Ulrich Drepper <drepper@...il.com>
> Cc: Will Deacon <will.deacon@....com>
> Cc: Stephane Eranian <eranian@...gle.com>
> Cc: Don Zickus <dzickus@...hat.com>
> Tested-by: Kan Liang <kan.liang@...el.com>
> ---
>  tools/perf/builtin-script.c           | 36 ++++++++++++++++++++++++++++++++++++
>  tools/perf/scripts/python/stat-cpi.py | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/cpumap.c              | 13 +++++++++++--
>  3 files changed, 124 insertions(+), 2 deletions(-)
