Message-Id: <20171117214300.32746-1-andi@firstfloor.org>
Date:   Fri, 17 Nov 2017 13:42:57 -0800
From:   Andi Kleen <andi@...stfloor.org>
To:     acme@...nel.org
Cc:     jolsa@...nel.org, linux-kernel@...r.kernel.org
Subject: Add fine grained sampled metrics for perf script

This patch kit adds perf script support for computing metrics for
sampled groups. This allows much finer grained metric measurement
than perf stat, because the metrics are computed at PMI
granularity instead of on a slow timer.

Also, the kernel does the sampling in this case, which has
much less overhead than perf stat regularly querying the
counters.

This allows things like fine grained IPC or TopDown tracking.

Note that the metric is still averaged over the sampling period;
it does not describe only the sampling point.
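
To illustrate the averaging, here is a minimal Python sketch (not the
perf implementation). It assumes each sample carries cumulative
instruction and cycle counts, so the IPC attributed to a sample is the
ratio of the deltas since the previous sample. The function name and the
counter values are hypothetical.

```python
def interval_ipc(readings):
    """Return IPC for each interval between consecutive samples.

    readings: list of (instructions, cycles) cumulative counts, one
    entry per sample. Hypothetical input format, for illustration only.
    """
    result = []
    for (insn0, cyc0), (insn1, cyc1) in zip(readings, readings[1:]):
        delta_cycles = cyc1 - cyc0
        # Guard against an interval in which no cycles were counted.
        result.append((insn1 - insn0) / delta_cycles if delta_cycles else 0.0)
    return result


# Hypothetical cumulative counts at three consecutive samples.
readings = [(0, 0), (130_000, 1_000_000), (590_000, 2_000_000)]
for value in interval_ipc(readings):
    print(f"{value:.2f} insn per cycle")
```

Each printed value describes the whole interval leading up to a sample,
not the instant at which the sample was taken.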

For example, to sample IPC:

$ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
$ perf script -F metric,ip,sym,time,cpu,comm
...
 alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
 alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
 alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
 alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle
         swapper [000] 42815.857961:  ffffffff81655df0 __schedule
         swapper [000] 42815.857961:  ffffffff81655df0 __schedule
         swapper [000] 42815.857961:  ffffffff81655df0 __schedule
         swapper [000] 42815.857961:    metric:    0.23  insn per cycle
 qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
 qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
 qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
 qemu-system-x86 [000] 42815.858130:    metric:    0.46  insn per cycle
           :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
           :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
           :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
           :4972 [000] 42815.858312:    metric:    0.45  insn per cycle
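
For post-processing, the metric lines can be picked out of output like
the above with a small script. This is a sketch that assumes the exact
field layout shown in this example (comm, cpu, time, then "metric:");
other -F selections or perf versions may print a different layout. The
regular expression and function name are illustrative and not part of
perf.

```python
import re

# Matches lines such as:
#  alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle
METRIC_RE = re.compile(
    r"^\s*(?P<comm>\S.*?)\s+\[(?P<cpu>\d+)\]\s+(?P<time>\d+\.\d+):\s+"
    r"metric:\s+(?P<value>\d+(?:\.\d+)?)\s+(?P<name>.+?)\s*$"
)


def parse_metric_line(line):
    """Return a dict for a metric line, or None for any other line."""
    match = METRIC_RE.match(line)
    if match is None:
        return None
    return {
        "comm": match.group("comm"),
        "cpu": int(match.group("cpu")),
        "time": float(match.group("time")),
        "value": float(match.group("value")),
        "name": match.group("name"),
    }


lines = [
    " alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]",
    " alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle",
    "           :4972 [000] 42815.858312:    metric:    0.45  insn per cycle",
]
for line in lines:
    parsed = parse_metric_line(line)
    if parsed is not None:
        print(parsed["comm"], parsed["cpu"], parsed["value"], parsed["name"])
```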

TopDown:

Note that TopDown requires disabling SMT if it is enabled (e.g. by offlining
the extra CPUs), because with SMT the events would have to be sampled per core,
which is not supported.

$ perf record -e '{ref-cycles,topdown-fetch-bubbles,topdown-recovery-bubbles,\
topdown-slots-retired,topdown-total-slots,topdown-slots-issued}:S' -a sleep 1
$ perf script --header -I -F cpu,ip,sym,event,metric,period
...
[000]     121108               ref-cycles:  ffffffff8165222e copy_user_enhanced_fast_string
[000]     190350    topdown-fetch-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
[000]       2055 topdown-recovery-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
[000]     148729    topdown-slots-retired:  ffffffff8165222e copy_user_enhanced_fast_string
[000]     144324      topdown-total-slots:  ffffffff8165222e copy_user_enhanced_fast_string
[000]     160852     topdown-slots-issued:  ffffffff8165222e copy_user_enhanced_fast_string
[000]   metric:     33.0% frontend bound
[000]   metric:      3.5% bad speculation
[000]   metric:     25.8% retiring
[000]   metric:     37.7% backend bound
[000]     112112               ref-cycles:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]     357222    topdown-fetch-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]       3325 topdown-recovery-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]     323553    topdown-slots-retired:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]     270507      topdown-total-slots:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]     341226     topdown-slots-issued:  ffffffff8165aec8 _raw_spin_lock_irqsave
[000]   metric:     33.0% frontend bound
[000]   metric:      2.9% bad speculation
[000]   metric:     29.9% retiring
[000]   metric:     34.2% backend bound
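
The percentages above can be reproduced from the raw counts with the
usual TopDown level 1 formulas. The following Python sketch is
illustrative and is not the perf code. It assumes that
topdown-total-slots and topdown-recovery-bubbles carry a scale factor of
4 that is not applied to the raw values printed above; with that
assumption the results match the output shown. The function and
parameter names are hypothetical.

```python
def topdown_level1(fetch_bubbles, recovery_bubbles, slots_retired,
                   total_slots, slots_issued, scale=4):
    """Compute TopDown level 1 fractions from raw per-sample counts.

    scale is an assumed multiplier for total_slots and recovery_bubbles.
    """
    total = total_slots * scale
    recovery = recovery_bubbles * scale
    frontend = fetch_bubbles / total
    bad_spec = (slots_issued - slots_retired + recovery) / total
    retiring = slots_retired / total
    # Whatever is not accounted for above is attributed to the backend.
    backend = 1.0 - frontend - bad_spec - retiring
    return {
        "frontend bound": frontend,
        "bad speculation": bad_spec,
        "retiring": retiring,
        "backend bound": backend,
    }


# Raw counts from the first sample group in the output above.
metrics = topdown_level1(
    fetch_bubbles=190350,
    recovery_bubbles=2055,
    slots_retired=148729,
    total_slots=144324,
    slots_issued=160852,
)
for name, fraction in metrics.items():
    print(f"{100 * fraction:5.1f}% {name}")
```

For the first sample group this yields 33.0% frontend bound, 3.5% bad
speculation, 25.8% retiring and 37.7% backend bound.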


Git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/script-metric-3


v1: Initial post
v2: 
Remove already merged patches.
Use evsel->priv for new fields
Port to new baseline, support fp output.
Handle stats in ->stats, not ->priv
Minor cleanups
v3:
Enable EVENT_UPDATE in perf record, and record unit/scale/cpu map/thread map
Drop the previous zero cpu map hack.
