lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 30 Nov 2018 11:21:54 +0100
From:   Jiri Olsa <jolsa@...hat.com>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
        Daniel Thompson <daniel.thompson@...aro.org>
Subject: Re: [PATCH] perf evsel: Correct clock unit to nanosecond

On Fri, Nov 30, 2018 at 06:06:05PM +0800, Leo Yan wrote:
> Since commit 0aa802a79469 ("perf stat: Get rid of extra clock display
> function"), the cpu and task clock unit has been changed from
> nanosecond value to millisecond value.  This introduces confusion for
> CPU run time statistics, we can see in below flow the clock value is
> scaled from nanosecond value to millisecond value; but this is
> contradiction with statistics type 'STAT_NSECS', which always takes
> clock as nanosecond unit.
> 
>   perf_stat__update_shadow_stats()
>     `-> count *= counter->scale;
>         update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
> 
> And when 'perf stat' calculates frequencies for cpu cycles and other
> events, it uses event count to divide clock and it expects divisor unit
> has nanosecond value, but actually the clock value has been scaled to
> millisecond value.  At the end we can exaggerate values in below output
> result (e.g. 500M/sec for context-switches, cycle is 1079636.500 GHz).
> 
>   # perf stat -- sleep 1
> 
>    Performance counter stats for 'sleep 1':
> 
>                 2.29 msec task-clock                #    0.002 CPUs utilized
>                    1      context-switches          #  500.000 M/sec
>                    1      cpu-migrations            #  500.000 M/sec
>                   53      page-faults               # 26500.000 M/sec
>            2,159,273      cycles                    # 1079636.500 GHz
>            1,281,566      instructions              #    0.59  insn per cycle
>              171,153      branches                  # 85576500.000 M/sec
>               17,870      branch-misses             #   10.44% of all branches
> 
> To fix this issue, this patch changes back clock unit to nanosecond
> level, as result we can see the correct frequency calculation:
> 
>   # perf stat -- sleep 1
> 
>    Performance counter stats for 'sleep 1':
> 
>            2,461,180 nsec task-clock                #    0.002 CPUs utilized
>                    1      context-switches          #    0.406 K/sec
>                    0      cpu-migrations            #    0.000 K/sec
>                   54      page-faults               #    0.022 M/sec
>            2,320,634      cycles                    #    0.943 GHz
>            1,284,442      instructions              #    0.55  insn per cycle
>              171,425      branches                  #   69.652 M/sec
>               17,970      branch-misses             #   10.48% of all branches
> 
> This patch has side effect for changing unit field for perf command.
> This is tested with below two cases: one is for normal output format and
> another is for cvs output format.
> 
> Before:
> 
>   # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
> 
>    Performance counter stats for 'CPU(s) 0':
> 
>             3,003.50 msec cpu-clock                 #    1.000 CPUs utilized
>             3,003.50 msec task-clock                #    1.000 CPUs utilized
> 
>          3.003697880 seconds time elapsed
> 
> Now:
> 
>   # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
> 
>    Performance counter stats for 'CPU(s) 0':
> 
>        3,003,684,880 nsec cpu-clock                 #    1.000 CPUs utilized
>        3,003,682,780 nsec task-clock                #    1.000 CPUs utilized
> 
>          3.003881982 seconds time elapsed
> 
> Before:
> 
>   # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
>   3003.64,msec,cpu-clock,3003644320,100.00,1.000,CPUs utilized
>   3003.64,msec,task-clock,3003643300,100.00,1.000,CPUs utilized
> 
> Now:
> 
>   # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
>   3003501460,nsec,cpu-clock,3003501660,100.00,1.000,CPUs utilized
>   3003499500,nsec,task-clock,3003499500,100.00,1.000,CPUs utilized
> 
> Cc: Daniel Thompson <daniel.thompson@...aro.org>
> Signed-off-by: Leo Yan <leo.yan@...aro.org>

there was fix for this recently, could you please check
if it's working for you:
  6e269c85dcea perf stat: Fix shadow stats for clock events

thanks,
jirka

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ