linux-kernel - [lkp] [perf powerpc] 18d1796d0b: [No primary change]

Message-ID: <20161025064013.GB2726@yexl-desktop>
Date:   Tue, 25 Oct 2016 14:40:13 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Michael Neuling <mikey@...ling.org>,
        Paul Mackerras <paulus@...ba.org>,
        Jiri Olsa <jolsa@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jan Stancek <jstancek@...hat.com>, lkp@...org
Subject: [lkp] [perf powerpc]  18d1796d0b: [No primary change]

[will-it-scale] perf-stat.branch-miss-rate +7.4% regression 
Reply-To: kernel test robot <xiaolong.ye@...el.com>
User-Agent: Heirloom mailx 12.5 6/20/10


FYI, we noticed a +7.4% regression of perf-stat.branch-miss-rate due to commit:

commit 18d1796d0b45762ec6f58c5ed2ad3f7510ffbaa9 ("perf powerpc: Don't call perf_event_disable from atomic context")
https://github.com/0day-ci/linux Jiri-Olsa/perf-powerpc-Don-t-call-perf_event_disable-from-atomic-context/20161006-203500

in testcase: will-it-scale
on test machine: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
with the following parameters:

	test: poll2
	cpufreq_governor: performance

Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test in order to expose any differences between the two.


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-6/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/poll2/will-it-scale

commit: 
  41aad2a6d4 (" perf/core improvements and fixes:")
  18d1796d0b ("perf powerpc: Don't call perf_event_disable from atomic context")

41aad2a6d4fcdda8 18d1796d0b45762ec6f58c5ed2 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
         %stddev     %change         %stddev
             \          |                \  
      0.19 ±  0%      +7.4%       0.21 ±  0%  perf-stat.branch-miss-rate%
 9.591e+09 ±  1%      +9.1%  1.047e+10 ±  0%  perf-stat.branch-misses
 1.962e+09 ±  0%      +2.3%  2.008e+09 ±  1%  perf-stat.cache-references
     51.18 ±  2%      +5.6%      54.06 ±  1%  perf-stat.iTLB-load-miss-rate%
  46430577 ±  5%      -6.9%   43241506 ±  2%  perf-stat.iTLB-loads
      9.90 ±  4%      +9.3%      10.82 ±  4%  turbostat.Pkg%pc2
     62066 ± 24%     +34.7%      83582 ± 11%  numa-meminfo.node1.Active
     49531 ± 30%     +42.9%      70778 ± 13%  numa-meminfo.node1.Active(anon)
     27883 ±100%    -100.0%       0.00 ± -1%  latency_stats.avg.proc_cgroup_show.proc_single_show.seq_read.__vfs_read.vfs_read.SyS_read.do_syscall_64.return_from_SYSCALL_64
     27883 ±100%    -100.0%       0.00 ± -1%  latency_stats.max.proc_cgroup_show.proc_single_show.seq_read.__vfs_read.vfs_read.SyS_read.do_syscall_64.return_from_SYSCALL_64
     32685 ± 38%     +88.5%      61603 ±147%  latency_stats.sum.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
     27883 ±100%    -100.0%       0.00 ± -1%  latency_stats.sum.proc_cgroup_show.proc_single_show.seq_read.__vfs_read.vfs_read.SyS_read.do_syscall_64.return_from_SYSCALL_64
     92795 ±  4%      -8.6%      84853 ±  6%  numa-vmstat.node0.numa_hit
     92782 ±  4%      -8.5%      84851 ±  6%  numa-vmstat.node0.numa_local
     12381 ± 30%     +42.9%      17694 ± 13%  numa-vmstat.node1.nr_active_anon
     12381 ± 30%     +42.9%      17694 ± 13%  numa-vmstat.node1.nr_zone_active_anon
     21.80 ± 59%     -69.8%       6.58 ± 83%  sched_debug.cpu.clock.stddev
     21.80 ± 59%     -69.8%       6.58 ± 83%  sched_debug.cpu.clock_task.stddev
      0.00 ± 23%     -34.3%       0.00 ± 20%  sched_debug.cpu.next_balance.stddev
     35829 ±  9%     -18.4%      29221 ±  6%  sched_debug.cpu.nr_switches.max
      8361 ±  6%     -13.4%       7243 ±  7%  sched_debug.cpu.nr_switches.stddev
      8.43 ± 11%     -25.2%       6.30 ± 12%  sched_debug.cpu.nr_uninterruptible.stddev
     18057 ±  6%     -14.3%      15482 ±  8%  sched_debug.cpu.sched_count.stddev
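
For readers who want to check the %change column, here is a minimal sketch in Python. It parses one row of the comparison table above and recomputes the relative change as (new - base) / base * 100. The regular expression and the helper names parse_row and pct_change are illustrative assumptions, not part of lkp-tests; the row layout is inferred only from the table in this report.

```python
import re

# Assumed row layout, inferred from the table in this report:
#   <base> ± <stddev>%   <change>%   <new> ± <stddev>%   <metric>
ROW_RE = re.compile(
    r"^\s*(?P<base>\S+)\s+±\s*(?P<base_sd>-?\d+)%"
    r"\s+(?P<change>[+-][\d.]+)%"
    r"\s+(?P<new>\S+)\s+±\s*(?P<new_sd>-?\d+)%"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Split one comparison row into its numeric fields (hypothetical helper)."""
    m = ROW_RE.match(line)
    if m is None:
        raise ValueError("not a comparison row: %r" % line)
    return {
        "metric": m.group("metric"),
        "base": float(m.group("base")),
        "new": float(m.group("new")),
        "reported_change": float(m.group("change")),
    }


def pct_change(base, new):
    """Relative change of new against base, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    line = "  46430577 ±  5%      -6.9%   43241506 ±  2%  perf-stat.iTLB-loads"
    row = parse_row(line)
    recomputed = pct_change(row["base"], row["new"])
    print("%s: reported %+.1f%%, recomputed %+.1f%%"
          % (row["metric"], row["reported_change"], recomputed))
```

Rows printed with full integer values, such as perf-stat.iTLB-loads, reproduce the reported change. Rows printed with heavily rounded values do not: 0.19 and 0.21 give about +10.5% rather than the reported +7.4%, presumably because the report computes the change from unrounded means.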



                             perf-stat.branch-miss-rate%

   0.22 ++------------------------------------------------------------------+
        |            O                                                      |
  0.215 ++      O OO                                                        |
        O  O O                                                              |
        | O   O                                                             |
   0.21 ++                                                                  |
        |               O      O                                            |
  0.205 ++             O  OO O  O O                                         |
        |                                                                   |
    0.2 ++                                                                  |
        |                                                                   |
        |                                                                   |
  0.195 ++         *.*. *. *.*.*   .* .**.   *.**.   *.*.*                  |
        *.**.**.*.*    *  *     *.*  *    *.*     *.*     *.**.*. *.*.**. *.*
   0.19 ++-------------------------------------------------------*-------*--+

	[*] bisect-good sample
	[O] bisect-bad  sample




Thanks,
Xiaolong

View attachment "config-4.8.0-rc7-00165-g18d1796" of type "text/plain" (152565 bytes)

View attachment "job-script" of type "text/plain" (6572 bytes)

View attachment "job.yaml" of type "text/plain" (4168 bytes)

View attachment "reproduce" of type "text/plain" (143 bytes)