Date:   Thu, 29 Mar 2018 15:11:08 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        linux-acpi@...r.kernel.org, devel@...ica.org,
        linux-pm@...r.kernel.org, lkp@...org
Subject: [lkp-robot] [cpuidle]  a97056a6fa:  aim9.exec_test.ops_per_sec
 -11.9% regression


Greetings,

FYI, we noticed a -11.9% regression of aim9.exec_test.ops_per_sec due to commit:


commit: a97056a6fab541e1661fed9ced0f793bda34b717 ("cpuidle: poll_state: Add time limit to poll_idle()")
https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git poll-idle

in testcase: aim9
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with the following parameters:

	testtime: 5s
	test: all
	cpufreq_governor: performance

test-description: Suite IX is the "AIM Independent Resource Benchmark", the famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsw-ep4/all/aim9/5s

commit: 
  v4.16-rc5
  a97056a6fa ("cpuidle: poll_state: Add time limit to poll_idle()")

       v4.16-rc5 a97056a6fab541e1661fed9ced 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2010           -11.9%       1770        aim9.exec_test.ops_per_sec
    619320 ±  2%      +2.6%     635500        aim9.creat-clo.ops_per_sec
   1335376            +1.3%    1353216        aim9.disk_wrt.ops_per_sec
      4976            -6.9%       4633        aim9.fork_test.ops_per_sec
 4.555e+08            +1.4%   4.62e+08        aim9.fun_cal1.ops_per_sec
 2.498e+08 ±  5%      +6.8%  2.668e+08        aim9.fun_cal15.ops_per_sec
  48323640 ±  2%      +2.0%   49295400        aim9.jmp_test.ops_per_sec
    413494            +2.7%     424689        aim9.link_test.ops_per_sec
    509182            +1.2%     515440        aim9.page_test.ops_per_sec
    396.68           -12.2%     348.15 ±  9%  aim9.shell_rtns_1.ops_per_sec
    701520 ±  2%      +2.6%     720100        aim9.signal_test.ops_per_sec
    619952            +1.6%     630083        aim9.sync_disk_rw.ops_per_sec
      9496 ±  9%     -29.1%       6730 ± 13%  aim9.time.involuntary_context_switches
   6666695            -5.1%    6328538 ±  3%  aim9.time.minor_page_faults
    117287            -8.4%     107404 ±  3%  aim9.time.voluntary_context_switches
      5573 ± 62%    +108.4%      11614 ± 34%  numa-numastat.node0.other_node
      2172            -5.4%       2054        vmstat.system.cs
  49345778 ±  4%    +307.8%  2.012e+08 ± 74%  cpuidle.C3.time
 4.561e+08 ±  9%    -100.0%     123156 ± 62%  cpuidle.POLL.time
    106130 ±183%     -97.4%       2761 ±124%  numa-meminfo.node1.Inactive
    106067 ±183%     -97.5%       2612 ±131%  numa-meminfo.node1.Inactive(anon)
    132349 ±175%     -95.0%       6681 ± 44%  numa-meminfo.node1.Shmem
      2573 ± 17%     +25.6%       3232 ± 13%  numa-vmstat.node0.nr_mapped
      5961 ± 52%     +98.4%      11828 ± 33%  numa-vmstat.node0.numa_other
     26515 ±183%     -97.5%     651.75 ±132%  numa-vmstat.node1.nr_inactive_anon
     33087 ±175%     -95.0%       1665 ± 44%  numa-vmstat.node1.nr_shmem
     26515 ±183%     -97.5%     651.75 ±132%  numa-vmstat.node1.nr_zone_inactive_anon
      1349 ± 62%    +119.0%       2955 ± 11%  slabinfo.dmaengine-unmap-16.active_objs
      1368 ± 62%    +116.1%       2955 ± 11%  slabinfo.dmaengine-unmap-16.num_objs
    731.60 ±  8%     +21.1%     886.25 ±  4%  slabinfo.ip6_dst_cache.active_objs
    731.60 ±  8%     +21.1%     886.25 ±  4%  slabinfo.ip6_dst_cache.num_objs
      3184 ±  6%     -13.4%       2758 ±  3%  slabinfo.mm_struct.active_objs
      3184 ±  6%     -13.0%       2771 ±  3%  slabinfo.mm_struct.num_objs
    166.80 ±  3%     -36.2%     106.50        turbostat.Avg_MHz
      5.94 ±  2%      -1.1        4.84 ±  5%  turbostat.Busy%
      2812           -21.2%       2216 ±  4%  turbostat.Bzy_MHz
     27.11           -10.2%      24.34        turbostat.CPU%c1
     19.80 ±  9%     +81.7%      35.98        turbostat.Pkg%pc2
      0.04 ± 90%    +250.0%       0.14 ± 29%  turbostat.Pkg%pc6
    117.44 ±  3%     -15.9%      98.79        turbostat.PkgWatt
      8.64 ±  5%     -11.1%       7.68 ±  3%  turbostat.RAMWatt
 2.794e+11 ± 10%     -23.1%   2.15e+11 ±  4%  perf-stat.branch-instructions
      1.46 ±  5%      +0.4        1.87 ±  6%  perf-stat.branch-miss-rate%
      0.93 ±  5%      -0.1        0.84 ±  2%  perf-stat.cache-miss-rate%
  90185384 ±  2%      -6.9%   83996045 ±  2%  perf-stat.cache-misses
    653470            -5.7%     616530        perf-stat.context-switches
      2.28 ± 10%     -18.9%       1.85 ±  4%  perf-stat.cpi
 3.505e+12 ±  4%     -35.2%  2.273e+12 ±  4%  perf-stat.cpu-cycles
     31896            -6.3%      29892 ±  2%  perf-stat.cpu-migrations
      0.12 ± 18%      +0.0        0.17 ± 17%  perf-stat.dTLB-load-miss-rate%
 4.953e+11 ± 12%     -29.4%  3.497e+11 ± 10%  perf-stat.dTLB-loads
      0.44 ±  9%     +22.3%       0.54 ±  4%  perf-stat.ipc
   7355724            -4.6%    7019933 ±  3%  perf-stat.minor-faults
   7355724            -4.6%    7019932 ±  3%  perf-stat.page-faults
      2713 ±  6%     +20.4%       3268 ±  5%  sched_debug.cfs_rq:/.exec_clock.avg
     73594 ±  6%     +26.1%      92766 ± 10%  sched_debug.cfs_rq:/.exec_clock.max
     12795 ±  5%     +21.7%      15577 ±  6%  sched_debug.cfs_rq:/.exec_clock.stddev
     18587 ± 10%     +16.2%      21602 ±  7%  sched_debug.cfs_rq:/.min_vruntime.avg
    177486 ±  7%     +21.7%     215979 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
     28721 ±  5%     +18.6%      34065 ±  4%  sched_debug.cfs_rq:/.min_vruntime.stddev
      1.97 ± 33%     -57.6%       0.83        sched_debug.cfs_rq:/.nr_spread_over.max
      0.25 ± 62%    -100.0%       0.00        sched_debug.cfs_rq:/.nr_spread_over.stddev
      4.66 ± 35%     -65.4%       1.61 ± 87%  sched_debug.cfs_rq:/.removed.util_avg.avg
     25.10 ± 25%     -57.0%      10.79 ± 83%  sched_debug.cfs_rq:/.removed.util_avg.stddev
    166666 ±  8%     +22.4%     203936 ±  4%  sched_debug.cfs_rq:/.spread0.max
     28721 ±  5%     +18.6%      34066 ±  4%  sched_debug.cfs_rq:/.spread0.stddev
    649.62 ±  6%      -9.4%     588.34 ±  4%  sched_debug.cpu.curr->pid.avg
      4637 ±  3%      -9.0%       4220 ±  2%  sched_debug.cpu.curr->pid.stddev
      1954 ±  8%     +13.5%       2218 ± 10%  sched_debug.cpu.sched_goidle.stddev
    518.80 ± 24%     -35.4%     335.12 ± 36%  sched_debug.cpu.ttwu_count.min
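
As a reading aid for the table above, here is a minimal sketch of how the
%change column can be re-derived from the two mean columns. It assumes the
column is the plain relative change of the patched mean against the base mean;
this is inferred from the numbers and not taken from the lkp-tests source. Rows
whose metric is itself a percentage (for example turbostat.Busy%) appear to
show a difference in percentage points instead. The sketch also cross-checks
two derived metrics, assuming Avg_MHz is roughly Busy% times Bzy_MHz and ipc is
the reciprocal of cpi. The names pct_change, avg_mhz and SAMPLES are
hypothetical and exist only for illustration.

```python
# Means copied from the comparison table: (v4.16-rc5, a97056a6fa).
SAMPLES = {
    "aim9.exec_test.ops_per_sec": (2010, 1770),
    "aim9.fork_test.ops_per_sec": (4976, 4633),
    "aim9.shell_rtns_1.ops_per_sec": (396.68, 348.15),
    "turbostat.PkgWatt": (117.44, 98.79),
}


def pct_change(base, patched):
    """Relative change of the patched mean against the base mean, in percent."""
    return (patched - base) / base * 100.0


def avg_mhz(busy_pct, bzy_mhz):
    """Assumed relation: average MHz = busy fraction * MHz while busy."""
    return busy_pct / 100.0 * bzy_mhz


if __name__ == "__main__":
    for name, (base, patched) in SAMPLES.items():
        print(f"{pct_change(base, patched):+7.1f}%  {name}")

    # Cross-check turbostat.Avg_MHz (reported 166.80 and 106.50).
    print(f"Avg_MHz base    ~ {avg_mhz(5.94, 2812):.2f}")
    print(f"Avg_MHz patched ~ {avg_mhz(4.84, 2216):.2f}")

    # Cross-check perf-stat.ipc (reported 0.44 and 0.54) from perf-stat.cpi.
    print(f"ipc base    ~ {1 / 2.28:.2f}")
    print(f"ipc patched ~ {1 / 1.85:.2f}")
```

The recomputed values agree with the reported ones to within rounding, which
is what one would expect if the table prints rounded means.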


                                                                                
                            aim9.exec_test.ops_per_sec                          
                                                                                
  2500 +-+------------------------------------------------------------------+   
       |                                                                    |   
       |                                                                    |   
  2000 +-++..+..+...+..+..+..+..+..+..+...+..+..+..+..+..+..+..+...+..+..+..|   
       O  O  O  O      O  O  O  O  O  O   O  O  O  O  O  O  O  O   O  O  O  O   
       |                                                                    |   
  1500 +-+                                                                  |   
       |                                                                    |   
  1000 +-+                                                                  |   
       |                                                                    |   
       |                                                                    |   
   500 +-+                                                                  |   
       |                                                                    |   
       |                                                                    |   
     0 +-+----------O-------------------------------------------------------+   
                                                                                
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.16.0-rc5-00001-ga97056a6" of type "text/plain" (165932 bytes)

View attachment "job-script" of type "text/plain" (6671 bytes)

View attachment "job.yaml" of type "text/plain" (4361 bytes)

View attachment "reproduce" of type "text/plain" (254 bytes)
