Date:   Sun, 20 May 2018 10:28:58 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     linux-kernel@...r.kernel.org, lkp@...org
Subject: [lkp-robot] [sched/numa] 789ba28013: pxz.throughput -5.8% regression


Greetings,

FYI, we noticed a -5.8% regression of pxz.throughput due to commit:


commit: 789ba28013ce23dbf5e9f5f014f4233b35523bf3 ("Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
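The reverted patch is not quoted here, but the mechanism named in its title can be paraphrased. Below is a minimal, self-contained C sketch of the idea (illustrative only; the struct, field names, and the delay constant are assumptions for this example, not the actual kernel code): when an affine wakeup pulls a task to a node other than its preferred one, the next automatic-NUMA-balancing placement retry is pushed out rather than attempted immediately.

/* Illustrative userspace paraphrase -- NOT the kernel diff. */
#include <stdio.h>
#include <stdbool.h>

struct task {
	unsigned long numa_migrate_retry;   /* earliest "time" of next placement retry */
	int numa_preferred_nid;             /* node NUMA balancing prefers for this task */
};

static unsigned long jiffies = 1000;    /* stand-in for the kernel clock */
#define WAKE_AFFINE_NUMA_DELAY 100      /* made-up delay for illustration */

/* Called from the wake-affine path once a target CPU/node is chosen. */
static void delay_numa_placement(struct task *p, int target_nid)
{
	/* The wakeup moved the task off its preferred node: back off the
	 * NUMA balancer instead of letting it immediately fight the
	 * wakeup decision by migrating the task or its memory back. */
	if (target_nid != p->numa_preferred_nid)
		p->numa_migrate_retry = jiffies + WAKE_AFFINE_NUMA_DELAY;
}

/* The NUMA placement path checks the timestamp before retrying. */
static bool numa_retry_allowed(const struct task *p)
{
	return jiffies >= p->numa_migrate_retry;
}

int main(void)
{
	struct task t = { .numa_migrate_retry = 0, .numa_preferred_nid = 0 };

	delay_numa_placement(&t, 1);                          /* affine wakeup landed on node 1 */
	printf("retry now:   %d\n", numa_retry_allowed(&t));  /* 0: deferred */

	jiffies += WAKE_AFFINE_NUMA_DELAY;
	printf("retry later: %d\n", numa_retry_allowed(&t));  /* 1: allowed again */
	return 0;
}

Reverting such a back-off means placement retries resume immediately after affine wakeups, which would be consistent with the jump in NUMA activity this report measures below.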

in testcase: pxz
on test machine: 88-thread Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G of memory
with the following parameters:

	nr_threads: 25%
	cpufreq_governor: performance
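For concreteness, on this box nr_threads: 25% resolves to

	0.25 \times 88 = 22

worker threads (assuming the LKP harness rounds to a whole number), and the performance cpufreq governor keeps the CPUs at their maximum frequency, so throughput differences reflect scheduling and placement rather than frequency scaling.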



Details are as follows:
-------------------------------------------------------------------------------------------------->

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
  gcc-7/performance/x86_64-rhel-7.2/25%/debian-x86_64-2016-08-31.cgz/lkp-bdw-ep3/pxz

commit: 
  94d7dbf108 (" - A stable fix for DM integrity to use kvfree.")
  789ba28013 ("Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"")

94d7dbf108813ea4 789ba28013ce23dbf5e9f5f014 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.113e+08            -5.8%  1.048e+08        pxz.throughput
      8327 ±  3%      +9.2%       9094 ±  2%  pxz.time.involuntary_context_switches
   1872519 ±  3%     +25.8%    2356056 ±  3%  pxz.time.minor_page_faults
      1958            -3.9%       1881        pxz.time.percent_of_cpu_this_job_got
     69.87            -1.9%      68.55        pxz.time.system_time
      5976            -3.3%       5779        pxz.time.user_time
   2800729 ±  3%     +21.9%    3412916 ±  4%  interrupts.CAL:Function_call_interrupts
     98846            +1.4%     100214        vmstat.system.in
    493.25 ±  6%     +21.4%     599.00 ±  5%  slabinfo.skbuff_fclone_cache.active_objs
    493.25 ±  6%     +21.4%     599.00 ±  5%  slabinfo.skbuff_fclone_cache.num_objs
     28978 ± 16%     +34.5%      38972 ±  4%  numa-meminfo.node0.SReclaimable
     90527 ± 10%     +18.7%     107450 ±  2%  numa-meminfo.node0.Slab
     13699 ± 14%     -25.7%      10176        numa-meminfo.node1.Mapped
     36266 ± 13%     -27.3%      26383 ±  6%  numa-meminfo.node1.SReclaimable
     26839 ± 22%     -47.2%      14178 ± 39%  numa-meminfo.node1.Shmem
      2902 ± 19%     +31.9%       3828 ±  2%  numa-vmstat.node0.nr_mapped
      7244 ± 16%     +34.5%       9743 ±  4%  numa-vmstat.node0.nr_slab_reclaimable
      3497 ± 15%     -26.1%       2586 ±  2%  numa-vmstat.node1.nr_mapped
      6727 ± 22%     -47.1%       3560 ± 39%  numa-vmstat.node1.nr_shmem
      9067 ± 13%     -27.3%       6596 ±  6%  numa-vmstat.node1.nr_slab_reclaimable
   1186310 ±  5%     +34.2%    1591738 ±  4%  proc-vmstat.numa_hint_faults
   1103415 ±  6%     +27.5%    1406754 ±  4%  proc-vmstat.numa_hint_faults_local
     40480 ±  3%     +29.4%      52378 ±  3%  proc-vmstat.numa_huge_pte_updates
   1052001 ±  8%     +41.4%    1487662 ±  4%  proc-vmstat.numa_pages_migrated
  21884436 ±  4%     +29.7%   28383097 ±  3%  proc-vmstat.numa_pte_updates
   2667563 ±  2%     +18.4%    3158089 ±  2%  proc-vmstat.pgfault
   1052001 ±  8%     +41.4%    1487662 ±  4%  proc-vmstat.pgmigrate_success
      1278 ±  2%     -19.8%       1025 ±  2%  proc-vmstat.thp_split_pmd
      3.38 ±  7%      -2.0        1.38 ±101%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
      3.43 ±  7%      -1.9        1.56 ± 84%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
      3.44 ±  7%      -1.7        1.72 ± 68%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
      3.44 ±  7%      -1.7        1.72 ± 68%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
      3.44 ±  7%      -1.7        1.72 ± 68%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
      3.44 ±  7%      -1.7        1.72 ± 68%  perf-profile.children.cycles-pp.start_kernel
      0.08 ± 23%      +0.0        0.11 ±  7%  perf-profile.children.cycles-pp.anon_pipe_buf_release
      0.35 ±  4%      +0.0        0.39 ±  8%  perf-profile.children.cycles-pp.ktime_get
      0.12 ±  5%      +0.0        0.14 ±  9%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.15 ±  8%      +0.0        0.18 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock
      0.08 ± 23%      +0.0        0.11 ±  7%  perf-profile.self.cycles-pp.anon_pipe_buf_release
     17.47 ±  6%     -12.1%      15.36 ±  8%  sched_debug.cfs_rq:/.runnable_load_avg.avg
      9.62 ±  2%     -15.4%       8.13 ±  6%  sched_debug.cpu.cpu_load[0].avg
      9.47 ±  5%     -12.6%       8.28 ±  3%  sched_debug.cpu.cpu_load[1].avg
     11487 ±  8%     -15.3%       9724 ±  4%  sched_debug.cpu.load.avg
    386.67 ± 22%     +43.7%     555.50 ± 15%  sched_debug.cpu.nr_switches.min
      5325 ±  5%     -16.6%       4441 ±  4%  sched_debug.cpu.nr_switches.stddev
      0.00 ± 57%    +100.0%       0.01 ± 11%  sched_debug.cpu.nr_uninterruptible.avg
    119.04 ± 44%     +71.9%     204.67 ± 24%  sched_debug.cpu.sched_count.min
     12503 ± 16%     -22.4%       9698 ± 13%  sched_debug.cpu.sched_goidle.max
     55.83 ± 44%     +78.0%      99.38 ± 23%  sched_debug.cpu.sched_goidle.min
      2214 ±  4%     -23.0%       1705 ±  7%  sched_debug.cpu.sched_goidle.stddev
     13757 ± 15%     -18.9%      11158 ± 12%  sched_debug.cpu.ttwu_count.max
     72.42 ± 16%     +59.4%     115.46 ± 23%  sched_debug.cpu.ttwu_count.min
      2213 ±  6%     -18.6%       1802 ±  7%  sched_debug.cpu.ttwu_count.stddev
     41.67 ± 10%     +36.6%      56.92 ± 14%  sched_debug.cpu.ttwu_local.min
 1.829e+12            -5.1%  1.736e+12        perf-stat.branch-instructions
 4.612e+10            -4.7%  4.396e+10        perf-stat.branch-misses
     43.24            -0.7       42.51        perf-stat.cache-miss-rate%
 9.154e+10            -6.2%  8.584e+10        perf-stat.cache-misses
 2.117e+11            -4.6%  2.019e+11        perf-stat.cache-references
      1.52            +2.5%       1.56        perf-stat.cpi
 1.774e+13            -2.8%  1.725e+13        perf-stat.cpu-cycles
    863.00 ± 15%    +118.4%       1884 ±  2%  perf-stat.cpu-migrations
      0.02            +0.0        0.02 ±  2%  perf-stat.dTLB-load-miss-rate%
 3.512e+12            -4.5%  3.356e+12        perf-stat.dTLB-loads
      0.02            +0.0        0.02 ±  2%  perf-stat.dTLB-store-miss-rate%
  2.11e+08           +12.4%  2.373e+08 ±  2%  perf-stat.dTLB-store-misses
 1.393e+12            -5.5%  1.316e+12        perf-stat.dTLB-stores
 1.164e+13            -5.1%  1.104e+13        perf-stat.instructions
      0.66            -2.5%       0.64        perf-stat.ipc
   2642298 ±  2%     +18.6%    3133154 ±  2%  perf-stat.minor-faults
      2.09 ± 29%      +4.5        6.55 ±  6%  perf-stat.node-load-miss-rate%
 1.387e+09 ± 31%    +170.7%  3.753e+09 ± 11%  perf-stat.node-load-misses
 6.456e+10 ±  7%     -17.3%  5.337e+10 ±  6%  perf-stat.node-loads
      2.59 ±  7%      +3.8        6.44 ± 11%  perf-stat.node-store-miss-rate%
 1.674e+08 ±  7%    +132.0%  3.883e+08 ± 14%  perf-stat.node-store-misses
 6.285e+09           -10.6%  5.619e+09 ±  3%  perf-stat.node-stores
   2642302 ±  2%     +18.6%    3133157 ±  2%  perf-stat.page-faults

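The headline figure follows directly from the two throughput columns above; the %change column reports (new - old) / old:

	\frac{1.048 \times 10^{8} - 1.113 \times 10^{8}}{1.113 \times 10^{8}} \times 100\% \approx -5.8\%

The same direction shows up in the NUMA counters (numa_hint_faults +34.2%, numa_pages_migrated +41.4%, perf-stat.cpu-migrations +118.4%), consistent with placement retries happening more often once the delay was reverted.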

                                                                                
                                    pxz.throughput                              
                                                                                
  1.13e+08 +-+--------------------------------------------------------------+   
  1.12e+08 +-+       .+.+.+.    .+.+.+  +        + +            .+.+..    + |   
           | +.+..+.+       +..+         +..+.+.+   +.+..+.+.+.+      +. + +|   
  1.11e+08 +-+                                                          +   |   
   1.1e+08 +-+                                                              |   
  1.09e+08 +-+                                                              |   
  1.08e+08 +-+                                                              |   
           |                                                                |   
  1.07e+08 +-+      O            O       O                                  |   
  1.06e+08 O-+ O  O     O                                                   |   
  1.05e+08 +-O                                                              |   
  1.04e+08 +-+            O O  O   O O O                                    |   
           |                                                                |   
  1.03e+08 +-+        O                                                     |   
  1.02e+08 +-+--------------------------------------------------------------+   
                                                                                                                                                                




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.17.0-rc4-00046-g789ba28" of type "text/plain" (164340 bytes)

View attachment "job-script" of type "text/plain" (6610 bytes)

View attachment "job.yaml" of type "text/plain" (4304 bytes)

View attachment "reproduce" of type "text/plain" (254 bytes)
