Date:   Thu, 26 Mar 2020 13:57:23 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [mm] fd4d9c7d0c: stress-ng.switch.ops_per_sec -30.5% regression

Greetings,

FYI, we noticed a -30.5% regression in stress-ng.switch.ops_per_sec due to commit:


commit: fd4d9c7d0c71866ec0c2825189ebd2ce35bd95b8 ("mm: slub: add missing TID bump in kmem_cache_alloc_bulk()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 30s
	test: switch
	cpufreq_governor: performance
	ucode: 0x500002c




If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
  gcc-7/performance/1HDD/x86_64-rhel-7.6/100%/debian-x86_64-20191114.cgz/lkp-csl-2sp5/switch/stress-ng/30s/0x500002c
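
The two lines above are a slash-separated list of parameter names followed by a slash-separated list of their values, in the same order. As a reading aid, here is a minimal sketch that pairs them up. The function name pair_parameters is hypothetical and is not part of lkp-tests; the sketch assumes no individual value contains a slash, which holds for this report.

```python
# Header and value lines copied verbatim from the report above.
KEY_LINE = ("compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/"
            "tbox_group/test/testcase/testtime/ucode:")
VALUE_LINE = ("  gcc-7/performance/1HDD/x86_64-rhel-7.6/100%/"
              "debian-x86_64-20191114.cgz/lkp-csl-2sp5/switch/stress-ng/30s/0x500002c")


def pair_parameters(key_line, value_line):
    """Pair each parameter name with its value (hypothetical helper).

    Assumes both lines are slash-separated and that no single value
    contains a slash itself.
    """
    names = key_line.strip().rstrip(":").split("/")
    values = value_line.strip().split("/")
    if len(names) != len(values):
        raise ValueError("name and value lines have different field counts")
    return dict(zip(names, values))


params = pair_parameters(KEY_LINE, VALUE_LINE)
for name, value in params.items():
    print(f"{name:18} {value}")
```

Read this way, the kconfig is x86_64-rhel-7.6, the test box group is lkp-csl-2sp5, and the remaining fields match the parameter list given earlier in this mail.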

commit: 
  ac309e7744 ("Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid")
  fd4d9c7d0c ("mm: slub: add missing TID bump in kmem_cache_alloc_bulk()")

ac309e7744bee222 fd4d9c7d0c71866ec0c2825189e 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  69076693           -30.5%   47993323        stress-ng.switch.ops
   2302520           -30.5%    1599758        stress-ng.switch.ops_per_sec
     26.79            -9.0%      24.37        stress-ng.time.user_time
      9242 ± 13%     -16.2%       7749 ±  2%  numa-meminfo.node0.KernelStack
      2.86 ±100%    -100.0%       0.00        iostat.sdb.await.max
      2.86 ±100%    -100.0%       0.00        iostat.sdb.r_await.max
      9243 ± 13%     -16.2%       7748 ±  2%  numa-vmstat.node0.nr_kernel_stack
    157380 ±  9%     -60.3%      62515 ± 90%  numa-vmstat.node0.numa_other
     22499 ± 28%     -41.5%      13173 ± 34%  sched_debug.cfs_rq:/.spread0.max
     -3319          +252.7%     -11706        sched_debug.cfs_rq:/.spread0.min
    -53.25           -45.1%     -29.25        sched_debug.cpu.nr_uninterruptible.min
     10425 ±  7%     +13.3%      11813 ±  5%  interrupts.CPU41.RES:Rescheduling_interrupts
     10605 ±  2%     +31.9%      13993 ± 23%  interrupts.CPU46.RES:Rescheduling_interrupts
     10804 ±  8%     +13.0%      12211 ±  8%  interrupts.CPU82.RES:Rescheduling_interrupts
     10708 ±  3%     +30.1%      13930 ± 22%  interrupts.CPU94.RES:Rescheduling_interrupts
      5456 ± 15%     +71.7%       9369 ± 20%  softirqs.CPU0.RCU
     18494 ±  4%      +6.9%      19771 ±  6%  softirqs.CPU0.TIMER
     20484 ± 14%     -22.5%      15866 ±  9%  softirqs.CPU27.TIMER
      5114 ± 10%     +64.9%       8433 ± 28%  softirqs.CPU5.RCU
      4841 ±  5%     +45.6%       7047 ± 32%  softirqs.CPU53.RCU
     17421 ±  3%      -9.3%      15796 ±  8%  softirqs.CPU53.TIMER
     18295 ±  4%     -11.7%      16155 ±  7%  softirqs.CPU59.TIMER
     19446 ± 10%     -13.6%      16803 ±  9%  softirqs.CPU7.TIMER
      4847 ±  7%     +62.3%       7866 ± 43%  softirqs.CPU8.RCU
     18.36            +5.3%      19.33        perf-stat.i.MPKI
      2.48 ±  3%      +0.2        2.63 ±  2%  perf-stat.i.cache-miss-rate%
  17934024 ±  4%     +10.0%   19730768        perf-stat.i.cache-misses
      4.13            +4.9%       4.33        perf-stat.i.cpi
      9504            -7.7%       8776        perf-stat.i.cycles-between-cache-misses
      0.02 ±  3%      +0.0        0.02 ±  5%  perf-stat.i.dTLB-store-miss-rate%
     58.48            -1.5       57.02        perf-stat.i.iTLB-load-miss-rate%
      0.25 ±  2%      -5.3%       0.23        perf-stat.i.ipc
     94.99            -1.0       94.02        perf-stat.i.node-load-miss-rate%
   6984752 ±  3%      +8.0%    7545390        perf-stat.i.node-load-misses
    336707 ±  4%     +36.2%     458652 ±  2%  perf-stat.i.node-loads
   5585196 ±  3%      +5.5%    5893365        perf-stat.i.node-store-misses
     18.76            +4.2%      19.55        perf-stat.overall.MPKI
      2.32            +0.2        2.52 ±  2%  perf-stat.overall.cache-miss-rate%
      4.21            +4.2%       4.38        perf-stat.overall.cpi
      9662            -8.0%       8891        perf-stat.overall.cycles-between-cache-misses
      0.02 ±  3%      +0.0        0.02 ±  5%  perf-stat.overall.dTLB-store-miss-rate%
     58.68            -1.6       57.07        perf-stat.overall.iTLB-load-miss-rate%
    987.32            +2.2%       1009        perf-stat.overall.instructions-per-iTLB-miss
      0.24            -4.0%       0.23        perf-stat.overall.ipc
     95.40            -1.1       94.27        perf-stat.overall.node-load-miss-rate%
  17353488 ±  4%     +10.0%   19087092        perf-stat.ps.cache-misses
 4.863e+09 ±  3%      -6.2%  4.562e+09        perf-stat.ps.dTLB-stores
   6758402 ±  3%      +8.0%    7299061        perf-stat.ps.node-load-misses
    325857 ±  4%     +36.2%     443722 ±  2%  perf-stat.ps.node-loads
   5404193 ±  3%      +5.5%    5700934        perf-stat.ps.node-store-misses
 1.275e+12            -6.1%  1.197e+12 ±  2%  perf-stat.total.instructions
     45.82 ± 36%     -27.5       18.30 ± 60%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.calltrace.cycles-pp.secondary_startup_64
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     49.13 ± 32%     -24.8       24.31 ± 41%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     48.65 ± 31%     -24.3       24.31 ± 41%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
     17.04 ± 85%     +26.6       43.60 ± 25%  perf-profile.calltrace.cycles-pp.ret_from_fork
     17.04 ± 85%     +26.6       43.60 ± 25%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
     14.96 ±100%     +28.6       43.60 ± 25%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
     14.67 ±103%     +28.9       43.60 ± 25%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
     12.30 ±133%     +30.0       42.32 ± 29%  perf-profile.calltrace.cycles-pp.memcpy_erms.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread
     12.59 ±130%     +31.3       43.88 ± 24%  perf-profile.calltrace.cycles-pp.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread.ret_from_fork
     45.82 ± 36%     -27.5       18.30 ± 60%  perf-profile.children.cycles-pp.intel_idle
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.children.cycles-pp.secondary_startup_64
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.children.cycles-pp.start_secondary
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.children.cycles-pp.cpu_startup_entry
     49.70 ± 31%     -25.4       24.31 ± 41%  perf-profile.children.cycles-pp.do_idle
     49.13 ± 32%     -24.8       24.31 ± 41%  perf-profile.children.cycles-pp.cpuidle_enter
     49.13 ± 32%     -24.8       24.31 ± 41%  perf-profile.children.cycles-pp.cpuidle_enter_state
     17.04 ± 85%     +26.6       43.60 ± 25%  perf-profile.children.cycles-pp.ret_from_fork
     17.04 ± 85%     +26.6       43.60 ± 25%  perf-profile.children.cycles-pp.kthread
     14.96 ±100%     +28.6       43.60 ± 25%  perf-profile.children.cycles-pp.worker_thread
     14.67 ±103%     +28.9       43.60 ± 25%  perf-profile.children.cycles-pp.process_one_work
     12.59 ±130%     +31.0       43.60 ± 25%  perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
     12.59 ±130%     +31.0       43.60 ± 25%  perf-profile.children.cycles-pp.memcpy_erms
     45.82 ± 36%     -27.5       18.30 ± 60%  perf-profile.self.cycles-pp.intel_idle
     12.13 ±128%     +31.5       43.60 ± 25%  perf-profile.self.cycles-pp.memcpy_erms
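
For readers checking the table above by hand, the following sketch recomputes a few entries of the change column. It assumes (inferred from the table itself, not from lkp-tests documentation) that entries printed with a percent sign are relative changes against the base commit, while entries printed without one, such as the miss-rate and perf-profile rows, are plain differences in percentage points. The function names are hypothetical.

```python
def relative_change(base, new):
    """Relative change of new against base, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Plain difference, for rows whose values are already percentages."""
    return new - base


# Values copied from the comparison table (base commit, tested commit).
ops_change = relative_change(69076693, 47993323)    # stress-ng.switch.ops
rate_change = relative_change(2302520, 1599758)     # stress-ng.switch.ops_per_sec
idle_change = point_change(45.82, 18.30)            # perf-profile.self.cycles-pp.intel_idle

print(f"stress-ng.switch.ops          {ops_change:+.1f}%")   # -30.5%
print(f"stress-ng.switch.ops_per_sec  {rate_change:+.1f}%")  # -30.5%
print(f"intel_idle self cycles        {idle_change:+.1f}")   # -27.5

# Sanity check: ops divided by ops_per_sec should be close to the 30s testtime.
base_duration = 69076693 / 2302520
print(f"implied base run duration     {base_duration:.2f}s")
```

Note that several perf-profile rows carry a %stddev of 85% or more on the base side, so their deltas are much less certain than the two stress-ng rows, which show no measurable deviation.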


                                                                                
                                stress-ng.switch.ops                            
                                                                                
  8e+07 +-------------------------------------------------------------------+   
        |                                                                   |   
  7e+07 |-+...+....+                      +.....+....+.....+                |   
  6e+07 |..         :                     :                                 |   
        |           :                    :                                  |   
  5e+07 |-+   O      :         O         :                            O     |   
        |          O :   O          O   : O     O    O     O     O          |   
  4e+07 |-+           :                 :                                   |   
        |             :                :                                    |   
  3e+07 |-+            :               :                                    |   
  2e+07 |-+            :              :                                     |   
        |               :             :                                     |   
  1e+07 |-+             :            :                                      |   
        |                :           :                                      |   
      0 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                             stress-ng.switch.ops_per_sec                       
                                                                                
  2.5e+06 +-----------------------------------------------------------------+   
          |  ...+....+                     +.....+....+.....+               |   
          |..        :                     :                                |   
    2e+06 |-+         :                   :                                 |   
          |           :                   :                                 |   
          |     O    O :   O    O     O  : O     O    O     O    O     O    |   
  1.5e+06 |-+          :                 :                                  |   
          |             :                :                                  |   
    1e+06 |-+           :               :                                   |   
          |              :              :                                   |   
          |              :              :                                   |   
   500000 |-+             :            :                                    |   
          |               :            :                                    |   
          |                :          :                                     |   
        0 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.6.0-rc6-00010-gfd4d9c7d0c718" of type "text/plain" (203570 bytes)

View attachment "job-script" of type "text/plain" (7779 bytes)

View attachment "job.yaml" of type "text/plain" (5449 bytes)

View attachment "reproduce" of type "text/plain" (339 bytes)
