Message-ID: <20200326055723.GL11705@shao2-debian>
Date: Thu, 26 Mar 2020 13:57:23 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Jann Horn <jannh@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [mm] fd4d9c7d0c: stress-ng.switch.ops_per_sec -30.5% regression
Greetings,
FYI, we noticed a -30.5% regression of stress-ng.switch.ops_per_sec due to commit:
commit: fd4d9c7d0c71866ec0c2825189ebd2ce35bd95b8 ("mm: slub: add missing TID bump in kmem_cache_alloc_bulk()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:
nr_threads: 100%
disk: 1HDD
testtime: 30s
test: switch
cpufreq_governor: performance
ucode: 0x500002c
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
gcc-7/performance/1HDD/x86_64-rhel-7.6/100%/debian-x86_64-20191114.cgz/lkp-csl-2sp5/switch/stress-ng/30s/0x500002c
commit:
ac309e7744 ("Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid")
fd4d9c7d0c ("mm: slub: add missing TID bump in kmem_cache_alloc_bulk()")
ac309e7744bee222 fd4d9c7d0c71866ec0c2825189e
---------------- ---------------------------
%stddev     %change     %stddev
69076693 -30.5% 47993323 stress-ng.switch.ops
2302520 -30.5% 1599758 stress-ng.switch.ops_per_sec
26.79 -9.0% 24.37 stress-ng.time.user_time
9242 ± 13% -16.2% 7749 ± 2% numa-meminfo.node0.KernelStack
2.86 ±100% -100.0% 0.00 iostat.sdb.await.max
2.86 ±100% -100.0% 0.00 iostat.sdb.r_await.max
9243 ± 13% -16.2% 7748 ± 2% numa-vmstat.node0.nr_kernel_stack
157380 ± 9% -60.3% 62515 ± 90% numa-vmstat.node0.numa_other
22499 ± 28% -41.5% 13173 ± 34% sched_debug.cfs_rq:/.spread0.max
-3319 +252.7% -11706 sched_debug.cfs_rq:/.spread0.min
-53.25 -45.1% -29.25 sched_debug.cpu.nr_uninterruptible.min
10425 ± 7% +13.3% 11813 ± 5% interrupts.CPU41.RES:Rescheduling_interrupts
10605 ± 2% +31.9% 13993 ± 23% interrupts.CPU46.RES:Rescheduling_interrupts
10804 ± 8% +13.0% 12211 ± 8% interrupts.CPU82.RES:Rescheduling_interrupts
10708 ± 3% +30.1% 13930 ± 22% interrupts.CPU94.RES:Rescheduling_interrupts
5456 ± 15% +71.7% 9369 ± 20% softirqs.CPU0.RCU
18494 ± 4% +6.9% 19771 ± 6% softirqs.CPU0.TIMER
20484 ± 14% -22.5% 15866 ± 9% softirqs.CPU27.TIMER
5114 ± 10% +64.9% 8433 ± 28% softirqs.CPU5.RCU
4841 ± 5% +45.6% 7047 ± 32% softirqs.CPU53.RCU
17421 ± 3% -9.3% 15796 ± 8% softirqs.CPU53.TIMER
18295 ± 4% -11.7% 16155 ± 7% softirqs.CPU59.TIMER
19446 ± 10% -13.6% 16803 ± 9% softirqs.CPU7.TIMER
4847 ± 7% +62.3% 7866 ± 43% softirqs.CPU8.RCU
18.36 +5.3% 19.33 perf-stat.i.MPKI
2.48 ± 3% +0.2 2.63 ± 2% perf-stat.i.cache-miss-rate%
17934024 ± 4% +10.0% 19730768 perf-stat.i.cache-misses
4.13 +4.9% 4.33 perf-stat.i.cpi
9504 -7.7% 8776 perf-stat.i.cycles-between-cache-misses
0.02 ± 3% +0.0 0.02 ± 5% perf-stat.i.dTLB-store-miss-rate%
58.48 -1.5 57.02 perf-stat.i.iTLB-load-miss-rate%
0.25 ± 2% -5.3% 0.23 perf-stat.i.ipc
94.99 -1.0 94.02 perf-stat.i.node-load-miss-rate%
6984752 ± 3% +8.0% 7545390 perf-stat.i.node-load-misses
336707 ± 4% +36.2% 458652 ± 2% perf-stat.i.node-loads
5585196 ± 3% +5.5% 5893365 perf-stat.i.node-store-misses
18.76 +4.2% 19.55 perf-stat.overall.MPKI
2.32 +0.2 2.52 ± 2% perf-stat.overall.cache-miss-rate%
4.21 +4.2% 4.38 perf-stat.overall.cpi
9662 -8.0% 8891 perf-stat.overall.cycles-between-cache-misses
0.02 ± 3% +0.0 0.02 ± 5% perf-stat.overall.dTLB-store-miss-rate%
58.68 -1.6 57.07 perf-stat.overall.iTLB-load-miss-rate%
987.32 +2.2% 1009 perf-stat.overall.instructions-per-iTLB-miss
0.24 -4.0% 0.23 perf-stat.overall.ipc
95.40 -1.1 94.27 perf-stat.overall.node-load-miss-rate%
17353488 ± 4% +10.0% 19087092 perf-stat.ps.cache-misses
4.863e+09 ± 3% -6.2% 4.562e+09 perf-stat.ps.dTLB-stores
6758402 ± 3% +8.0% 7299061 perf-stat.ps.node-load-misses
325857 ± 4% +36.2% 443722 ± 2% perf-stat.ps.node-loads
5404193 ± 3% +5.5% 5700934 perf-stat.ps.node-store-misses
1.275e+12 -6.1% 1.197e+12 ± 2% perf-stat.total.instructions
45.82 ± 36% -27.5 18.30 ± 60% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.calltrace.cycles-pp.secondary_startup_64
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
49.13 ± 32% -24.8 24.31 ± 41% perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
48.65 ± 31% -24.3 24.31 ± 41% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
17.04 ± 85% +26.6 43.60 ± 25% perf-profile.calltrace.cycles-pp.ret_from_fork
17.04 ± 85% +26.6 43.60 ± 25% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
14.96 ±100% +28.6 43.60 ± 25% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
14.67 ±103% +28.9 43.60 ± 25% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
12.30 ±133% +30.0 42.32 ± 29% perf-profile.calltrace.cycles-pp.memcpy_erms.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread
12.59 ±130% +31.3 43.88 ± 24% perf-profile.calltrace.cycles-pp.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread.ret_from_fork
45.82 ± 36% -27.5 18.30 ± 60% perf-profile.children.cycles-pp.intel_idle
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.children.cycles-pp.secondary_startup_64
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.children.cycles-pp.start_secondary
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.children.cycles-pp.cpu_startup_entry
49.70 ± 31% -25.4 24.31 ± 41% perf-profile.children.cycles-pp.do_idle
49.13 ± 32% -24.8 24.31 ± 41% perf-profile.children.cycles-pp.cpuidle_enter
49.13 ± 32% -24.8 24.31 ± 41% perf-profile.children.cycles-pp.cpuidle_enter_state
17.04 ± 85% +26.6 43.60 ± 25% perf-profile.children.cycles-pp.ret_from_fork
17.04 ± 85% +26.6 43.60 ± 25% perf-profile.children.cycles-pp.kthread
14.96 ±100% +28.6 43.60 ± 25% perf-profile.children.cycles-pp.worker_thread
14.67 ±103% +28.9 43.60 ± 25% perf-profile.children.cycles-pp.process_one_work
12.59 ±130% +31.0 43.60 ± 25% perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
12.59 ±130% +31.0 43.60 ± 25% perf-profile.children.cycles-pp.memcpy_erms
45.82 ± 36% -27.5 18.30 ± 60% perf-profile.self.cycles-pp.intel_idle
12.13 ±128% +31.5 43.60 ± 25% perf-profile.self.cycles-pp.memcpy_erms
stress-ng.switch.ops
8e+07 +-------------------------------------------------------------------+
| |
7e+07 |-+...+....+ +.....+....+.....+ |
6e+07 |.. : : |
| : : |
5e+07 |-+ O : O : O |
| O : O O : O O O O O |
4e+07 |-+ : : |
| : : |
3e+07 |-+ : : |
2e+07 |-+ : : |
| : : |
1e+07 |-+ : : |
| : : |
0 +-------------------------------------------------------------------+
stress-ng.switch.ops_per_sec
2.5e+06 +-----------------------------------------------------------------+
| ...+....+ +.....+....+.....+ |
|.. : : |
2e+06 |-+ : : |
| : : |
| O O : O O O : O O O O O O |
1.5e+06 |-+ : : |
| : : |
1e+06 |-+ : : |
| : : |
| : : |
500000 |-+ : : |
| : : |
| : : |
0 +-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.6.0-rc6-00010-gfd4d9c7d0c718" of type "text/plain" (203570 bytes)
View attachment "job-script" of type "text/plain" (7779 bytes)
View attachment "job.yaml" of type "text/plain" (5449 bytes)
View attachment "reproduce" of type "text/plain" (339 bytes)