Message-ID: <20210428071653.GC13086@xsang-OptiPlex-9020>
Date: Wed, 28 Apr 2021 15:16:53 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [entry] 47b8ff194c: will-it-scale.per_process_ops -3.0% regression
Greetings,
FYI, we noticed a -3.0% regression of will-it-scale.per_process_ops due to commit:
commit: 47b8ff194c1fd73d58dc339b597d466fe48c8958 ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:
nr_task: 100%
mode: process
test: futex3
cpufreq_governor: performance
ucode: 0x5003006
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/futex3/will-it-scale/0x5003006
commit:
f8bb5cae96 ("rcu/nocb: Trigger self-IPI on late deferred wake up before user resume")
47b8ff194c ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
f8bb5cae9616224a 47b8ff194c1fd73d58dc339b597
---------------- ---------------------------
%stddev %change %stddev
\ | \
1.25e+09 -3.0% 1.212e+09 will-it-scale.192.processes
6508984 -3.0% 6314032 will-it-scale.per_process_ops
1.25e+09 -3.0% 1.212e+09 will-it-scale.workload
68.00 -1.5% 67.00 vmstat.cpu.sy
30.00 +3.3% 31.00 vmstat.cpu.us
8.622e+10 +1.2% 8.728e+10 perf-stat.i.branch-instructions
0.38 -0.0 0.36 perf-stat.i.branch-miss-rate%
3.206e+08 -3.7% 3.088e+08 perf-stat.i.branch-misses
0.98 +1.1% 0.99 perf-stat.i.cpi
263518 +2.3% 269550 perf-stat.i.dTLB-store-misses
1.135e+11 -1.9% 1.113e+11 perf-stat.i.dTLB-stores
3.257e+08 -4.8% 3.1e+08 perf-stat.i.iTLB-load-misses
5.718e+11 -1.1% 5.656e+11 perf-stat.i.instructions
1758 +3.9% 1827 perf-stat.i.instructions-per-iTLB-miss
1.03 -1.1% 1.01 perf-stat.i.ipc
0.37 -0.0 0.35 perf-stat.overall.branch-miss-rate%
0.97 +1.1% 0.99 perf-stat.overall.cpi
0.00 +0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
1755 +3.9% 1824 perf-stat.overall.instructions-per-iTLB-miss
1.03 -1.1% 1.02 perf-stat.overall.ipc
138016 +2.0% 140712 perf-stat.overall.path-length
8.592e+10 +1.2% 8.698e+10 perf-stat.ps.branch-instructions
3.195e+08 -3.7% 3.078e+08 perf-stat.ps.branch-misses
262973 +2.3% 269022 perf-stat.ps.dTLB-store-misses
1.131e+11 -1.9% 1.109e+11 perf-stat.ps.dTLB-stores
3.246e+08 -4.8% 3.09e+08 perf-stat.ps.iTLB-load-misses
5.698e+11 -1.1% 5.637e+11 perf-stat.ps.instructions
1.725e+14 -1.1% 1.706e+14 perf-stat.total.instructions
32.11 -1.0 31.08 perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
36.13 -0.3 35.81 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
39.88 -0.3 39.58 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
30.75 -0.2 30.57 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
2.22 -0.1 2.16 perf-profile.calltrace.cycles-pp.testcase
2.15 -0.1 2.09 perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
2.21 -0.0 2.17 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
6.22 +0.1 6.32 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
1.17 +0.1 1.31 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
52.27 +1.4 53.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
3.53 +1.5 5.00 perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
0.00 +1.5 1.55 perf-profile.calltrace.cycles-pp.rcu_nocb_flush_deferred_wakeup.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
5.58 +1.9 7.47 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.syscall
20.72 -0.6 20.09 perf-profile.children.cycles-pp.__entry_text_start
17.34 -0.5 16.87 perf-profile.children.cycles-pp.syscall_return_via_sysret
40.05 -0.3 39.75 perf-profile.children.cycles-pp.do_syscall_64
36.43 -0.2 36.20 perf-profile.children.cycles-pp.__x64_sys_futex
31.18 -0.2 30.98 perf-profile.children.cycles-pp.do_futex
2.36 -0.1 2.30 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
2.41 -0.1 2.34 perf-profile.children.cycles-pp.testcase
2.19 -0.1 2.12 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
97.88 +0.1 97.94 perf-profile.children.cycles-pp.syscall
6.46 +0.1 6.58 perf-profile.children.cycles-pp.get_futex_key
1.19 +0.1 1.33 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
52.69 +1.4 54.05 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +1.6 1.60 perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
3.58 +1.7 5.33 perf-profile.children.cycles-pp.exit_to_user_mode_prepare
6.16 +1.9 8.07 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
15.88 -0.5 15.33 perf-profile.self.cycles-pp.syscall
17.22 -0.5 16.75 perf-profile.self.cycles-pp.syscall_return_via_sysret
6.00 -0.3 5.70 perf-profile.self.cycles-pp.do_futex
9.33 -0.2 9.09 perf-profile.self.cycles-pp.__entry_text_start
6.58 -0.2 6.35 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
8.29 -0.1 8.22 perf-profile.self.cycles-pp.hash_futex
2.36 -0.1 2.29 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.86 -0.1 1.80 perf-profile.self.cycles-pp.syscall_enter_from_user_mode
1.96 -0.0 1.91 perf-profile.self.cycles-pp.testcase
1.14 +0.1 1.27 perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
6.04 +0.2 6.19 perf-profile.self.cycles-pp.get_futex_key
3.29 +0.4 3.65 perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.00 +1.4 1.39 perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
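The headline numbers in the table above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the robot's tooling: the helper name `pct_change` is hypothetical, and treating perf-stat.overall.path-length as total instructions divided by workload operations is an assumption that happens to match the reported figures closely.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Values copied from the comparison table
# (parent f8bb5cae96 on the left, 47b8ff194c on the right).
base_ops, new_ops = 6508984, 6314032        # will-it-scale.per_process_ops
base_instr, new_instr = 1.725e14, 1.706e14  # perf-stat.total.instructions
base_work, new_work = 1.25e9, 1.212e9       # will-it-scale.workload

ops_change = pct_change(base_ops, new_ops)

# Assumption: path-length is roughly instructions retired per workload operation.
base_path = base_instr / base_work
new_path = new_instr / new_work
path_change = pct_change(base_path, new_path)

print(f"per_process_ops change: {ops_change:+.1f}%")
print(f"instructions per operation: {base_path:.0f} -> {new_path:.0f} "
      f"({path_change:+.1f}%)")
```

With the rounded table inputs this gives about -3.0% for per_process_ops and roughly 138000 -> 140759 instructions per operation (+2.0%), close to the reported path-length values of 138016 and 140712. Read this way, each futex call retires about 2% more instructions after the commit, which is consistent with the extra rcu_nocb_flush_deferred_wakeup time on the syscall exit path.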
[ASCII plot: will-it-scale.192.processes per bisect sample, y-axis 0 to 1.4e+09]
[ASCII plot: will-it-scale.per_process_ops per bisect sample, y-axis 0 to 7e+06]
[ASCII plot: will-it-scale.workload per bisect sample, y-axis 0 to 1.4e+09]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.11.0-00050-g47b8ff194c1f" of type "text/plain" (172429 bytes)
View attachment "job-script" of type "text/plain" (7692 bytes)
View attachment "job.yaml" of type "text/plain" (5240 bytes)
View attachment "reproduce" of type "text/plain" (339 bytes)