Message-ID: <YtlRGqgzrqsgRYmR@xsang-OptiPlex-9020>
Date: Thu, 21 Jul 2022 21:14:02 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Borislav Petkov <bp@...e.de>, Josh Poimboeuf <jpoimboe@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, <lkp@...ts.01.org>,
<lkp@...el.com>, <ying.huang@...el.com>, <feng.tang@...el.com>,
<zhengjun.xing@...ux.intel.com>, <fengwei.yin@...el.com>,
<tim.c.chen@...el.com>
Subject: [x86/bugs] 6ad0ad2bf8: will-it-scale.per_process_ops -33.5% regression
Greetings,
FYI, we noticed a -33.5% regression of will-it-scale.per_process_ops due to commit:
commit: 6ad0ad2bf8a67e27d1f9d006a1dabb0e1c360cc3 ("x86/bugs: Report Intel retbleed vulnerability")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 104 threads 2 sockets Skylake with 192G memory
with the following parameters:
nr_task: 16
mode: process
test: futex3
cpufreq_governor: performance
ucode: 0x2006c0a
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
In addition to that, the commit also has significant impact on the following tests:
+------------------+----------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -31.4% regression |
| test machine     | 104 threads 2 sockets Skylake with 192G memory                 |
| test parameters  | cpufreq_governor=performance                                   |
|                  | mode=process                                                   |
|                  | nr_task=50%                                                    |
|                  | test=futex4                                                    |
|                  | ucode=0x2006c0a                                                |
+------------------+----------------------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
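When reproducing on both commits, it may help to record what each kernel reports about its speculative-execution vulnerabilities, since the commit under test changes that reporting. The sketch below reads the standard sysfs files under /sys/devices/system/cpu/vulnerabilities. The helper name `mitigation_status` is hypothetical and not part of lkp-tests, and the report itself does not state which mitigation was active on the test machine, so treat the output only as extra context for a run.

```python
from pathlib import Path

# Standard sysfs location for CPU vulnerability reporting on Linux.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(name, base=VULN_DIR):
    """Return the kernel's status string for one vulnerability file.

    Returns None when the file is missing, e.g. on a kernel that does not
    report that vulnerability at all. (Hypothetical helper, for illustration.)
    """
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in ("retbleed", "spectre_v2"):
        print(f"{name}: {mitigation_status(name)}")
```

Running it once on each of the two kernels and keeping the output next to the benchmark results makes it easier to tell whether a throughput difference coincides with a change in the reported mitigation.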
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/futex3/will-it-scale/0x2006c0a
commit:
166115c08a ("x86/bugs: Split spectre_v2_select_mitigation() and spectre_v2_user_select_mitigation()")
6ad0ad2bf8 ("x86/bugs: Report Intel retbleed vulnerability")
166115c08a9b0b84 6ad0ad2bf8a67e27d1f9d006a1d
---------------- ---------------------------
%stddev %change %stddev
\ | \
30835845 -33.5% 20516991 will-it-scale.16.processes
1927239 -33.5% 1282311 will-it-scale.per_process_ops
30835845 -33.5% 20516991 will-it-scale.workload
8.59 ± 17% -1.1 7.44 ± 18% mpstat.cpu.all.usr%
0.09 ± 4% -32.1% 0.06 turbostat.IPC
226.11 ± 7% -4.9% 215.05 ± 6% turbostat.PkgWatt
1.875e+09 ± 16% -37.4% 1.174e+09 ± 15% perf-stat.i.branch-instructions
58612335 ± 18% -32.1% 39806456 ± 14% perf-stat.i.branch-misses
3.39 ± 6% +40.4% 4.76 ± 11% perf-stat.i.cpi
26700443 ± 17% -36.2% 17025890 ± 15% perf-stat.i.dTLB-load-misses
3.194e+09 ± 16% -34.1% 2.105e+09 ± 15% perf-stat.i.dTLB-loads
2.298e+09 ± 16% -36.9% 1.451e+09 ± 16% perf-stat.i.dTLB-stores
26226991 ± 17% -36.4% 16676481 ± 16% perf-stat.i.iTLB-load-misses
1.26e+10 ± 16% -33.6% 8.37e+09 ± 15% perf-stat.i.instructions
0.30 ± 4% -27.9% 0.22 ± 7% perf-stat.i.ipc
70.83 ± 16% -35.8% 45.47 ± 15% perf-stat.i.metric.M/sec
3.22 +43.0% 4.60 ± 2% perf-stat.overall.cpi
0.31 -30.0% 0.22 ± 2% perf-stat.overall.ipc
145784 +6.1% 154711 perf-stat.overall.path-length
1.873e+09 ± 15% -37.5% 1.171e+09 ± 15% perf-stat.ps.branch-instructions
58546994 ± 18% -32.2% 39721505 ± 14% perf-stat.ps.branch-misses
26668535 ± 17% -36.3% 16989617 ± 15% perf-stat.ps.dTLB-load-misses
3.19e+09 ± 16% -34.2% 2.1e+09 ± 15% perf-stat.ps.dTLB-loads
2.295e+09 ± 16% -36.9% 1.448e+09 ± 16% perf-stat.ps.dTLB-stores
26197064 ± 17% -36.5% 16641071 ± 16% perf-stat.ps.iTLB-load-misses
1.258e+10 ± 16% -33.6% 8.352e+09 ± 15% perf-stat.ps.instructions
4.495e+12 -29.4% 3.174e+12 perf-stat.total.instructions
34.16 ± 12% -11.6 22.54 ± 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
26.91 ± 12% -10.2 16.68 ± 10% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
14.09 ± 12% -5.1 8.96 ± 10% perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
5.76 ± 12% -2.0 3.79 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
4.26 ± 12% -1.6 2.66 ± 10% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
3.74 ± 12% -1.4 2.35 ± 10% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.38 ± 13% -1.2 2.15 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
1.75 ± 12% -0.6 1.11 ± 11% perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
1.04 ± 14% -0.4 0.65 ± 11% perf-profile.calltrace.cycles-pp.testcase
12.45 ± 11% +6.2 18.66 ± 10% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
34.69 ± 12% -11.9 22.82 ± 10% perf-profile.children.cycles-pp.do_syscall_64
27.06 ± 12% -10.3 16.76 ± 10% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
13.51 ± 12% -4.9 8.57 ± 10% perf-profile.children.cycles-pp.__entry_text_start
5.81 ± 12% -2.0 3.82 ± 10% perf-profile.children.cycles-pp.__x64_sys_futex
4.30 ± 12% -1.6 2.69 ± 10% perf-profile.children.cycles-pp.do_futex
3.86 ± 12% -1.4 2.42 ± 11% perf-profile.children.cycles-pp.futex_wake
2.20 ± 12% -0.8 1.38 ± 9% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
2.19 ± 13% -0.8 1.40 ± 9% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
1.75 ± 12% -0.6 1.11 ± 11% perf-profile.children.cycles-pp.futex_hash
1.04 ± 14% -0.4 0.65 ± 11% perf-profile.children.cycles-pp.testcase
0.75 ± 13% -0.3 0.46 ± 12% perf-profile.children.cycles-pp.get_futex_key
0.32 ± 13% -0.2 0.12 ± 9% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.39 ± 11% +0.2 0.55 ± 13% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
12.74 ± 12% +6.1 18.84 ± 10% perf-profile.children.cycles-pp.syscall_return_via_sysret
25.92 ± 12% -9.8 16.16 ± 10% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
11.73 ± 13% -4.3 7.41 ± 10% perf-profile.self.cycles-pp.__entry_text_start
3.49 ± 12% -1.3 2.23 ± 9% perf-profile.self.cycles-pp.syscall
1.93 ± 12% -0.7 1.22 ± 10% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.66 ± 12% -0.6 1.06 ± 11% perf-profile.self.cycles-pp.futex_hash
1.43 ± 11% -0.5 0.89 ± 10% perf-profile.self.cycles-pp.futex_wake
0.98 ± 13% -0.4 0.63 ± 10% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.72 ± 15% -0.3 0.45 ± 10% perf-profile.self.cycles-pp.testcase
0.69 ± 14% -0.3 0.43 ± 12% perf-profile.self.cycles-pp.get_futex_key
0.43 ± 14% -0.2 0.27 ± 9% perf-profile.self.cycles-pp.do_futex
0.28 ± 13% -0.2 0.12 ± 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.28 ± 11% +0.1 0.39 ± 13% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
12.68 ± 12% +6.1 18.81 ± 10% perf-profile.self.cycles-pp.syscall_return_via_sysret
1.62 ± 12% +9.3 10.96 ± 10% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
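A note on reading the comparison table above: the change column appears to use two conventions. Rows for plain counters carry a relative change with a percent sign, while rows whose values are already percentages (mpstat and perf-profile shares) carry a bare number that matches the absolute difference in percentage points. The following is a minimal sketch that checks both readings against two rows copied from the table. The function names are illustrative only.

```python
def pct_change(base, new):
    """Relative change in percent, as shown for counter rows."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, as shown for rows that are already percentages."""
    return new - base


# Values copied from the futex3 comparison table.
print(f"per_process_ops: {pct_change(1927239, 1282311):+.1f}%")
# prints "per_process_ops: -33.5%"
print(f"do_syscall_64 calltrace share: {point_delta(34.16, 22.54):+.1f} points")
# prints "do_syscall_64 calltrace share: -11.6 points"
```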
***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-11/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/futex4/will-it-scale/0x2006c0a
commit:
166115c08a ("x86/bugs: Split spectre_v2_select_mitigation() and spectre_v2_user_select_mitigation()")
6ad0ad2bf8 ("x86/bugs: Report Intel retbleed vulnerability")
166115c08a9b0b84 6ad0ad2bf8a67e27d1f9d006a1d
---------------- ---------------------------
%stddev %change %stddev
\ | \
90733512 -31.4% 62258677 will-it-scale.52.processes
1744874 -31.4% 1197281 will-it-scale.per_process_ops
90733512 -31.4% 62258677 will-it-scale.workload
655664 ± 10% -22.6% 507216 ± 5% meminfo.DirectMap4k
0.04 ± 9% +0.0 0.05 ± 6% mpstat.cpu.all.soft%
3.85 +14.0% 4.39 ± 4% sched_debug.cpu.clock.stddev
0.11 ± 3% -35.4% 0.07 turbostat.IPC
96.34 ± 2% -96.3 0.00 turbostat.PKG_%
5838 ± 40% -48.7% 2993 ± 13% turbostat.POLL
371.79 -4.2% 356.19 turbostat.PkgWatt
8.115e+09 -33.8% 5.372e+09 perf-stat.i.branch-instructions
2.31 +0.1 2.41 perf-stat.i.branch-miss-rate%
1.865e+08 -30.5% 1.296e+08 perf-stat.i.branch-misses
11.17 ± 3% +1.2 12.38 ± 3% perf-stat.i.cache-miss-rate%
580470 ± 8% +15.5% 670300 ± 6% perf-stat.i.cache-misses
2.69 +42.6% 3.84 perf-stat.i.cpi
310120 ± 8% -17.0% 257300 ± 8% perf-stat.i.cycles-between-cache-misses
0.62 -0.0 0.62 perf-stat.i.dTLB-load-miss-rate%
90617994 -31.3% 62233060 perf-stat.i.dTLB-load-misses
1.443e+10 -30.5% 1.003e+10 perf-stat.i.dTLB-loads
70201 -5.3% 66506 perf-stat.i.dTLB-store-misses
1.106e+10 -32.1% 7.505e+09 perf-stat.i.dTLB-stores
91097784 -31.2% 62707380 perf-stat.i.iTLB-load-misses
5.418e+10 -29.8% 3.801e+10 perf-stat.i.instructions
596.50 +1.9% 608.12 perf-stat.i.instructions-per-iTLB-miss
0.37 -29.9% 0.26 perf-stat.i.ipc
323.07 -31.8% 220.28 perf-stat.i.metric.M/sec
120820 ± 5% +14.4% 138242 ± 4% perf-stat.i.node-load-misses
22505 ± 5% +13.4% 25514 ± 3% perf-stat.i.node-store-misses
8944 ± 6% +20.5% 10779 ± 9% perf-stat.i.node-stores
0.09 ± 6% +50.0% 0.14 ± 4% perf-stat.overall.MPKI
2.30 +0.1 2.41 perf-stat.overall.branch-miss-rate%
11.51 ± 2% +1.1 12.62 ± 2% perf-stat.overall.cache-miss-rate%
2.69 +42.8% 3.84 perf-stat.overall.cpi
249231 ± 7% -13.4% 215874 ± 6% perf-stat.overall.cycles-between-cache-misses
0.62 -0.0 0.62 perf-stat.overall.dTLB-load-miss-rate%
0.00 +0.0 0.00 perf-stat.overall.dTLB-store-miss-rate%
594.79 +1.9% 606.25 perf-stat.overall.instructions-per-iTLB-miss
0.37 -30.0% 0.26 perf-stat.overall.ipc
180499 +2.2% 184469 perf-stat.overall.path-length
8.088e+09 -33.8% 5.355e+09 perf-stat.ps.branch-instructions
1.86e+08 -30.5% 1.292e+08 perf-stat.ps.branch-misses
586425 ± 8% +15.4% 676887 ± 6% perf-stat.ps.cache-misses
90306393 -31.3% 62021744 perf-stat.ps.dTLB-load-misses
1.438e+10 -30.5% 1e+10 perf-stat.ps.dTLB-loads
70004 -5.2% 66332 perf-stat.ps.dTLB-store-misses
1.102e+10 -32.1% 7.48e+09 perf-stat.ps.dTLB-stores
90782034 -31.2% 62496607 perf-stat.ps.iTLB-load-misses
5.4e+10 -29.8% 3.789e+10 perf-stat.ps.instructions
121778 ± 5% +14.9% 139936 ± 4% perf-stat.ps.node-load-misses
22472 ± 6% +13.4% 25477 ± 3% perf-stat.ps.node-store-misses
9020 ± 6% +20.4% 10864 ± 9% perf-stat.ps.node-stores
1.638e+13 -29.9% 1.148e+13 perf-stat.total.instructions
40.11 -11.3 28.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
25.35 -8.1 17.26 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
13.72 -4.3 9.41 perf-profile.calltrace.cycles-pp.__entry_text_start.syscall
13.44 -3.8 9.65 perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
12.01 -3.6 8.44 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
11.57 -3.4 8.17 perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.06 -2.9 7.12 perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
41.56 -1.7 39.84 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
4.32 -1.4 2.95 perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
3.38 -1.1 2.31 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
3.03 -0.9 2.14 perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
2.58 ± 2% -0.8 1.78 perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex
1.53 -0.5 1.06 perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.futex_wait.do_futex
1.23 ± 2% -0.4 0.84 perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.futex_wait.do_futex
1.29 -0.3 0.98 perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
0.76 ± 2% -0.2 0.54 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
0.00 +0.6 0.57 perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
12.10 +7.8 19.90 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
40.40 -11.3 29.11 perf-profile.children.cycles-pp.do_syscall_64
25.50 -8.1 17.37 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
13.14 -4.1 9.02 perf-profile.children.cycles-pp.__entry_text_start
13.53 -3.8 9.71 perf-profile.children.cycles-pp.__x64_sys_futex
12.09 -3.6 8.50 perf-profile.children.cycles-pp.do_futex
11.62 -3.4 8.21 perf-profile.children.cycles-pp.futex_wait
10.28 -3.0 7.25 perf-profile.children.cycles-pp.futex_wait_setup
41.78 -1.4 40.35 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
4.46 -1.4 3.05 perf-profile.children.cycles-pp.futex_q_lock
3.07 -0.9 2.18 perf-profile.children.cycles-pp.futex_get_value_locked
2.85 -0.9 1.98 perf-profile.children.cycles-pp.__get_user_nocheck_4
2.22 -0.7 1.52 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
2.14 -0.7 1.46 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.53 -0.5 1.06 perf-profile.children.cycles-pp.futex_hash
1.28 ± 2% -0.4 0.88 perf-profile.children.cycles-pp._raw_spin_lock
1.29 -0.3 0.98 perf-profile.children.cycles-pp.futex_q_unlock
0.80 ± 2% -0.2 0.57 perf-profile.children.cycles-pp.get_futex_key
0.25 -0.1 0.17 ± 3% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.23 ± 2% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.testcase
0.15 ± 2% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.futex_setup_timer
0.09 ± 5% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.syscall@plt
0.10 ± 5% -0.0 0.07 ± 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.38 ± 3% +0.2 0.62 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
12.37 +7.7 20.09 perf-profile.children.cycles-pp.syscall_return_via_sysret
25.21 -8.0 17.17 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
11.39 -3.6 7.82 perf-profile.self.cycles-pp.__entry_text_start
4.00 -1.2 2.75 perf-profile.self.cycles-pp.syscall
2.79 ± 2% -0.8 1.94 perf-profile.self.cycles-pp.__get_user_nocheck_4
1.86 -0.6 1.28 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
1.56 -0.5 1.07 perf-profile.self.cycles-pp.futex_q_lock
1.47 -0.4 1.02 perf-profile.self.cycles-pp.futex_hash
1.23 ± 2% -0.4 0.85 perf-profile.self.cycles-pp._raw_spin_lock
1.20 ± 2% -0.4 0.84 perf-profile.self.cycles-pp.futex_wait
1.07 -0.3 0.72 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.23 -0.3 0.94 ± 2% perf-profile.self.cycles-pp.futex_q_unlock
0.88 -0.3 0.60 perf-profile.self.cycles-pp.futex_wait_setup
1.43 -0.2 1.20 perf-profile.self.cycles-pp.__x64_sys_futex
0.79 ± 3% -0.2 0.57 perf-profile.self.cycles-pp.get_futex_key
0.45 ± 2% -0.2 0.30 ± 2% perf-profile.self.cycles-pp.do_futex
0.23 ± 2% -0.1 0.16 ± 5% perf-profile.self.cycles-pp.testcase
0.19 ± 2% -0.1 0.13 ± 4% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.26 ± 7% -0.0 0.22 ± 5% perf-profile.self.cycles-pp.futex_get_value_locked
0.10 -0.0 0.07 ± 5% perf-profile.self.cycles-pp.futex_setup_timer
0.28 ± 3% +0.2 0.43 ± 2% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.88 +0.2 1.04 perf-profile.self.cycles-pp.do_syscall_64
12.31 +7.8 20.06 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.47 +10.1 11.58 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
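One way to read the two tables together: perf-stat.overall.path-length grows only slightly (+6.1% and +2.2%) while perf-stat.overall.cpi grows by about 43%, which suggests the regression comes mostly from each instruction costing more cycles rather than from many more instructions per operation. The sketch below tests that reading under two assumptions that the report does not state explicitly: that path-length means instructions per operation (it is close to total.instructions divided by workload in both tables), and that the busy CPUs have the same cycle budget per second on both kernels.

```python
def cycles_per_op(path_length, cpi):
    """Cycles spent per operation: instructions per op times cycles per instruction."""
    return path_length * cpi


def predicted_throughput_change(base, new):
    """Percent change in ops/s, assuming an unchanged cycle budget per second."""
    return (cycles_per_op(*base) / cycles_per_op(*new) - 1.0) * 100.0


# (path-length, cpi) pairs and the reported per_process_ops change,
# copied from the perf-stat.overall rows of the two tables.
REPORTED = {
    "futex3": {"base": (145784, 3.22), "new": (154711, 4.60), "reported": -33.5},
    "futex4": {"base": (180499, 2.69), "new": (184469, 3.84), "reported": -31.4},
}

for test, row in REPORTED.items():
    predicted = predicted_throughput_change(row["base"], row["new"])
    print(f"{test}: predicted {predicted:+.1f}%, reported {row['reported']:+.1f}%")
```

For both tests the predicted change lands within about one percentage point of the reported one, which is consistent with the CPI increase accounting for nearly all of the lost throughput.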
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-5.19.0-rc4-00026-g6ad0ad2bf8a6" of type "text/plain" (164055 bytes)
View attachment "job-script" of type "text/plain" (7757 bytes)
View attachment "job.yaml" of type "text/plain" (5267 bytes)
View attachment "reproduce" of type "text/plain" (346 bytes)