Message-ID: <20200708082849.GM3874@shao2-debian>
Date: Wed, 8 Jul 2020 16:28:49 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Alexandre Chartre <alexandre.chartre@...cle.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [x86/entry/common] 8f159f1dfa: will-it-scale.per_process_ops -6.4%
regression
Greetings,
FYI, we noticed a -6.4% regression of will-it-scale.per_process_ops due to commit:
commit: 8f159f1dfa1ea29d70a84335fe6a8bd501a9eecd ("x86/entry/common: Protect against instrumentation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Cooper Lake with 128G memory
with the following parameters:
nr_task: 100%
mode: process
test: lseek1
cpufreq_governor: performance
ucode: 0x86000017
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
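For orientation, the lseek1 case amounts to seeking to the start of an unlinked temporary file over and over and counting the iterations, so nearly all of the time goes to syscall entry and exit. The sketch below illustrates that loop. It is not the upstream testcase, which is written in C and runs unbounded under the will-it-scale harness; the function name `lseek_loop`, the bounded iteration count, and the use of Python are assumptions made for illustration.

```python
import os
import tempfile


def lseek_loop(iterations: int) -> int:
    """Seek to offset 0 of an unlinked temp file `iterations` times.

    Hypothetical stand-in for the lseek1 loop; returns the number of
    lseek() calls that completed.
    """
    fd, path = tempfile.mkstemp(prefix="willitscale.")
    os.unlink(path)  # the file stays usable through fd until it is closed
    done = 0
    try:
        for _ in range(iterations):
            os.lseek(fd, 0, os.SEEK_SET)  # one syscall entry/exit per call
            done += 1
    finally:
        os.close(fd)
    return done


if __name__ == "__main__":
    print(lseek_loop(100_000))
```

Because each iteration does almost no work inside the kernel, a change to the common syscall entry/exit path can plausibly show up directly in per_process_ops for this testcase.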
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-cpx-4s1/lseek1/will-it-scale/0x86000017
commit:
1723be30e4 ("x86/entry: Mark enter_from_user_mode() noinstr")
8f159f1dfa ("x86/entry/common: Protect against instrumentation")
1723be30e46fbda0 8f159f1dfa1ea29d70a84335fe6
---------------- ---------------------------
       %stddev     %change         %stddev
           \          |                \
9977836 -6.4% 9334708 will-it-scale.per_process_ops
1.916e+09 -6.4% 1.792e+09 will-it-scale.workload
1612 -75.1% 401.75 ±173% meminfo.Mlocked
38.00 +2.6% 39.00 vmstat.cpu.us
30383 ± 27% +51.8% 46125 ± 12% numa-meminfo.node1.KReclaimable
30383 ± 27% +51.8% 46125 ± 12% numa-meminfo.node1.SReclaimable
26574 ± 23% -29.2% 18820 ± 14% numa-meminfo.node2.KReclaimable
26574 ± 23% -29.2% 18820 ± 14% numa-meminfo.node2.SReclaimable
82840 ± 12% -14.1% 71200 ± 4% numa-meminfo.node2.SUnreclaim
100.00 ± 26% -79.0% 21.00 ±173% numa-vmstat.node1.nr_mlock
7595 ± 27% +51.8% 11531 ± 12% numa-vmstat.node1.nr_slab_reclaimable
115.50 ± 26% -81.8% 21.00 ±173% numa-vmstat.node2.nr_mlock
6643 ± 23% -29.2% 4704 ± 14% numa-vmstat.node2.nr_slab_reclaimable
20710 ± 12% -14.1% 17799 ± 4% numa-vmstat.node2.nr_slab_unreclaimable
15093 ± 4% +14.5% 17281 ± 4% sched_debug.cpu.sched_count.max
7918 ± 11% +14.3% 9046 ± 4% sched_debug.cpu.ttwu_count.max
0.49 ± 56% -96.8% 0.02 ±160% sched_debug.rt_rq:/.rt_time.avg
93.99 ± 56% -96.8% 3.00 ±160% sched_debug.rt_rq:/.rt_time.max
6.77 ± 56% -96.8% 0.22 ±160% sched_debug.rt_rq:/.rt_time.stddev
296.75 ± 23% +263.2% 1077 ± 67% interrupts.32:PCI-MSI.524290-edge.eth0-TxRx-1
296.75 ± 23% +263.2% 1077 ± 67% interrupts.CPU10.32:PCI-MSI.524290-edge.eth0-TxRx-1
899.00 ± 7% -10.5% 805.00 interrupts.CPU141.CAL:Function_call_interrupts
1204 ± 36% +83.6% 2211 ± 38% interrupts.CPU170.CAL:Function_call_interrupts
1324 ± 28% -30.8% 916.00 ± 23% interrupts.CPU2.CAL:Function_call_interrupts
3042 ± 36% -53.3% 1419 ± 27% interrupts.CPU24.CAL:Function_call_interrupts
1061 ± 24% +83.7% 1950 ± 32% interrupts.CPU72.CAL:Function_call_interrupts
77.25 ±165% -97.1% 2.25 ± 19% interrupts.CPU93.TLB:TLB_shootdowns
769.00 ± 23% -36.9% 485.00 ± 11% interrupts.TLB:TLB_shootdowns
21833 ± 3% +18.8% 25926 ± 7% softirqs.CPU0.RCU
20599 ± 4% +13.5% 23371 ± 8% softirqs.CPU107.RCU
22896 ± 11% +21.8% 27893 ± 5% softirqs.CPU125.RCU
21380 ± 6% +18.5% 25341 ± 7% softirqs.CPU163.RCU
21890 ± 9% +15.1% 25191 ± 6% softirqs.CPU166.RCU
20047 ± 5% +17.0% 23453 ± 8% softirqs.CPU176.RCU
21786 ± 3% +16.2% 25318 ± 8% softirqs.CPU25.RCU
23213 ± 4% +14.6% 26602 ± 6% softirqs.CPU35.RCU
21272 ± 5% +17.4% 24975 ± 8% softirqs.CPU71.RCU
20159 ± 4% +16.1% 23400 ± 7% softirqs.CPU76.RCU
1.176e+11 +2.7% 1.208e+11 perf-stat.i.branch-instructions
1.65 -0.1 1.51 perf-stat.i.branch-miss-rate%
1.934e+09 -6.5% 1.808e+09 perf-stat.i.branch-misses
1.26 -5.7% 1.19 perf-stat.i.cpi
0.00 ± 5% +0.0 0.00 ± 5% perf-stat.i.dTLB-load-miss-rate%
441221 ± 6% +606.9% 3119036 ± 5% perf-stat.i.dTLB-load-misses
1.7e+11 +7.2% 1.823e+11 perf-stat.i.dTLB-loads
16104 ± 2% -4.2% 15432 ± 3% perf-stat.i.dTLB-store-misses
9.743e+10 +17.4% 1.144e+11 perf-stat.i.dTLB-stores
2.243e+09 -24.3% 1.697e+09 ± 2% perf-stat.i.iTLB-load-misses
46888822 +9.4% 51286197 perf-stat.i.iTLB-loads
5.555e+11 +5.5% 5.861e+11 perf-stat.i.instructions
257.71 +37.7% 354.92 perf-stat.i.instructions-per-iTLB-miss
0.80 +6.0% 0.84 perf-stat.i.ipc
1.04 -5.9% 0.98 ± 3% perf-stat.i.metric.K/sec
2005 +8.4% 2174 perf-stat.i.metric.M/sec
0.03 -5.8% 0.03 ± 2% perf-stat.overall.MPKI
1.64 -0.1 1.50 perf-stat.overall.branch-miss-rate%
1.26 -5.7% 1.18 perf-stat.overall.cpi
0.00 ± 8% +0.0 0.00 ± 6% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 2% -0.0 0.00 ± 3% perf-stat.overall.dTLB-store-miss-rate%
247.68 +39.5% 345.39 ± 2% perf-stat.overall.instructions-per-iTLB-miss
0.80 +6.0% 0.84 perf-stat.overall.ipc
87575 +12.6% 98570 perf-stat.overall.path-length
1.172e+11 +2.7% 1.204e+11 perf-stat.ps.branch-instructions
1.927e+09 -6.5% 1.802e+09 perf-stat.ps.branch-misses
473764 ± 8% +561.3% 3132830 ± 6% perf-stat.ps.dTLB-load-misses
1.694e+11 +7.2% 1.817e+11 perf-stat.ps.dTLB-loads
9.71e+10 +17.4% 1.14e+11 perf-stat.ps.dTLB-stores
2.235e+09 -24.3% 1.692e+09 ± 2% perf-stat.ps.iTLB-load-misses
46727836 +9.5% 51154009 perf-stat.ps.iTLB-loads
5.537e+11 +5.5% 5.841e+11 perf-stat.ps.instructions
1.678e+14 +5.3% 1.767e+14 perf-stat.total.instructions
39.88 -3.8 36.04 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
40.97 -2.5 38.46 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.lseek64
22.10 -1.1 20.95 perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
8.75 -0.5 8.27 perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
7.52 -0.5 7.07 perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.44 -0.4 6.05 perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
6.69 -0.2 6.54 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.lseek64
1.98 -0.1 1.88 perf-profile.calltrace.cycles-pp.generic_file_llseek_size.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
98.92 +0.2 99.08 perf-profile.calltrace.cycles-pp.lseek64
2.30 +0.6 2.94 perf-profile.calltrace.cycles-pp.__x64_sys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
0.00 +0.8 0.75 perf-profile.calltrace.cycles-pp.enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
0.00 +1.7 1.69 perf-profile.calltrace.cycles-pp.fpregs_assert_state_consistent.__prepare_exit_to_usermode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
0.00 +2.3 2.33 perf-profile.calltrace.cycles-pp.__syscall_return_slowpath.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
47.50 +4.1 51.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.lseek64
0.00 +4.9 4.85 perf-profile.calltrace.cycles-pp.__prepare_exit_to_usermode.do_syscall_64.entry_SYSCALL_64_after_hwframe.lseek64
0.00 +5.9 5.87 perf-profile.calltrace.cycles-pp.exit_to_user_mode.entry_SYSCALL_64_after_hwframe.lseek64
40.57 -4.0 36.60 perf-profile.children.cycles-pp.do_syscall_64
27.37 -1.6 25.76 perf-profile.children.cycles-pp.entry_SYSCALL_64
22.66 -1.1 21.53 perf-profile.children.cycles-pp.ksys_lseek
19.98 -0.9 19.10 perf-profile.children.cycles-pp.syscall_return_via_sysret
9.42 -0.5 8.92 perf-profile.children.cycles-pp.__fdget_pos
7.52 -0.5 7.07 perf-profile.children.cycles-pp.__fget_light
6.46 -0.4 6.07 perf-profile.children.cycles-pp.shmem_file_llseek
2.27 -0.1 2.17 perf-profile.children.cycles-pp.generic_file_llseek_size
1.18 -0.1 1.11 perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.43 ± 2% -0.0 0.41 ± 2% perf-profile.children.cycles-pp.lseek@plt
1.97 +0.1 2.02 perf-profile.children.cycles-pp.fpregs_assert_state_consistent
2.43 +0.6 3.06 perf-profile.children.cycles-pp.__x64_sys_lseek
0.00 +0.8 0.80 perf-profile.children.cycles-pp.enter_from_user_mode
0.00 +2.3 2.33 perf-profile.children.cycles-pp.__syscall_return_slowpath
47.97 +2.8 50.81 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +5.1 5.06 perf-profile.children.cycles-pp.__prepare_exit_to_usermode
0.00 +7.2 7.18 perf-profile.children.cycles-pp.exit_to_user_mode
13.07 -9.5 3.55 perf-profile.self.cycles-pp.do_syscall_64
18.01 -1.2 16.84 perf-profile.self.cycles-pp.lseek64
19.84 -0.9 18.93 perf-profile.self.cycles-pp.syscall_return_via_sysret
13.61 -0.7 12.89 perf-profile.self.cycles-pp.entry_SYSCALL_64
7.03 -0.4 6.59 perf-profile.self.cycles-pp.__fget_light
6.10 -0.4 5.73 perf-profile.self.cycles-pp.shmem_file_llseek
7.57 -0.3 7.25 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
2.24 -0.1 2.14 perf-profile.self.cycles-pp.generic_file_llseek_size
4.57 -0.1 4.47 perf-profile.self.cycles-pp.ksys_lseek
2.12 -0.1 2.05 perf-profile.self.cycles-pp.__fdget_pos
0.62 -0.0 0.59 perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.43 ± 2% -0.0 0.40 ± 3% perf-profile.self.cycles-pp.lseek@plt
2.11 +0.6 2.75 perf-profile.self.cycles-pp.__x64_sys_lseek
0.00 +0.7 0.68 perf-profile.self.cycles-pp.enter_from_user_mode
0.00 +2.3 2.28 perf-profile.self.cycles-pp.__syscall_return_slowpath
0.00 +3.1 3.09 perf-profile.self.cycles-pp.__prepare_exit_to_usermode
0.00 +7.1 7.08 perf-profile.self.cycles-pp.exit_to_user_mode
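One way to read the perf-profile.self rows above: do_syscall_64 loses about 9.5 points of self time, while four entry/exit functions that had no samples under the parent commit now appear. The sketch below sums those rows to estimate the net change in self cycles for this group. It is back-of-the-envelope arithmetic on the reported percentages, not an additional measurement. Grouping these five symbols together is an assumption, and the percentages are relative to each run's own total cycles, so the comparison is only approximate.

```python
# Self-cycle percentages copied from the perf-profile.self rows of this report.
SELF_BEFORE = {
    "do_syscall_64": 13.07,
}
SELF_AFTER = {
    "do_syscall_64": 3.55,
    "enter_from_user_mode": 0.68,
    "__syscall_return_slowpath": 2.28,
    "__prepare_exit_to_usermode": 3.09,
    "exit_to_user_mode": 7.08,
}


def group_total(profile: dict) -> float:
    """Sum the self-cycle percentages of every symbol in the group."""
    return round(sum(profile.values()), 2)


before_total = group_total(SELF_BEFORE)
after_total = group_total(SELF_AFTER)
net_change = round(after_total - before_total, 2)

if __name__ == "__main__":
    print(f"before: {before_total:.2f}%  after: {after_total:.2f}%  "
          f"net: {net_change:+.2f} points")
```

Under these assumptions the group grows from 13.07% to 16.68% of self cycles, a net increase of about 3.6 points.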
[ASCII plot: will-it-scale.per_process_ops per sample, y-axis from 8.8e+06 to 1e+07]
[ASCII plot: will-it-scale.workload per sample, y-axis from 1.7e+09 to 1.95e+09]
[*] bisect-good sample
[O] bisect-bad sample
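The two headline metrics and the perf-stat path-length figure can be cross-checked with simple arithmetic on the numbers reported above. The sketch assumes that nr_task: 100% means 192 tasks on this machine, that will-it-scale.workload is roughly per_process_ops multiplied by the task count, and that perf-stat.overall.path-length is total instructions divided by workload. These relationships are inferred from the reported values, not taken from the lkp-tests sources, and the helper names are hypothetical.

```python
NR_TASKS = 192  # assumption: nr_task 100% on the 192-thread test machine


def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# Values copied from the comparison table in this report.
base_ops, new_ops = 9977836, 9334708          # will-it-scale.per_process_ops
base_insns, new_insns = 1.678e14, 1.767e14    # perf-stat.total.instructions
base_workload, new_workload = 1.916e9, 1.792e9  # will-it-scale.workload

ops_change = pct_change(base_ops, new_ops)

# Assumed relationship: workload is about per_process_ops * number of tasks.
est_base_workload = base_ops * NR_TASKS
est_new_workload = new_ops * NR_TASKS

# Assumed relationship: path length is instructions per unit of workload.
base_path = base_insns / base_workload
new_path = new_insns / new_workload

if __name__ == "__main__":
    print(f"per_process_ops change: {ops_change:.1f}%")
    print(f"estimated workload: {est_base_workload:.4g} -> {est_new_workload:.4g}")
    print(f"estimated path length: {base_path:.0f} -> {new_path:.0f} "
          f"({pct_change(base_path, new_path):+.1f}%)")
```

If these assumptions hold, the estimates land close to the reported workload and path-length rows, which suggests the regression reflects more instructions executed per lseek call, not a lower instruction rate (the reported IPC actually rises from 0.80 to 0.84).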
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.7.0-14050-g8f159f1dfa1ea" of type "text/plain" (206218 bytes)
View attachment "job-script" of type "text/plain" (7721 bytes)
View attachment "job.yaml" of type "text/plain" (5323 bytes)
View attachment "reproduce" of type "text/plain" (339 bytes)