Message-ID: <20210521080338.GF25531@xsang-OptiPlex-9020>
Date: Fri, 21 May 2021 16:03:38 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [kentry] 5c61d03b2b: will-it-scale.per_thread_ops -5.5% regression
Greetings,
FYI, we noticed a -5.5% regression of will-it-scale.per_thread_ops due to commit:
commit: 5c61d03b2b823992b0e8eba73d2be61947f00323 ("kentry: Simplify the common syscall API")
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git x86/kentry
in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with the following parameters:
nr_task: 50%
mode: thread
test: lseek2
cpufreq_governor: performance
ucode: 0x5003006
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process-based and thread-based versions of the test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
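For readers unfamiliar with the workload, the following is a rough Python analogue of what a shared-file lseek test in thread mode does. It is a sketch, not the will-it-scale C source: it assumes that lseek2 has every thread seek on one shared file descriptor, and the function name `run_shared_lseek` and its parameters are hypothetical. Python's interpreter lock means the absolute numbers say nothing about kernel scalability; the point is only the shape of the loop.

```python
import os
import tempfile
import threading
import time


def run_shared_lseek(nr_threads, seconds):
    """Run nr_threads workers that all lseek() on one shared fd.

    Returns the per-thread iteration counts. Assumption: this mirrors
    the structure of a "same file lseek" test, not its exact code.
    """
    fd, path = tempfile.mkstemp()
    counts = [0] * nr_threads
    stop = threading.Event()

    def worker(idx):
        n = 0
        while not stop.is_set():
            # One syscall per iteration: entry, fd lookup, seek, exit.
            os.lseek(fd, 0, os.SEEK_SET)
            n += 1
        counts[idx] = n

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nr_threads)]
    try:
        for t in threads:
            t.start()
        time.sleep(seconds)
        stop.set()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
        os.unlink(path)
    return counts


if __name__ == "__main__":
    per_thread = run_shared_lseek(nr_threads=4, seconds=1.0)
    print("per-thread iterations:", per_thread)
```

Because each iteration is a single cheap syscall, the cost of the syscall entry and exit path is a large share of the loop, which is why a change to that path can show up in this test.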
If you fix the issue, please add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/lseek2/will-it-scale/0x5003006
commit:
a27da10c7e ("x86/entry: Convert ret_from_fork to C")
5c61d03b2b ("kentry: Simplify the common syscall API")
a27da10c7ee27727 5c61d03b2b823992b0e8eba73d2
---------------- ---------------------------
%stddev %change %stddev
\ | \
3.22e+08 -5.5% 3.042e+08 will-it-scale.44.threads
7318218 -5.5% 6912604 will-it-scale.per_thread_ops
3.22e+08 -5.5% 3.042e+08 will-it-scale.workload
273.17 -1.5% 269.04 turbostat.PkgWatt
1474 ± 7% +14.8% 1692 ± 7% slabinfo.dmaengine-unmap-16.active_objs
1474 ± 7% +14.8% 1692 ± 7% slabinfo.dmaengine-unmap-16.num_objs
1488 ± 6% -20.3% 1187 ± 16% interrupts.CPU10.CAL:Function_call_interrupts
183.33 ± 37% -68.8% 57.17 ± 61% interrupts.CPU53.RES:Rescheduling_interrupts
1208 ± 13% -18.7% 982.33 ± 13% interrupts.CPU7.CAL:Function_call_interrupts
602.33 ± 22% -49.3% 305.50 ± 47% interrupts.CPU84.TLB:TLB_shootdowns
6010 ± 20% -34.8% 3918 ± 39% interrupts.CPU87.NMI:Non-maskable_interrupts
6010 ± 20% -34.8% 3918 ± 39% interrupts.CPU87.PMI:Performance_monitoring_interrupts
10028 ± 18% -26.4% 7379 ± 16% softirqs.CPU44.SCHED
14260 ± 12% -21.1% 11254 ± 10% softirqs.CPU47.RCU
15354 ± 12% -24.3% 11624 ± 13% softirqs.CPU53.RCU
19973 ± 45% +80.6% 36071 ± 11% softirqs.CPU53.SCHED
15299 ± 10% -21.3% 12044 ± 17% softirqs.CPU69.RCU
15325 ± 17% -23.8% 11671 ± 11% softirqs.CPU70.RCU
24871 ± 36% -64.6% 8796 ± 45% softirqs.CPU9.SCHED
0.01 ± 19% -100.0% 0.00 perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.03 ± 8% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
518.50 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.20 ±102% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.03 ± 26% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.kentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.03 ± 8% -100.0% 0.00 perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.19 ± 88% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.kentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
1.20 ±102% -100.0% 0.00 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.03 ± 2% +4.5% 0.03 ± 2% perf-stat.i.MPKI
3.195e+10 -5.6% 3.016e+10 perf-stat.i.branch-instructions
2.367e+08 ± 3% -7.8% 2.182e+08 perf-stat.i.branch-misses
13.98 -0.4 13.59 perf-stat.i.cache-miss-rate%
0.86 +4.1% 0.90 perf-stat.i.cpi
4.292e+10 -3.6% 4.138e+10 perf-stat.i.dTLB-loads
0.00 -0.0 0.00 perf-stat.i.dTLB-store-miss-rate%
46888 -7.1% 43557 ± 2% perf-stat.i.dTLB-store-misses
2.867e+10 -2.4% 2.799e+10 perf-stat.i.dTLB-stores
2.247e+08 ± 3% -6.8% 2.094e+08 perf-stat.i.iTLB-load-misses
1.434e+11 -3.9% 1.378e+11 perf-stat.i.instructions
1.16 -3.9% 1.12 perf-stat.i.ipc
1176 -3.9% 1131 perf-stat.i.metric.M/sec
14.27 -0.4 13.89 perf-stat.overall.cache-miss-rate%
0.86 +4.1% 0.90 perf-stat.overall.cpi
0.00 -0.0 0.00 ± 2% perf-stat.overall.dTLB-store-miss-rate%
1.16 -3.9% 1.12 perf-stat.overall.ipc
133998 +1.8% 136413 perf-stat.overall.path-length
3.184e+10 -5.6% 3.006e+10 perf-stat.ps.branch-instructions
2.36e+08 ± 3% -7.8% 2.176e+08 perf-stat.ps.branch-misses
4.278e+10 -3.6% 4.124e+10 perf-stat.ps.dTLB-loads
46753 -7.1% 43435 ± 2% perf-stat.ps.dTLB-store-misses
2.857e+10 -2.4% 2.789e+10 perf-stat.ps.dTLB-stores
2.24e+08 ± 3% -6.8% 2.087e+08 perf-stat.ps.iTLB-load-misses
1.429e+11 -3.9% 1.373e+11 perf-stat.ps.instructions
4.315e+13 -3.8% 4.149e+13 perf-stat.total.instructions
1.47 ± 12% -0.8 0.72 ± 18% perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
1.32 ± 10% -0.6 0.67 ± 13% perf-profile.calltrace.cycles-pp.___might_sleep.mutex_lock.__fdget_pos.ksys_lseek.do_syscall_64
0.92 ± 11% -0.2 0.70 ± 11% perf-profile.calltrace.cycles-pp.generic_file_llseek_size.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +0.8 0.85 ± 11% perf-profile.calltrace.cycles-pp.kentry_syscall_begin.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
0.00 +1.7 1.71 ± 11% perf-profile.calltrace.cycles-pp.kentry_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_lseek64
7.02 ± 11% +3.5 10.52 ± 9% perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.49 ± 12% -0.8 0.73 ± 18% perf-profile.children.cycles-pp.shmem_file_llseek
1.32 ± 10% -0.6 0.67 ± 13% perf-profile.children.cycles-pp.___might_sleep
0.95 ± 11% -0.2 0.72 ± 12% perf-profile.children.cycles-pp.generic_file_llseek_size
0.43 ± 12% -0.1 0.31 ± 10% perf-profile.children.cycles-pp.__f_unlock_pos
0.35 ± 13% +0.1 0.47 ± 12% perf-profile.children.cycles-pp.__x64_sys_lseek
0.00 +0.3 0.33 ± 11% perf-profile.children.cycles-pp.kentry_enter_from_user_mode
0.00 +0.5 0.50 ± 13% perf-profile.children.cycles-pp.kentry_syscall_end
0.00 +0.9 0.85 ± 11% perf-profile.children.cycles-pp.kentry_syscall_begin
0.00 +1.7 1.71 ± 11% perf-profile.children.cycles-pp.kentry_exit_to_user_mode
7.27 ± 11% +3.6 10.85 ± 9% perf-profile.children.cycles-pp.__fget_light
1.46 ± 12% -0.7 0.72 ± 18% perf-profile.self.cycles-pp.shmem_file_llseek
1.31 ± 10% -0.6 0.67 ± 12% perf-profile.self.cycles-pp.___might_sleep
0.94 ± 11% -0.2 0.71 ± 11% perf-profile.self.cycles-pp.generic_file_llseek_size
0.22 ± 12% -0.1 0.15 ± 9% perf-profile.self.cycles-pp.__f_unlock_pos
0.35 ± 12% +0.1 0.47 ± 12% perf-profile.self.cycles-pp.__x64_sys_lseek
0.17 ± 12% +0.2 0.33 ± 12% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
0.00 +0.2 0.17 ± 11% perf-profile.self.cycles-pp.kentry_enter_from_user_mode
0.00 +0.5 0.49 ± 12% perf-profile.self.cycles-pp.kentry_syscall_end
0.00 +0.7 0.68 ± 10% perf-profile.self.cycles-pp.kentry_syscall_begin
0.00 +1.2 1.21 ± 11% perf-profile.self.cycles-pp.kentry_exit_to_user_mode
1.25 ± 12% +1.4 2.62 ± 10% perf-profile.self.cycles-pp.do_syscall_64
1.11 ± 10% +2.6 3.76 ± 11% perf-profile.self.cycles-pp.__fget_light
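The headline numbers in the table above can be cross-checked with a few lines of arithmetic. The sketch below uses only values copied from the table. Two relationships in it are assumptions that the reported figures happen to be consistent with, not definitions confirmed by this report: that per_thread_ops is the workload divided by the thread count (50% of 88 hardware threads), and that perf-stat.overall.path-length is total instructions divided by the workload.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# Values copied from the comparison table (base commit, tested commit).
per_thread_ops = (7318218, 6912604)
workload = (3.22e8, 3.042e8)
total_instructions = (4.315e13, 4.149e13)
reported_path_length = (133998, 136413)

# nr_task is 50% of the 88 hardware threads on the test machine.
nr_threads = 88 * 50 // 100

print("threads:", nr_threads)
print("per_thread_ops change: %.1f%%" % pct_change(*per_thread_ops))

for label, ops, total, instr, path in zip(
        ("base", "tested"), per_thread_ops, workload,
        total_instructions, reported_path_length):
    # Assumption: per_thread_ops ~= workload / thread count.
    derived_ops = total / nr_threads
    # Assumption: path-length ~= instructions per operation.
    derived_path = instr / total
    print("%s: ops/thread derived %.0f vs reported %d; "
          "instr/op derived %.0f vs reported %d"
          % (label, derived_ops, ops, derived_path, path))
```

The derived values agree with the reported ones to within the rounding of the four-digit inputs. Read this way, the tested commit executes about 1.8% more instructions per operation while IPC drops by 3.9%, so both the longer path and the lower instruction throughput contribute to the regression.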
[ASCII plot] will-it-scale.44.threads: y-axis 3e+08 to 3.25e+08; '+' samples lie at or above 3.2e+08, 'O' samples at or just below 3.05e+08.
[ASCII plot] will-it-scale.per_thread_ops: y-axis 6.8e+06 to 7.4e+06; '+' samples lie around 7.3e+06, 'O' samples around 6.9e+06.
[ASCII plot] will-it-scale.workload: y-axis 3e+08 to 3.25e+08; '+' samples lie at or above 3.2e+08, 'O' samples at or just below 3.05e+08.
[ASCII plots] Four further untitled plots with y-axis ranges 0.02 to 0.055, 0.007 to 0.012, 250 to 600, and 0.02 to 0.055 (the fourth is identical to the first); each shows only '+' samples.
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.13.0-rc1-00237-g5c61d03b2b82" of type "text/plain" (174118 bytes)
View attachment "job-script" of type "text/plain" (8003 bytes)
View attachment "job.yaml" of type "text/plain" (5507 bytes)
View attachment "reproduce" of type "text/plain" (337 bytes)