Message-ID: <20210420063842.GD31773@xsang-OptiPlex-9020>
Date: Tue, 20 Apr 2021 14:38:42 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Stefan Metzmacher <metze@...ba.org>
Cc: Jens Axboe <axboe@...nel.dk>, LKML <linux-kernel@...r.kernel.org>,
lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [io_uring] 7c30f36a98: will-it-scale.per_thread_ops 9.1% improvement
Greetings,
FYI, we noticed a 9.1% improvement in will-it-scale.per_thread_ops due to commit:
commit: 7c30f36a98ae488741178d69662e4f2baa53e7f6 ("io_uring: run __io_sq_thread() with the initial creds from io_uring_setup()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:
nr_task: 50%
mode: thread
test: unix1
cpufreq_governor: performance
ucode: 0x16
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
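For readers unfamiliar with the unix1 case, here is a minimal sketch of the kind of loop it exercises. This is an assumption inferred from the profile further down (__libc_write and __libc_read ending in unix_stream_sendmsg and unix_stream_recvmsg), not a copy of the actual C test in the will-it-scale repository. The function name unix1_iterations and the duration parameter are hypothetical.

```python
import os
import socket
import time


def unix1_iterations(duration_s: float = 0.1) -> int:
    """Count 1-byte write/read round trips over an AF_UNIX stream socket pair.

    Hypothetical stand-in for the unix1 inner loop; the real test is written in C.
    """
    # A connected pair of AF_UNIX stream sockets, similar to socketpair(2) in C.
    writer, reader = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    iterations = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            # write(2) on a socket fd goes through the kernel's socket send path.
            os.write(writer.fileno(), b"x")
            # read(2) on the peer fd drains the byte that was just queued.
            data = os.read(reader.fileno(), 1)
            if data != b"x":
                raise RuntimeError("unexpected payload on socket pair")
            iterations += 1
    finally:
        writer.close()
        reader.close()
    return iterations


if __name__ == "__main__":
    print("round trips:", unix1_iterations(0.5))
```

The absolute count from a Python loop is not comparable to the per_thread_ops figures in this report; the sketch only illustrates which syscalls dominate the workload.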
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/unix1/will-it-scale/0x16
commit:
678eeba481 ("io-wq: warn on creating manager while exiting")
7c30f36a98 ("io_uring: run __io_sq_thread() with the initial creds from io_uring_setup()")
678eeba481d8c161 7c30f36a98ae488741178d69662
---------------- ---------------------------
value [± %stddev]   %change   value [± %stddev]   metric
30824092 +9.1% 33623774 will-it-scale.72.threads
428111 +9.1% 466996 will-it-scale.per_thread_ops
30824092 +9.1% 33623774 will-it-scale.workload
314351 ± 4% -8.6% 287222 numa-meminfo.node0.Unevictable
0.04 ±116% +27922.2% 9.90 ±123% perf-sched.sch_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.rcu_gp_kthread.kthread
15.00 +6.7% 16.00 vmstat.cpu.us
78587 ± 4% -8.6% 71805 numa-vmstat.node0.nr_unevictable
78587 ± 4% -8.6% 71805 numa-vmstat.node0.nr_zone_unevictable
1769 -12.7% 1544 syscalls.sys_read.med
1780 -9.4% 1613 syscalls.sys_write.med
19842 ± 3% -20.6% 15756 ± 12% softirqs.CPU11.RCU
12942 ± 8% +17.0% 15137 ± 10% softirqs.CPU134.RCU
13720 ± 11% +19.2% 16356 ± 10% softirqs.CPU55.RCU
36667 ± 8% -41.0% 21647 ± 38% softirqs.CPU83.SCHED
266.33 ± 8% -47.4% 140.00 ± 58% interrupts.CPU11.RES:Rescheduling_interrupts
1118 ± 19% -47.1% 592.00 ± 50% interrupts.CPU11.TLB:TLB_shootdowns
992.50 ± 14% -35.1% 643.67 ± 39% interrupts.CPU120.TLB:TLB_shootdowns
1914 ± 35% +136.0% 4518 ± 43% interrupts.CPU129.NMI:Non-maskable_interrupts
1914 ± 35% +136.0% 4518 ± 43% interrupts.CPU129.PMI:Performance_monitoring_interrupts
36.17 ± 71% +206.9% 111.00 ± 44% interrupts.CPU131.RES:Rescheduling_interrupts
1159 ± 18% +72.2% 1996 ± 32% interrupts.CPU134.CAL:Function_call_interrupts
374.83 ± 61% +139.0% 895.67 ± 40% interrupts.CPU134.TLB:TLB_shootdowns
2810 ± 37% +134.0% 6578 ± 33% interrupts.CPU45.NMI:Non-maskable_interrupts
2810 ± 37% +134.0% 6578 ± 33% interrupts.CPU45.PMI:Performance_monitoring_interrupts
1605 ± 19% +76.6% 2836 ± 48% interrupts.CPU52.CAL:Function_call_interrupts
2231 ± 27% -37.2% 1400 ± 22% interrupts.CPU62.CAL:Function_call_interrupts
6880 ± 25% -46.7% 3669 ± 57% interrupts.CPU62.NMI:Non-maskable_interrupts
6880 ± 25% -46.7% 3669 ± 57% interrupts.CPU62.PMI:Performance_monitoring_interrupts
226.50 ± 18% -47.5% 119.00 ± 63% interrupts.CPU62.RES:Rescheduling_interrupts
1169 ± 18% -44.4% 650.83 ± 52% interrupts.CPU62.TLB:TLB_shootdowns
235.00 ± 13% -59.4% 95.33 ± 65% interrupts.CPU63.RES:Rescheduling_interrupts
384.17 ± 64% +120.0% 845.33 ± 30% interrupts.CPU84.TLB:TLB_shootdowns
1870 ± 8% -26.7% 1370 ± 29% interrupts.CPU93.CAL:Function_call_interrupts
1092 ± 16% -45.3% 597.33 ± 66% interrupts.CPU93.TLB:TLB_shootdowns
3.702e+10 +9.1% 4.038e+10 perf-stat.i.branch-instructions
4.711e+08 +9.0% 5.134e+08 perf-stat.i.branch-misses
1.13 -8.4% 1.04 perf-stat.i.cpi
5.421e+10 +9.0% 5.909e+10 perf-stat.i.dTLB-loads
61939091 +8.9% 67468763 perf-stat.i.dTLB-store-misses
3.777e+10 +8.9% 4.112e+10 perf-stat.i.dTLB-stores
64979413 ± 2% +9.4% 71098260 ± 2% perf-stat.i.iTLB-load-misses
1.703e+08 ± 3% +14.3% 1.947e+08 ± 15% perf-stat.i.iTLB-loads
1.857e+11 +9.1% 2.026e+11 perf-stat.i.instructions
0.89 +9.2% 0.97 perf-stat.i.ipc
896.93 +9.0% 978.06 perf-stat.i.metric.M/sec
22535 ± 7% +13.9% 25662 ± 5% perf-stat.i.node-loads
0.07 -9.1% 0.06 ± 2% perf-stat.overall.MPKI
1.13 -8.4% 1.03 perf-stat.overall.cpi
0.89 +9.1% 0.97 perf-stat.overall.ipc
3.687e+10 +9.1% 4.023e+10 perf-stat.ps.branch-instructions
4.693e+08 +9.0% 5.116e+08 perf-stat.ps.branch-misses
5.399e+10 +9.1% 5.888e+10 perf-stat.ps.dTLB-loads
61667162 +9.0% 67198925 perf-stat.ps.dTLB-store-misses
3.761e+10 +8.9% 4.097e+10 perf-stat.ps.dTLB-stores
64692296 ± 2% +9.5% 70834658 ± 2% perf-stat.ps.iTLB-load-misses
1.695e+08 ± 3% +14.4% 1.939e+08 ± 15% perf-stat.ps.iTLB-loads
1.849e+11 +9.1% 2.018e+11 perf-stat.ps.instructions
23463 ± 8% +13.9% 26730 ± 8% perf-stat.ps.node-loads
5.594e+13 +9.1% 6.104e+13 perf-stat.total.instructions
31.07 -2.3 28.80 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
37.16 -2.2 35.01 ± 9% perf-profile.calltrace.cycles-pp.__libc_read
20.02 -1.5 18.51 ± 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
17.67 ± 2% -1.5 16.19 ± 9% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
20.78 ± 2% -1.4 19.34 ± 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
16.56 -1.4 15.14 ± 9% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
15.50 -1.3 14.21 ± 9% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
14.88 ± 2% -1.3 13.62 ± 9% perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.59 -1.2 15.38 ± 9% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
13.54 -1.2 12.38 ± 9% perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.18 -1.2 12.02 ± 9% perf-profile.calltrace.cycles-pp.sock_read_iter.new_sync_read.vfs_read.ksys_read.do_syscall_64
14.42 ± 2% -1.1 13.28 ± 9% perf-profile.calltrace.cycles-pp.sock_write_iter.new_sync_write.vfs_write.ksys_write.do_syscall_64
13.50 ± 2% -1.0 12.52 ± 9% perf-profile.calltrace.cycles-pp.sock_sendmsg.sock_write_iter.new_sync_write.vfs_write.ksys_write
12.25 ± 2% -0.9 11.36 ± 9% perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.new_sync_write.vfs_write
3.14 ± 2% -0.8 2.31 ± 9% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__libc_read
10.57 -0.8 9.81 ± 9% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__libc_read
1.70 ± 2% -0.5 1.17 ± 9% perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.new_sync_read.vfs_read.ksys_read
2.15 ± 2% -0.4 1.72 ± 9% perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.new_sync_write
1.05 ± 2% -0.2 0.81 ± 9% perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
0.60 -0.2 0.44 ± 44% perf-profile.calltrace.cycles-pp.unix_write_space.sock_wfree.unix_destruct_scm.skb_release_head_state.skb_release_all
1.52 ± 2% -0.2 1.36 ± 8% perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic
1.61 -0.2 1.46 ± 8% perf-profile.calltrace.cycles-pp.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
1.64 -0.2 1.49 ± 9% perf-profile.calltrace.cycles-pp.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
40.84 -2.9 37.89 ± 9% perf-profile.children.cycles-pp.do_syscall_64
37.53 -2.1 35.38 ± 9% perf-profile.children.cycles-pp.__libc_read
17.70 ± 2% -1.5 16.22 ± 9% perf-profile.children.cycles-pp.ksys_write
16.60 -1.4 15.19 ± 9% perf-profile.children.cycles-pp.vfs_write
15.55 -1.3 14.26 ± 9% perf-profile.children.cycles-pp.vfs_read
14.91 ± 2% -1.3 13.65 ± 9% perf-profile.children.cycles-pp.new_sync_write
16.62 -1.2 15.41 ± 9% perf-profile.children.cycles-pp.ksys_read
13.22 -1.2 12.06 ± 9% perf-profile.children.cycles-pp.sock_read_iter
13.58 -1.2 12.42 ± 9% perf-profile.children.cycles-pp.new_sync_read
14.47 ± 2% -1.1 13.33 ± 9% perf-profile.children.cycles-pp.sock_write_iter
13.52 ± 2% -1.0 12.55 ± 9% perf-profile.children.cycles-pp.sock_sendmsg
12.36 ± 2% -0.9 11.42 ± 9% perf-profile.children.cycles-pp.unix_stream_sendmsg
5.51 ± 2% -0.8 4.72 ± 8% perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
1.71 ± 2% -0.5 1.20 ± 9% perf-profile.children.cycles-pp.sock_recvmsg
2.18 ± 2% -0.4 1.74 ± 9% perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
1.22 ± 2% -0.3 0.88 ± 8% perf-profile.children.cycles-pp.__x86_retpoline_rax
0.52 -0.2 0.31 ± 9% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.93 -0.2 0.72 ± 9% perf-profile.children.cycles-pp.fsnotify
2.25 ± 2% -0.2 2.06 ± 9% perf-profile.children.cycles-pp.__check_object_size
1.55 ± 2% -0.2 1.40 ± 8% perf-profile.children.cycles-pp.unix_destruct_scm
1.64 -0.2 1.49 ± 9% perf-profile.children.cycles-pp.skb_release_all
1.62 -0.2 1.47 ± 8% perf-profile.children.cycles-pp.skb_release_head_state
0.58 -0.1 0.44 ± 8% perf-profile.children.cycles-pp.__virt_addr_valid
0.47 ± 4% -0.1 0.35 ± 9% perf-profile.children.cycles-pp.wait_for_unix_gc
0.61 ± 2% -0.1 0.51 ± 8% perf-profile.children.cycles-pp.unix_write_space
0.40 -0.1 0.31 ± 10% perf-profile.children.cycles-pp.__x64_sys_read
0.63 -0.1 0.55 ± 9% perf-profile.children.cycles-pp.__might_sleep
0.35 ± 3% -0.1 0.29 ± 8% perf-profile.children.cycles-pp.apparmor_socket_getpeersec_dgram
0.13 -0.0 0.08 ± 8% perf-profile.children.cycles-pp.check_stack_object
0.13 ± 5% -0.0 0.09 ± 10% perf-profile.children.cycles-pp.unix_scm_to_skb
0.45 ± 3% -0.0 0.42 ± 8% perf-profile.children.cycles-pp.security_socket_getpeersec_dgram
0.09 ± 5% -0.0 0.06 ± 11% perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
0.57 ± 2% +0.1 0.71 ± 12% perf-profile.children.cycles-pp.__ksize
1.12 ± 2% -0.8 0.37 ± 9% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.50 ± 2% -0.4 0.09 ± 12% perf-profile.self.cycles-pp.sock_recvmsg
1.13 -0.4 0.73 ± 9% perf-profile.self.cycles-pp.sock_read_iter
0.98 ± 3% -0.3 0.68 ± 9% perf-profile.self.cycles-pp.__x86_retpoline_rax
0.36 -0.2 0.13 ± 8% perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
0.89 -0.2 0.69 ± 9% perf-profile.self.cycles-pp.fsnotify
0.29 ± 3% -0.2 0.10 ± 9% perf-profile.self.cycles-pp.security_socket_recvmsg
0.93 ± 2% -0.2 0.74 ± 9% perf-profile.self.cycles-pp.sock_write_iter
0.92 ± 4% -0.2 0.76 ± 10% perf-profile.self.cycles-pp.ftrace_syscall_exit
0.40 ± 2% -0.2 0.25 ± 10% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
1.19 ± 2% -0.1 1.04 ± 9% perf-profile.self.cycles-pp.unix_stream_sendmsg
0.56 ± 2% -0.1 0.42 ± 9% perf-profile.self.cycles-pp.__virt_addr_valid
0.46 ± 6% -0.1 0.34 ± 9% perf-profile.self.cycles-pp.syscall_trace_enter
0.37 ± 2% -0.1 0.26 ± 8% perf-profile.self.cycles-pp.new_sync_write
0.25 ± 3% -0.1 0.14 ± 11% perf-profile.self.cycles-pp.alloc_skb_with_frags
0.34 -0.1 0.24 ± 10% perf-profile.self.cycles-pp.__x64_sys_read
0.59 -0.1 0.50 ± 8% perf-profile.self.cycles-pp.unix_write_space
0.40 ± 2% -0.1 0.31 ± 9% perf-profile.self.cycles-pp.unix_destruct_scm
0.55 ± 2% -0.1 0.47 ± 9% perf-profile.self.cycles-pp.__alloc_skb
0.28 ± 2% -0.1 0.20 ± 9% perf-profile.self.cycles-pp.ksys_write
0.48 ± 2% -0.1 0.41 ± 10% perf-profile.self.cycles-pp.vfs_read
0.16 ± 6% -0.1 0.09 ± 13% perf-profile.self.cycles-pp.skb_copy_datagram_iter
0.49 ± 2% -0.1 0.43 ± 9% perf-profile.self.cycles-pp.unix_stream_recvmsg
0.13 ± 5% -0.1 0.08 ± 11% perf-profile.self.cycles-pp.wait_for_unix_gc
0.13 ± 5% -0.1 0.07 ± 10% perf-profile.self.cycles-pp.unix_scm_to_skb
0.21 ± 5% -0.1 0.16 ± 11% perf-profile.self.cycles-pp.sock_alloc_send_pskb
0.48 -0.1 0.43 ± 10% perf-profile.self.cycles-pp.vfs_write
0.08 ± 7% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
0.11 ± 4% -0.0 0.06 ± 11% perf-profile.self.cycles-pp.check_stack_object
0.30 ± 3% -0.0 0.26 ± 8% perf-profile.self.cycles-pp.apparmor_socket_getpeersec_dgram
0.14 ± 4% -0.0 0.10 ± 9% perf-profile.self.cycles-pp.sock_sendmsg
0.22 ± 4% -0.0 0.18 ± 9% perf-profile.self.cycles-pp.do_syscall_64
0.21 ± 2% -0.0 0.18 ± 12% perf-profile.self.cycles-pp.__skb_datagram_iter
0.27 ± 2% -0.0 0.23 ± 9% perf-profile.self.cycles-pp.__x64_sys_write
0.23 ± 4% -0.0 0.19 ± 8% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.18 ± 3% +0.1 0.24 ± 8% perf-profile.self.cycles-pp.security_file_permission
0.23 ± 2% +0.1 0.31 ± 8% perf-profile.self.cycles-pp.ksys_read
0.22 ± 3% +0.1 0.30 ± 9% perf-profile.self.cycles-pp.apparmor_socket_recvmsg
0.55 +0.1 0.70 ± 12% perf-profile.self.cycles-pp.__ksize
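As a rough cross-check of the %change column, the sketch below recomputes the relative change for a few rows copied from the table above. The helper name pct_change is hypothetical, and the formula (new - base) / base is an assumption about how the column is derived. It matches the rows shown here to one decimal place, but the robot works from unrounded per-run data, so small differences are possible for rows whose values are printed with few digits.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100.0


# (base, new) pairs copied from the comparison table in this report.
rows = {
    "will-it-scale.per_thread_ops": (428111, 466996),
    "will-it-scale.workload": (30824092, 33623774),
    "syscalls.sys_read.med": (1769, 1544),
    "syscalls.sys_write.med": (1780, 1613),
}

if __name__ == "__main__":
    for name, (base, new) in rows.items():
        print(f"{name}: {pct_change(base, new):+.1f}%")
```

Note that the perf-profile rows report an absolute difference in percentage points of cycles (for example 31.07 to 28.80 is shown as -2.3), so this formula does not apply to them.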
will-it-scale.per_thread_ops
[ASCII plot not recoverable: per_thread_ops for each bisect sample, y-axis from 425000 to 480000]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc2-00011-g7c30f36a98ae" of type "text/plain" (172883 bytes)
View attachment "job-script" of type "text/plain" (7797 bytes)
View attachment "job.yaml" of type "text/plain" (5335 bytes)
View attachment "reproduce" of type "text/plain" (336 bytes)