Message-ID: <20161013013557.GA24130@yexl-desktop>
Date: Thu, 13 Oct 2016 09:35:57 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mike Galbraith <efault@....de>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [sched/core] 0e369d7575: nuttcp.throughput_Mbps 9.3% improvement
FYI, we noticed a 9.3% improvement of nuttcp.throughput_Mbps due to commit:
commit 0e369d757578b23ac50b893f920aa50fdbc45fb6 ("sched/core: Replace sd_busy/nr_busy_cpus with sched_domain_shared")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: nuttcp
on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
with following parameters:
runtime: 300s
cluster: cs-localhost
cpufreq_governor: performance
nuttcp is a network performance measurement tool intended for use
by network and system managers.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
cluster/compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/testcase:
cs-localhost/gcc-6/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/300s/lkp-ivb-d04/nuttcp
commit:
24fc7edb92 ("sched/core: Introduce 'struct sched_domain_shared'")
0e369d7575 ("sched/core: Replace sd_busy/nr_busy_cpus with sched_domain_shared")
24fc7edb92eea059 (parent commit)   0e369d757578b23ac50b893f92 (commit under test)
----------------                   --------------------------
value ± %stddev      %change       value ± %stddev      metric
30321 ± 1% +9.3% 33141 ± 0% nuttcp.throughput_Mbps
5878 ± 45% +186.7% 16854 ± 47% nuttcp.time.involuntary_context_switches
298926 ± 2% +6.1% 317150 ± 1% nuttcp.time.voluntary_context_switches
6281 ± 85% -75.5% 1540 ± 48% latency_stats.sum.pipe_wait.wait_for_partner.fifo_open.do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
118278 ± 4% +29.6% 153276 ± 2% softirqs.SCHED
17592029 ± 11% +24.1% 21829957 ± 2% cpuidle.C1E-IVB.time
49718031 ± 5% -35.2% 32195246 ± 2% cpuidle.C3-IVB.time
170850 ± 4% -37.9% 106016 ± 3% cpuidle.C3-IVB.usage
670986 ± 1% +21.0% 812210 ± 0% cpuidle.C6-IVB.usage
21.23 ± 11% +99.1% 42.26 ± 1% turbostat.CPU%c1
2.83 ± 27% +288.7% 10.99 ± 6% turbostat.CPU%c3
29.88 ± 10% -96.5% 1.05 ± 15% turbostat.CPU%c6
11.99 ± 1% +15.6% 13.86 ± 0% turbostat.CorWatt
30.15 ± 0% +6.6% 32.14 ± 0% turbostat.PkgWatt
43288 ± 12% -40.8% 25626 ± 30% sched_debug.cfs_rq:/.exec_clock.min
16229 ± 30% +86.8% 30320 ± 23% sched_debug.cfs_rq:/.exec_clock.stddev
102248 ± 14% -41.9% 59367 ± 26% sched_debug.cfs_rq:/.min_vruntime.min
0.24 ± 11% +49.4% 0.35 ± 17% sched_debug.cfs_rq:/.nr_running.stddev
28.17 ± 20% -55.8% 12.46 ± 91% sched_debug.cfs_rq:/.runnable_load_avg.min
13618 ±192% -489.9% -53102 ±-79% sched_debug.cfs_rq:/.spread0.avg
53522 ± 64% -58.2% 22370 ±152% sched_debug.cfs_rq:/.spread0.max
-27026 ±-101% +363.0% -125124 ±-39% sched_debug.cfs_rq:/.spread0.min
109619 ± 1% +18.0% 129351 ± 7% sched_debug.cpu.nr_load_updates.max
80596 ± 4% -11.6% 71229 ± 5% sched_debug.cpu.nr_load_updates.min
12139 ± 10% +84.3% 22367 ± 20% sched_debug.cpu.nr_load_updates.stddev
6.876e+10 ± 4% +16.8% 8.031e+10 ± 2% perf-stat.branch-instructions
1.37 ± 3% -10.3% 1.23 ± 0% perf-stat.branch-miss-rate%
23.20 ± 6% +13.7% 26.39 ± 2% perf-stat.cache-miss-rate%
1.228e+10 ± 10% +34.0% 1.646e+10 ± 2% perf-stat.cache-misses
5.298e+10 ± 8% +17.8% 6.238e+10 ± 2% perf-stat.cache-references
1.665e+12 ± 6% +14.8% 1.912e+12 ± 0% perf-stat.cpu-cycles
0.43 ± 14% -64.6% 0.15 ± 21% perf-stat.dTLB-load-miss-rate%
1.097e+09 ± 14% -58.9% 4.509e+08 ± 20% perf-stat.dTLB-load-misses
2.549e+11 ± 3% +16.4% 2.967e+11 ± 2% perf-stat.dTLB-loads
0.05 ± 27% -54.2% 0.02 ± 17% perf-stat.dTLB-store-miss-rate%
1.204e+08 ± 29% -52.2% 57536054 ± 17% perf-stat.dTLB-store-misses
74316106 ± 12% -18.6% 60529852 ± 1% perf-stat.iTLB-load-misses
3.751e+11 ± 4% +18.3% 4.437e+11 ± 1% perf-stat.instructions
5131 ± 13% +42.8% 7330 ± 1% perf-stat.instructions-per-iTLB-miss
287090 ± 0% -1.2% 283555 ± 0% perf-stat.minor-faults
287090 ± 0% -1.2% 283555 ± 0% perf-stat.page-faults
1.50 ± 8% -13.9% 1.29 ± 4% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_current.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg
1.27 ± 12% -19.1% 1.03 ± 16% perf-profile.calltrace.cycles-pp.__free_pages_ok.free_compound_page.__put_compound_page.__put_page.skb_release_data
2.74 ± 8% -19.0% 2.22 ± 13% perf-profile.calltrace.cycles-pp.__kfree_skb.tcp_recvmsg.inet_recvmsg.sock_recvmsg.sock_read_iter
5.50 ± 6% -27.5% 3.99 ± 11% perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.ip_finish_output2.ip_finish_output.ip_output.ip_local_out
4.67 ± 8% -26.8% 3.42 ± 13% perf-profile.calltrace.cycles-pp.__netif_receive_skb.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack
4.63 ± 8% -26.7% 3.40 ± 13% perf-profile.calltrace.cycles-pp.__netif_receive_skb_core.__netif_receive_skb.process_backlog.net_rx_action.__softirqentry_text_start
1.54 ± 10% -20.3% 1.23 ± 13% perf-profile.calltrace.cycles-pp.__put_compound_page.__put_page.skb_release_data.skb_release_all.__kfree_skb
1.56 ± 10% -20.9% 1.24 ± 14% perf-profile.calltrace.cycles-pp.__put_page.skb_release_data.skb_release_all.__kfree_skb.tcp_recvmsg
5.25 ± 6% -26.8% 3.84 ± 11% perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2
45.39 ± 4% -13.7% 39.18 ± 10% perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath.read
1.64 ± 8% -16.0% 1.38 ± 2% perf-profile.calltrace.cycles-pp.alloc_pages_current.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg.inet_sendmsg
5.42 ± 6% -27.5% 3.93 ± 11% perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_finish_output.ip_output
5.34 ± 6% -27.2% 3.88 ± 11% perf-profile.calltrace.cycles-pp.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_finish_output
46.20 ± 3% -14.0% 39.72 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath.read
44.90 ± 4% -13.6% 38.78 ± 10% perf-profile.calltrace.cycles-pp.inet_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read.vfs_read
20.23 ± 4% +46.4% 29.62 ± 14% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
6.73 ± 4% -23.6% 5.14 ± 12% perf-profile.calltrace.cycles-pp.ip_finish_output.ip_output.ip_local_out.ip_queue_xmit.tcp_transmit_skb
6.61 ± 5% -23.7% 5.04 ± 12% perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_finish_output.ip_output.ip_local_out.ip_queue_xmit
3.94 ± 9% -30.2% 2.75 ± 9% perf-profile.calltrace.cycles-pp.ip_local_deliver.ip_rcv_finish.ip_rcv.__netif_receive_skb_core.__netif_receive_skb
3.88 ± 8% -36.5% 2.47 ± 19% perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.ip_local_deliver.ip_rcv_finish.ip_rcv.__netif_receive_skb_core
4.84 ± 8% -30.2% 3.38 ± 9% perf-profile.calltrace.cycles-pp.ip_local_out.ip_queue_xmit.tcp_transmit_skb.tcp_send_ack.tcp_cleanup_rbuf
4.69 ± 8% -26.4% 3.45 ± 9% perf-profile.calltrace.cycles-pp.ip_output.ip_local_out.ip_queue_xmit.tcp_transmit_skb.tcp_send_ack
4.95 ± 8% -30.5% 3.44 ± 9% perf-profile.calltrace.cycles-pp.ip_queue_xmit.tcp_transmit_skb.tcp_send_ack.tcp_cleanup_rbuf.tcp_recvmsg
4.29 ± 8% -27.3% 3.12 ± 13% perf-profile.calltrace.cycles-pp.ip_rcv.__netif_receive_skb_core.__netif_receive_skb.process_backlog.net_rx_action
4.08 ± 9% -30.3% 2.84 ± 9% perf-profile.calltrace.cycles-pp.ip_rcv_finish.ip_rcv.__netif_receive_skb_core.__netif_receive_skb.process_backlog
4.95 ± 7% -26.1% 3.66 ± 12% perf-profile.calltrace.cycles-pp.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip
4.79 ± 8% -26.8% 3.51 ± 13% perf-profile.calltrace.cycles-pp.process_backlog.net_rx_action.__softirqentry_text_start.do_softirq_own_stack.do_softirq
46.36 ± 3% -14.1% 39.84 ± 10% perf-profile.calltrace.cycles-pp.read
1.85 ± 8% -13.7% 1.60 ± 3% perf-profile.calltrace.cycles-pp.sk_page_frag_refill.tcp_sendmsg.inet_sendmsg.sock_sendmsg.sock_write_iter
1.83 ± 8% -14.1% 1.57 ± 2% perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.tcp_sendmsg.inet_sendmsg.sock_sendmsg
2.50 ± 9% -19.4% 2.02 ± 14% perf-profile.calltrace.cycles-pp.skb_release_all.__kfree_skb.tcp_recvmsg.inet_recvmsg.sock_recvmsg
2.34 ± 9% -21.0% 1.85 ± 15% perf-profile.calltrace.cycles-pp.skb_release_data.skb_release_all.__kfree_skb.tcp_recvmsg.inet_recvmsg
45.19 ± 4% -13.7% 39.00 ± 10% perf-profile.calltrace.cycles-pp.sock_read_iter.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
45.08 ± 4% -13.7% 38.91 ± 10% perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.__vfs_read.vfs_read.sys_read
46.07 ± 3% -14.0% 39.62 ± 10% perf-profile.calltrace.cycles-pp.sys_read.entry_SYSCALL_64_fastpath.read
6.30 ± 8% -29.3% 4.46 ± 11% perf-profile.calltrace.cycles-pp.tcp_cleanup_rbuf.tcp_recvmsg.inet_recvmsg.sock_recvmsg.sock_read_iter
1.63 ± 8% -19.4% 1.31 ± 9% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish.ip_local_deliver
44.73 ± 4% -13.5% 38.69 ± 10% perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read
6.16 ± 8% -29.6% 4.34 ± 10% perf-profile.calltrace.cycles-pp.tcp_send_ack.tcp_cleanup_rbuf.tcp_recvmsg.inet_recvmsg.sock_recvmsg
5.11 ± 7% -31.2% 3.52 ± 8% perf-profile.calltrace.cycles-pp.tcp_transmit_skb.tcp_send_ack.tcp_cleanup_rbuf.tcp_recvmsg.inet_recvmsg
3.00 ± 10% -29.2% 2.12 ± 20% perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_local_deliver_finish.ip_local_deliver.ip_rcv_finish.ip_rcv
45.92 ± 3% -13.9% 39.53 ± 10% perf-profile.calltrace.cycles-pp.vfs_read.sys_read.entry_SYSCALL_64_fastpath.read
1.57 ± 7% -14.9% 1.34 ± 4% perf-profile.children.cycles-pp.__alloc_pages_nodemask
1.32 ± 7% -15.9% 1.11 ± 5% perf-profile.children.cycles-pp.__dev_queue_xmit
1.33 ± 9% -17.9% 1.09 ± 14% perf-profile.children.cycles-pp.__free_pages_ok
3.17 ± 6% -16.5% 2.64 ± 13% perf-profile.children.cycles-pp.__kfree_skb
5.96 ± 6% -24.0% 4.53 ± 14% perf-profile.children.cycles-pp.__local_bh_enable_ip
4.91 ± 9% -24.0% 3.73 ± 14% perf-profile.children.cycles-pp.__netif_receive_skb
4.87 ± 9% -23.9% 3.71 ± 14% perf-profile.children.cycles-pp.__netif_receive_skb_core
1.58 ± 8% -19.6% 1.27 ± 12% perf-profile.children.cycles-pp.__put_compound_page
1.61 ± 8% -20.0% 1.29 ± 12% perf-profile.children.cycles-pp.__put_page
5.74 ± 7% -24.0% 4.36 ± 13% perf-profile.children.cycles-pp.__softirqentry_text_start
45.41 ± 4% -13.6% 39.23 ± 10% perf-profile.children.cycles-pp.__vfs_read
1.69 ± 8% -15.8% 1.42 ± 3% perf-profile.children.cycles-pp.alloc_pages_current
1.53 ± 11% -39.0% 0.94 ± 16% perf-profile.children.cycles-pp.apic_timer_interrupt
22.38 ± 4% +34.7% 30.14 ± 15% perf-profile.children.cycles-pp.call_cpuidle
23.52 ± 4% +32.5% 31.17 ± 14% perf-profile.children.cycles-pp.cpu_startup_entry
22.37 ± 4% +34.7% 30.13 ± 15% perf-profile.children.cycles-pp.cpuidle_enter
21.12 ± 6% +41.2% 29.83 ± 15% perf-profile.children.cycles-pp.cpuidle_enter_state
5.72 ± 7% -24.5% 4.32 ± 14% perf-profile.children.cycles-pp.do_softirq
5.65 ± 7% -24.2% 4.28 ± 14% perf-profile.children.cycles-pp.do_softirq_own_stack
44.92 ± 4% -13.6% 38.80 ± 10% perf-profile.children.cycles-pp.inet_recvmsg
20.23 ± 4% +46.4% 29.63 ± 14% perf-profile.children.cycles-pp.intel_idle
7.37 ± 7% -22.4% 5.72 ± 12% perf-profile.children.cycles-pp.ip_finish_output
7.24 ± 7% -22.6% 5.60 ± 12% perf-profile.children.cycles-pp.ip_finish_output2
4.14 ± 10% -24.6% 3.12 ± 14% perf-profile.children.cycles-pp.ip_local_deliver
4.08 ± 9% -24.4% 3.08 ± 15% perf-profile.children.cycles-pp.ip_local_deliver_finish
8.43 ± 7% -22.6% 6.53 ± 11% perf-profile.children.cycles-pp.ip_local_out
8.09 ± 7% -22.7% 6.25 ± 12% perf-profile.children.cycles-pp.ip_output
8.72 ± 7% -22.9% 6.72 ± 11% perf-profile.children.cycles-pp.ip_queue_xmit
4.52 ± 9% -24.5% 3.41 ± 14% perf-profile.children.cycles-pp.ip_rcv
4.28 ± 9% -24.4% 3.24 ± 14% perf-profile.children.cycles-pp.ip_rcv_finish
5.21 ± 8% -23.3% 4.00 ± 14% perf-profile.children.cycles-pp.net_rx_action
5.04 ± 9% -24.1% 3.83 ± 14% perf-profile.children.cycles-pp.process_backlog
46.36 ± 3% -14.1% 39.84 ± 10% perf-profile.children.cycles-pp.read
1.86 ± 8% -14.0% 1.60 ± 3% perf-profile.children.cycles-pp.sk_page_frag_refill
1.84 ± 8% -13.8% 1.59 ± 2% perf-profile.children.cycles-pp.skb_page_frag_refill
2.82 ± 6% -17.3% 2.33 ± 12% perf-profile.children.cycles-pp.skb_release_all
2.58 ± 6% -19.1% 2.08 ± 13% perf-profile.children.cycles-pp.skb_release_data
1.49 ± 10% -39.0% 0.91 ± 15% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
45.19 ± 4% -13.7% 39.00 ± 10% perf-profile.children.cycles-pp.sock_read_iter
45.08 ± 4% -13.7% 38.91 ± 10% perf-profile.children.cycles-pp.sock_recvmsg
46.08 ± 3% -13.9% 39.69 ± 10% perf-profile.children.cycles-pp.sys_read
6.31 ± 8% -29.4% 4.46 ± 11% perf-profile.children.cycles-pp.tcp_cleanup_rbuf
44.74 ± 4% -13.5% 38.70 ± 10% perf-profile.children.cycles-pp.tcp_recvmsg
6.57 ± 4% -27.5% 4.76 ± 9% perf-profile.children.cycles-pp.tcp_send_ack
9.69 ± 6% -21.9% 7.58 ± 11% perf-profile.children.cycles-pp.tcp_transmit_skb
3.57 ± 9% -24.7% 2.69 ± 16% perf-profile.children.cycles-pp.tcp_v4_rcv
45.94 ± 3% -13.8% 39.59 ± 10% perf-profile.children.cycles-pp.vfs_read
20.23 ± 4% +46.5% 29.62 ± 14% perf-profile.self.cycles-pp.intel_idle
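The %change column above can be sanity-checked from the two mean values in each row. The following is a minimal sketch, not part of the lkp tooling: the helper name pct_change is hypothetical, and because the report prints rounded means, a recomputed figure may differ from the printed one in the last digit for some rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent.

    Hypothetical helper for illustration; it only restates the usual
    (new - base) / base * 100 arithmetic.
    """
    if base == 0:
        raise ValueError("base value must be non-zero")
    return (new - base) / base * 100.0


# (metric, parent-commit mean, tested-commit mean), copied from the table above.
rows = [
    ("nuttcp.throughput_Mbps", 30321, 33141),
    ("softirqs.SCHED", 118278, 153276),
    ("cpuidle.C3-IVB.time", 49718031, 32195246),
]

for metric, base, new in rows:
    print(f"{pct_change(base, new):+7.1f}%  {metric}")
```

For the three rows listed, this reproduces the reported +9.3%, +29.6% and -35.2%.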
[ASCII plots of per-sample values for the three metrics below; layout not recoverable]
turbostat.PkgWatt (y axis 29.5 to 32.5)
turbostat.CorWatt (y axis 11.5 to 14.5)
nuttcp.throughput_Mbps (y axis 29500 to 34500)
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong
View attachment "config-4.8.0-rc8-00068-g0e369d7" of type "text/plain" (152576 bytes)
View attachment "job-script" of type "text/plain" (6526 bytes)
View attachment "job.yaml" of type "text/plain" (4033 bytes)
View attachment "reproduce" of type "text/plain" (150 bytes)