Message-ID: <20160816142642.GA24206@yexl-desktop>
Date: Tue, 16 Aug 2016 22:26:43 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Ville Syrjälä <ville.syrjala@...ux.intel.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Borislav Petkov <bp@...e.de>, "H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...capital.net>,
Brian Gerst <brgerst@...il.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [x86/hweight] 65ea11ec6a: will-it-scale.per_process_ops 9.3% improvement
FYI, we noticed a 9.3% improvement in will-it-scale.per_process_ops due to commit:
commit 65ea11ec6a82b1d44aba62b59e9eb20247e57c6e ("x86/hweight: Don't clobber %rdi")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 32 threads Sandy Bridge-EP with 64G memory
with the following parameters:
test: unix1
cpufreq_governor: performance
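The headline figure can be checked against the two per_process_ops means listed in the comparison table of this report (1063041 for v4.8-rc1, 1161810 for 65ea11ec6a). A minimal Python sketch of that arithmetic follows; the helper name percent_change is made up for illustration and is not part of lkp-tests:

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

# Means copied from the will-it-scale rows of this report.
per_process = percent_change(1063041, 1161810)
per_thread = percent_change(976004, 1063615)

print(f"per_process_ops: {per_process:+.1f}%")
print(f"per_thread_ops:  {per_thread:+.1f}%")
```

Rounded to one decimal place, both results agree with the %change column of the report.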
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
gcc-6/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sb03/unix1/will-it-scale
commit:
v4.8-rc1
65ea11ec6a ("x86/hweight: Don't clobber %rdi")
        v4.8-rc1 65ea11ec6a82b1d44aba62b59e
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          1:8          -12%            :4     last_state.is_incomplete_run
          4:8          -50%            :4     kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
          7:8          -88%            :4     kmsg.drm:drm_edid_block_valid[drm]]*ERROR*EDID_checksum_is_invalid,remainder_is
          7:8          -88%            :4     kmsg.i8042:Can't_read_CTR_while_initializing_i8042
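The %reproduction column above appears to be the change in failure rate between the two kernels, in percentage points. That reading is an assumption inferred from the numbers, not taken from the lkp-tests source. A small Python sketch under that assumption, treating an empty fail count (":4") as zero failures; the helper name failure_rate is hypothetical:

```python
def failure_rate(cell: str) -> float:
    """Failure rate in percent for a 'fail:runs' cell such as '7:8' or ':4'."""
    fails, runs = cell.split(":")
    # An empty fail count (as in ':4') is assumed to mean zero failures.
    return 100.0 * int(fails or 0) / int(runs)

# (v4.8-rc1 cell, 65ea11ec6a cell, event name), copied from the table above.
rows = [
    ("1:8", ":4", "last_state.is_incomplete_run"),
    ("4:8", ":4", "kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]"),
    ("7:8", ":4", "kmsg.drm:drm_edid_block_valid[drm]]*ERROR*EDID_checksum_is_invalid,remainder_is"),
    ("7:8", ":4", "kmsg.i8042:Can't_read_CTR_while_initializing_i8042"),
]

# Difference in failure rate, patched minus base, in percentage points.
deltas = {name: failure_rate(new) - failure_rate(base) for base, new, name in rows}
for name, delta in deltas.items():
    print(f"{delta:+.1f} points  {name}")
```

The computed differences are -12.5, -50.0, -87.5 and -87.5 points, which the report shows rounded to whole percentages.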
v4.8-rc1 ± %stddev   %change   65ea11ec6a ± %stddev   metric
1063041 ± 0% +9.3% 1161810 ± 0% will-it-scale.per_process_ops
976004 ± 0% +9.0% 1063615 ± 0% will-it-scale.per_thread_ops
0.57 ± 0% -6.7% 0.53 ± 1% will-it-scale.scalability
175.96 ± 0% +8.0% 190.10 ± 0% will-it-scale.time.user_time
0.00 ± 20% -31.5% 0.00 ± 26% sched_debug.cpu.next_balance.stddev
101.14 ± 11% +9639.4% 9850 ±121% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
148.57 ± 15% +57704.4% 85880 ±125% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
886.00 ± 14% +9757.0% 87333 ±123% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
3.041e+12 ± 1% +7.4% 3.267e+12 ± 1% perf-stat.branch-instructions
0.31 ± 0% -86.6% 0.04 ± 4% perf-stat.branch-miss-rate
9.456e+09 ± 1% -85.6% 1.364e+09 ± 3% perf-stat.branch-misses
5.147e+12 ± 1% +5.4% 5.427e+12 ± 1% perf-stat.dTLB-loads
3.869e+12 ± 0% +6.7% 4.128e+12 ± 1% perf-stat.dTLB-stores
29.02 ± 13% +223.2% 93.80 ± 0% perf-stat.iTLB-load-miss-rate
2.353e+08 ± 21% +733.0% 1.96e+09 ± 0% perf-stat.iTLB-load-misses
5.7e+08 ± 9% -77.2% 1.297e+08 ± 10% perf-stat.iTLB-loads
1.696e+13 ± 0% +6.9% 1.814e+13 ± 0% perf-stat.instructions
75030 ± 18% -87.7% 9251 ± 1% perf-stat.instructions-per-iTLB-miss
1.04 ± 0% +7.6% 1.12 ± 1% perf-stat.ipc
24064971 ± 3% -6.6% 22469931 ± 3% perf-stat.node-load-misses
53705459 ± 1% -3.1% 52034054 ± 2% perf-stat.node-loads
7.32 ± 5% +23.3% 9.03 ± 4% perf-profile.cycles.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg
1.29 ± 4% +11.7% 1.44 ± 5% perf-profile.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
1.15 ± 4% +12.1% 1.29 ± 4% perf-profile.cycles.__fget.__fget_light.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
1.22 ± 5% +11.7% 1.36 ± 5% perf-profile.cycles.__fget_light.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
1.86 ± 4% -58.4% 0.77 ± 7% perf-profile.cycles.__inode_security_revalidate.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write
0.00 ± -1% +Inf% 2.65 ± 5% perf-profile.cycles.__kmalloc_node_track_caller.__kmalloc_reserve.isra.33.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
1.89 ± 8% -100.0% 0.00 ± -1% perf-profile.cycles.__kmalloc_node_track_caller.__kmalloc_reserve.isra.35.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
0.00 ± -1% +Inf% 3.55 ± 5% perf-profile.cycles.__kmalloc_reserve.isra.33.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
2.52 ± 8% -100.0% 0.00 ± -1% perf-profile.cycles.__kmalloc_reserve.isra.35.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
1.43 ± 4% -91.1% 0.13 ±173% perf-profile.cycles.__might_sleep.__inode_security_revalidate.selinux_file_permission.security_file_permission.rw_verify_area
1.15 ± 5% -65.7% 0.40 ± 57% perf-profile.cycles.__might_sleep.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.33 ± 7% +14.0% 1.52 ± 2% perf-profile.cycles._raw_spin_lock_irqsave.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
1.37 ± 6% +20.4% 1.65 ± 3% perf-profile.cycles._raw_spin_lock_irqsave.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.09 ± 9% +15.6% 1.26 ± 5% perf-profile.cycles._raw_spin_unlock_irqrestore.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
1.01 ± 6% +15.4% 1.17 ± 7% perf-profile.cycles._raw_spin_unlock_irqrestore.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
8.01 ± 6% +22.5% 9.82 ± 4% perf-profile.cycles.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
7.33 ± 6% +14.8% 8.42 ± 4% perf-profile.cycles.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
0.98 ± 8% +15.0% 1.12 ± 4% perf-profile.cycles.consume_skb.unix_stream_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read
1.60 ± 5% +18.7% 1.91 ± 3% perf-profile.cycles.copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
2.30 ± 4% +11.5% 2.56 ± 6% perf-profile.cycles.entry_SYSCALL_64
2.10 ± 3% +18.1% 2.48 ± 5% perf-profile.cycles.entry_SYSCALL_64_after_swapgs
2.82 ± 7% -34.6% 1.85 ± 6% perf-profile.cycles.file_has_perm.selinux_file_permission.security_file_permission.rw_verify_area.vfs_read
1.55 ± 6% +21.3% 1.89 ± 5% perf-profile.cycles.fput.entry_SYSCALL_64_fastpath
1.13 ± 9% +17.0% 1.32 ± 3% perf-profile.cycles.kfree.skb_free_head.skb_release_data.skb_release_all.consume_skb
0.76 ± 8% +21.9% 0.93 ± 5% perf-profile.cycles.kfree_skbmem.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
0.77 ± 10% +27.0% 0.98 ± 5% perf-profile.cycles.ksize.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
2.08 ± 6% -31.5% 1.42 ± 6% perf-profile.cycles.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
0.89 ± 9% +18.8% 1.06 ± 6% perf-profile.cycles.mutex_unlock.unix_stream_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read
6.80 ± 3% -19.3% 5.49 ± 3% perf-profile.cycles.rw_verify_area.vfs_read.sys_read.entry_SYSCALL_64_fastpath
5.54 ± 4% -23.5% 4.24 ± 5% perf-profile.cycles.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
6.21 ± 4% -19.5% 5.00 ± 3% perf-profile.cycles.security_file_permission.rw_verify_area.vfs_read.sys_read.entry_SYSCALL_64_fastpath
5.23 ± 4% -25.6% 3.89 ± 5% perf-profile.cycles.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
4.67 ± 4% -24.1% 3.55 ± 4% perf-profile.cycles.selinux_file_permission.security_file_permission.rw_verify_area.vfs_read.sys_read
4.87 ± 5% -28.0% 3.51 ± 5% perf-profile.cycles.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
2.43 ± 5% +29.8% 3.15 ± 3% perf-profile.cycles.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
1.18 ± 8% +16.1% 1.36 ± 2% perf-profile.cycles.skb_free_head.skb_release_data.skb_release_all.consume_skb.unix_stream_read_generic
2.60 ± 7% +15.4% 3.00 ± 3% perf-profile.cycles.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
6.30 ± 6% +15.2% 7.26 ± 4% perf-profile.cycles.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
1.45 ± 7% +19.4% 1.73 ± 2% perf-profile.cycles.skb_release_data.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
4.63 ± 6% +14.4% 5.30 ± 5% perf-profile.cycles.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
1.01 ± 4% +16.7% 1.18 ± 5% perf-profile.cycles.skb_set_owner_w.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
2.59 ± 6% +18.2% 3.07 ± 4% perf-profile.cycles.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
9.66 ± 5% +21.1% 11.70 ± 3% perf-profile.cycles.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
25.86 ± 5% +14.8% 29.68 ± 4% perf-profile.cycles.sock_sendmsg.sock_write_iter.__vfs_write.vfs_write.sys_write
3.88 ± 7% +13.1% 4.38 ± 5% perf-profile.cycles.sock_wfree.unix_destruct_scm.skb_release_head_state.skb_release_all.consume_skb
4.24 ± 7% +13.3% 4.80 ± 5% perf-profile.cycles.unix_destruct_scm.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic
21.96 ± 5% +17.1% 25.71 ± 3% perf-profile.cycles.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write.vfs_write
1.20 ± 6% -100.0% 0.00 ± -1% perf-profile.cycles.unix_stream_sendmsg.sock_write_iter.__vfs_write.vfs_write.sys_write
2.28 ± 6% +13.7% 2.60 ± 3% perf-profile.cycles.unix_write_space.sock_wfree.unix_destruct_scm.skb_release_head_state.skb_release_all
3.84 ± 5% -16.8% 3.20 ± 2% perf-profile.func.cycles.___might_sleep
1.96 ± 7% +20.8% 2.36 ± 4% perf-profile.func.cycles.__alloc_skb
2.40 ± 4% +11.3% 2.67 ± 4% perf-profile.func.cycles.__fget
1.30 ± 9% +48.7% 1.94 ± 4% perf-profile.func.cycles.__kmalloc_node_track_caller
1.05 ± 5% +12.6% 1.19 ± 7% perf-profile.func.cycles.__vfs_read
0.99 ± 7% +27.1% 1.26 ± 4% perf-profile.func.cycles.__vfs_write
1.01 ± 5% -51.9% 0.48 ± 3% perf-profile.func.cycles._cond_resched
2.78 ± 6% +17.0% 3.25 ± 2% perf-profile.func.cycles._raw_spin_lock_irqsave
2.19 ± 8% +15.5% 2.53 ± 6% perf-profile.func.cycles._raw_spin_unlock_irqrestore
1.10 ± 8% +11.2% 1.23 ± 4% perf-profile.func.cycles.consume_skb
0.97 ± 5% +25.6% 1.22 ± 3% perf-profile.func.cycles.copy_from_iter
2.30 ± 4% +11.5% 2.56 ± 6% perf-profile.func.cycles.entry_SYSCALL_64
2.10 ± 3% +18.1% 2.48 ± 5% perf-profile.func.cycles.entry_SYSCALL_64_after_swapgs
2.26 ± 4% -38.4% 1.39 ± 5% perf-profile.func.cycles.file_has_perm
1.55 ± 6% +21.3% 1.89 ± 5% perf-profile.func.cycles.fput
1.18 ± 8% +17.2% 1.38 ± 3% perf-profile.func.cycles.kfree
0.86 ± 10% +22.0% 1.05 ± 4% perf-profile.func.cycles.ksize
0.90 ± 8% +18.7% 1.06 ± 5% perf-profile.func.cycles.mutex_unlock
1.91 ± 6% -13.1% 1.66 ± 3% perf-profile.func.cycles.selinux_file_permission
1.05 ± 5% +16.7% 1.23 ± 5% perf-profile.func.cycles.skb_set_owner_w
1.66 ± 8% +16.3% 1.93 ± 7% perf-profile.func.cycles.sock_wfree
2.44 ± 4% -39.7% 1.47 ± 2% perf-profile.func.cycles.sock_write_iter
4.20 ± 6% -21.1% 3.32 ± 3% perf-profile.func.cycles.unix_stream_sendmsg
2.35 ± 6% +14.3% 2.69 ± 3% perf-profile.func.cycles.unix_write_space
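Several perf-stat rows in the table above are ratios of other rows and can be roughly cross-checked. The Python sketch below recomputes three of them from the rounded means. The formulas are assumptions inferred from the numbers (for example, that iTLB-load-miss-rate is misses divided by misses plus loads), not definitions taken from lkp-tests or perf:

```python
# Means copied from the perf-stat rows of this report: (v4.8-rc1, 65ea11ec6a).
stats = {
    "branch-instructions": (3.041e12, 3.267e12),
    "branch-misses": (9.456e09, 1.364e09),
    "iTLB-load-misses": (2.353e08, 1.96e09),
    "iTLB-loads": (5.7e08, 1.297e08),
    "instructions": (1.696e13, 1.814e13),
}

def derived(col: int) -> dict:
    """Recompute ratio metrics for one kernel (0 = v4.8-rc1, 1 = 65ea11ec6a)."""
    s = {name: values[col] for name, values in stats.items()}
    itlb_total = s["iTLB-load-misses"] + s["iTLB-loads"]
    return {
        # Assumed definitions; see the note above.
        "branch-miss-rate": 100 * s["branch-misses"] / s["branch-instructions"],
        "iTLB-load-miss-rate": 100 * s["iTLB-load-misses"] / itlb_total,
        "instructions-per-iTLB-miss": s["instructions"] / s["iTLB-load-misses"],
    }

base, patched = derived(0), derived(1)
for name in base:
    print(f"{name:28s} {base[name]:12.2f} {patched[name]:12.2f}")
```

The recomputed values land close to the reported ones. The largest gap is the v4.8-rc1 instructions-per-iTLB-miss figure (about 72000 here against 75030 reported), presumably because the report averages per-run ratios while this sketch divides the rounded means, and the iTLB-load-misses mean for that kernel has a ± 21% spread.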
Thanks,
Xiaolong
View attachment "config-4.8.0-rc1-00001-g65ea11e" of type "text/plain" (152588 bytes)
View attachment "job.yaml" of type "text/plain" (3834 bytes)
View attachment "reproduce" of type "text/plain" (163 bytes)