Message-ID: <9FF32F53-5EF8-40D4-B696-A30FDF7201E1@zytor.com>
Date:	Tue, 16 Aug 2016 09:59:00 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	kernel test robot <xiaolong.ye@...el.com>,
	Ville Syrjälä <ville.syrjala@...ux.intel.com>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Borislav Petkov <bp@...e.de>,
	Andy Lutomirski <luto@...capital.net>,
	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: Re: [lkp] [x86/hweight]  65ea11ec6a:  will-it-scale.per_process_ops 9.3% improvement

On August 16, 2016 7:26:43 AM PDT, kernel test robot <xiaolong.ye@...el.com> wrote:
>
>FYI, we noticed a 9.3% improvement of will-it-scale.per_process_ops due
>to commit:
>
>commit 65ea11ec6a82b1d44aba62b59e9eb20247e57c6e ("x86/hweight: Don't clobber %rdi")
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
>in testcase: will-it-scale
>on test machine: 32 threads Sandy Bridge-EP with 64G memory
>with following parameters:
>
>	test: unix1
>	cpufreq_governor: performance
>
>
>Disclaimer:
>Results have been estimated based on internal Intel analysis and are provided
>for informational purposes only. Any difference in system hardware or software
>design or configuration may affect actual performance.
>
>Details are as below:
>-------------------------------------------------------------------------------------------------->
>
>
>To reproduce:
>
>        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>        cd lkp-tests
>        bin/lkp install job.yaml  # job file is attached in this email
>        bin/lkp run     job.yaml
>
>=========================================================================================
>compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>gcc-6/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-sb03/unix1/will-it-scale
>
>commit: 
>  v4.8-rc1
>  65ea11ec6a ("x86/hweight: Don't clobber %rdi")
>
>        v4.8-rc1 65ea11ec6a82b1d44aba62b59e 
>---------------- -------------------------- 
>       fail:runs  %reproduction    fail:runs
>           |             |             |    
>       1:8          -12%            :4     last_state.is_incomplete_run
>       4:8          -50%            :4     kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
>       7:8          -88%            :4     kmsg.drm:drm_edid_block_valid[drm]]*ERROR*EDID_checksum_is_invalid,remainder_is
>       7:8          -88%            :4     kmsg.i8042:Can't_read_CTR_while_initializing_i8042
>         %stddev     %change         %stddev
>             \          |                \  
>   1063041 ±  0%      +9.3%    1161810 ±  0%  will-it-scale.per_process_ops
>    976004 ±  0%      +9.0%    1063615 ±  0%  will-it-scale.per_thread_ops
>      0.57 ±  0%      -6.7%       0.53 ±  1%  will-it-scale.scalability
>    175.96 ±  0%      +8.0%     190.10 ±  0%  will-it-scale.time.user_time
>      0.00 ± 20%     -31.5%       0.00 ± 26%  sched_debug.cpu.next_balance.stddev
>    101.14 ± 11%   +9639.4%       9850 ±121%  latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
>    148.57 ± 15%  +57704.4%      85880 ±125%  latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
>    886.00 ± 14%   +9757.0%      87333 ±123%  latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_execute.rpc_run_task.nfs4_call_sync_sequence.[nfsv4]._nfs4_proc_getattr.[nfsv4].nfs4_proc_getattr.[nfsv4].__nfs_revalidate_inode.nfs_do_access.nfs_permission.__inode_permission.inode_permission
> 3.041e+12 ±  1%      +7.4%  3.267e+12 ±  1%  perf-stat.branch-instructions
>      0.31 ±  0%     -86.6%       0.04 ±  4%  perf-stat.branch-miss-rate
> 9.456e+09 ±  1%     -85.6%  1.364e+09 ±  3%  perf-stat.branch-misses
> 5.147e+12 ±  1%      +5.4%  5.427e+12 ±  1%  perf-stat.dTLB-loads
> 3.869e+12 ±  0%      +6.7%  4.128e+12 ±  1%  perf-stat.dTLB-stores
>     29.02 ± 13%    +223.2%      93.80 ±  0%  perf-stat.iTLB-load-miss-rate
> 2.353e+08 ± 21%    +733.0%   1.96e+09 ±  0%  perf-stat.iTLB-load-misses
>   5.7e+08 ±  9%     -77.2%  1.297e+08 ± 10%  perf-stat.iTLB-loads
> 1.696e+13 ±  0%      +6.9%  1.814e+13 ±  0%  perf-stat.instructions
>     75030 ± 18%     -87.7%       9251 ±  1%  perf-stat.instructions-per-iTLB-miss
>      1.04 ±  0%      +7.6%       1.12 ±  1%  perf-stat.ipc
>  24064971 ±  3%      -6.6%   22469931 ±  3%  perf-stat.node-load-misses
>  53705459 ±  1%      -3.1%   52034054 ±  2%  perf-stat.node-loads
>      7.32 ±  5%     +23.3%       9.03 ±  4%  perf-profile.cycles.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg
>      1.29 ±  4%     +11.7%       1.44 ±  5%  perf-profile.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
>      1.15 ±  4%     +12.1%       1.29 ±  4%  perf-profile.cycles.__fget.__fget_light.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
>      1.22 ±  5%     +11.7%       1.36 ±  5%  perf-profile.cycles.__fget_light.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
>      1.86 ±  4%     -58.4%       0.77 ±  7%  perf-profile.cycles.__inode_security_revalidate.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write
>      0.00 ± -1%      +Inf%       2.65 ±  5%  perf-profile.cycles.__kmalloc_node_track_caller.__kmalloc_reserve.isra.33.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
>      1.89 ±  8%    -100.0%       0.00 ± -1%  perf-profile.cycles.__kmalloc_node_track_caller.__kmalloc_reserve.isra.35.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
>      0.00 ± -1%      +Inf%       3.55 ±  5%  perf-profile.cycles.__kmalloc_reserve.isra.33.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
>      2.52 ±  8%    -100.0%       0.00 ± -1%  perf-profile.cycles.__kmalloc_reserve.isra.35.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
>      1.43 ±  4%     -91.1%       0.13 ±173%  perf-profile.cycles.__might_sleep.__inode_security_revalidate.selinux_file_permission.security_file_permission.rw_verify_area
>      1.15 ±  5%     -65.7%       0.40 ± 57%  perf-profile.cycles.__might_sleep.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
>      1.33 ±  7%     +14.0%       1.52 ±  2%  perf-profile.cycles._raw_spin_lock_irqsave.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      1.37 ±  6%     +20.4%       1.65 ±  3%  perf-profile.cycles._raw_spin_lock_irqsave.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
>      1.09 ±  9%     +15.6%       1.26 ±  5%  perf-profile.cycles._raw_spin_unlock_irqrestore.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      1.01 ±  6%     +15.4%       1.17 ±  7%  perf-profile.cycles._raw_spin_unlock_irqrestore.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
>      8.01 ±  6%     +22.5%       9.82 ±  4%  perf-profile.cycles.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      7.33 ±  6%     +14.8%       8.42 ±  4%  perf-profile.cycles.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
>      0.98 ±  8%     +15.0%       1.12 ±  4%  perf-profile.cycles.consume_skb.unix_stream_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read
>      1.60 ±  5%     +18.7%       1.91 ±  3%  perf-profile.cycles.copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      2.30 ±  4%     +11.5%       2.56 ±  6%  perf-profile.cycles.entry_SYSCALL_64
>      2.10 ±  3%     +18.1%       2.48 ±  5%  perf-profile.cycles.entry_SYSCALL_64_after_swapgs
>      2.82 ±  7%     -34.6%       1.85 ±  6%  perf-profile.cycles.file_has_perm.selinux_file_permission.security_file_permission.rw_verify_area.vfs_read
>      1.55 ±  6%     +21.3%       1.89 ±  5%  perf-profile.cycles.fput.entry_SYSCALL_64_fastpath
>      1.13 ±  9%     +17.0%       1.32 ±  3%  perf-profile.cycles.kfree.skb_free_head.skb_release_data.skb_release_all.consume_skb
>      0.76 ±  8%     +21.9%       0.93 ±  5%  perf-profile.cycles.kfree_skbmem.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
>      0.77 ± 10%     +27.0%       0.98 ±  5%  perf-profile.cycles.ksize.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
>      2.08 ±  6%     -31.5%       1.42 ±  6%  perf-profile.cycles.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
>      0.89 ±  9%     +18.8%       1.06 ±  6%  perf-profile.cycles.mutex_unlock.unix_stream_recvmsg.sock_recvmsg.sock_read_iter.__vfs_read
>      6.80 ±  3%     -19.3%       5.49 ±  3%  perf-profile.cycles.rw_verify_area.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>      5.54 ±  4%     -23.5%       4.24 ±  5%  perf-profile.cycles.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>      6.21 ±  4%     -19.5%       5.00 ±  3%  perf-profile.cycles.security_file_permission.rw_verify_area.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>      5.23 ±  4%     -25.6%       3.89 ±  5%  perf-profile.cycles.security_file_permission.rw_verify_area.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>      4.67 ±  4%     -24.1%       3.55 ±  4%  perf-profile.cycles.selinux_file_permission.security_file_permission.rw_verify_area.vfs_read.sys_read
>      4.87 ±  5%     -28.0%       3.51 ±  5%  perf-profile.cycles.selinux_file_permission.security_file_permission.rw_verify_area.vfs_write.sys_write
>      2.43 ±  5%     +29.8%       3.15 ±  3%  perf-profile.cycles.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
>      1.18 ±  8%     +16.1%       1.36 ±  2%  perf-profile.cycles.skb_free_head.skb_release_data.skb_release_all.consume_skb.unix_stream_read_generic
>      2.60 ±  7%     +15.4%       3.00 ±  3%  perf-profile.cycles.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
>      6.30 ±  6%     +15.2%       7.26 ±  4%  perf-profile.cycles.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
>      1.45 ±  7%     +19.4%       1.73 ±  2%  perf-profile.cycles.skb_release_data.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
>      4.63 ±  6%     +14.4%       5.30 ±  5%  perf-profile.cycles.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
>      1.01 ±  4%     +16.7%       1.18 ±  5%  perf-profile.cycles.skb_set_owner_w.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      2.59 ±  6%     +18.2%       3.07 ±  4%  perf-profile.cycles.skb_unlink.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.sock_read_iter
>      9.66 ±  5%     +21.1%      11.70 ±  3%  perf-profile.cycles.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
>     25.86 ±  5%     +14.8%      29.68 ±  4%  perf-profile.cycles.sock_sendmsg.sock_write_iter.__vfs_write.vfs_write.sys_write
>      3.88 ±  7%     +13.1%       4.38 ±  5%  perf-profile.cycles.sock_wfree.unix_destruct_scm.skb_release_head_state.skb_release_all.consume_skb
>      4.24 ±  7%     +13.3%       4.80 ±  5%  perf-profile.cycles.unix_destruct_scm.skb_release_head_state.skb_release_all.consume_skb.unix_stream_read_generic
>     21.96 ±  5%     +17.1%      25.71 ±  3%  perf-profile.cycles.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write.vfs_write
>      1.20 ±  6%    -100.0%       0.00 ± -1%  perf-profile.cycles.unix_stream_sendmsg.sock_write_iter.__vfs_write.vfs_write.sys_write
>      2.28 ±  6%     +13.7%       2.60 ±  3%  perf-profile.cycles.unix_write_space.sock_wfree.unix_destruct_scm.skb_release_head_state.skb_release_all
>      3.84 ±  5%     -16.8%       3.20 ±  2%  perf-profile.func.cycles.___might_sleep
>      1.96 ±  7%     +20.8%       2.36 ±  4%  perf-profile.func.cycles.__alloc_skb
>      2.40 ±  4%     +11.3%       2.67 ±  4%  perf-profile.func.cycles.__fget
>      1.30 ±  9%     +48.7%       1.94 ±  4%  perf-profile.func.cycles.__kmalloc_node_track_caller
>      1.05 ±  5%     +12.6%       1.19 ±  7%  perf-profile.func.cycles.__vfs_read
>      0.99 ±  7%     +27.1%       1.26 ±  4%  perf-profile.func.cycles.__vfs_write
>      1.01 ±  5%     -51.9%       0.48 ±  3%  perf-profile.func.cycles._cond_resched
>      2.78 ±  6%     +17.0%       3.25 ±  2%  perf-profile.func.cycles._raw_spin_lock_irqsave
>      2.19 ±  8%     +15.5%       2.53 ±  6%  perf-profile.func.cycles._raw_spin_unlock_irqrestore
>      1.10 ±  8%     +11.2%       1.23 ±  4%  perf-profile.func.cycles.consume_skb
>      0.97 ±  5%     +25.6%       1.22 ±  3%  perf-profile.func.cycles.copy_from_iter
>      2.30 ±  4%     +11.5%       2.56 ±  6%  perf-profile.func.cycles.entry_SYSCALL_64
>      2.10 ±  3%     +18.1%       2.48 ±  5%  perf-profile.func.cycles.entry_SYSCALL_64_after_swapgs
>      2.26 ±  4%     -38.4%       1.39 ±  5%  perf-profile.func.cycles.file_has_perm
>      1.55 ±  6%     +21.3%       1.89 ±  5%  perf-profile.func.cycles.fput
>      1.18 ±  8%     +17.2%       1.38 ±  3%  perf-profile.func.cycles.kfree
>      0.86 ± 10%     +22.0%       1.05 ±  4%  perf-profile.func.cycles.ksize
>      0.90 ±  8%     +18.7%       1.06 ±  5%  perf-profile.func.cycles.mutex_unlock
>      1.91 ±  6%     -13.1%       1.66 ±  3%  perf-profile.func.cycles.selinux_file_permission
>      1.05 ±  5%     +16.7%       1.23 ±  5%  perf-profile.func.cycles.skb_set_owner_w
>      1.66 ±  8%     +16.3%       1.93 ±  7%  perf-profile.func.cycles.sock_wfree
>      2.44 ±  4%     -39.7%       1.47 ±  2%  perf-profile.func.cycles.sock_write_iter
>      4.20 ±  6%     -21.1%       3.32 ±  3%  perf-profile.func.cycles.unix_stream_sendmsg
>      2.35 ±  6%     +14.3%       2.69 ±  3%  perf-profile.func.cycles.unix_write_space
>
>
>
>Thanks,
>Xiaolong

Dang...
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.
