Message-ID: <202403151041.2a9a00df-oliver.sang@intel.com>
Date: Fri, 15 Mar 2024 11:17:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Jakub Kicinski <kuba@...nel.org>, <netdev@...r.kernel.org>,
	<ying.huang@...el.com>, <feng.tang@...el.com>, <fengwei.yin@...el.com>,
	<oliver.sang@...el.com>
Subject: [linus:master] [af_unix] d9f21b3613: stress-ng.sockfd.ops_per_sec
 9.1% improvement



Hello,

The kernel test robot noticed a 9.1% improvement of stress-ng.sockfd.ops_per_sec on:


commit: d9f21b3613337b55cc9d4a6ead484dca68475143 ("af_unix: Try to run GC async.")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockfd
	cpufreq_governor: performance
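
The sockfd stressor measured here repeatedly passes open file descriptors
between processes over AF_UNIX stream sockets using SCM_RIGHTS ancillary
messages; each transfer goes through unix_inflight()/unix_notinflight(),
the lock path visible in the perf profile further down. A minimal Python
sketch of that core operation (an illustration only, not the stress-ng
implementation; requires Python 3.9+ for socket.send_fds/recv_fds):

```python
import os
import socket

# An AF_UNIX socket pair, as the stressor's parent/child processes use.
parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Open a pipe and send its read end in-flight over the socket via
# SCM_RIGHTS; in the kernel this is the unix_attach_fds()/unix_inflight()
# path seen in the profile.
r, w = os.pipe()
socket.send_fds(parent, [b"fd"], [r])

# Receive it on the other end; the kernel installs a duplicate descriptor
# (the unix_detach_fds()/unix_notinflight() path).
msg, fds, flags, addr = socket.recv_fds(child, 16, 1)

# Prove the received descriptor refers to the same underlying pipe.
os.write(w, b"hello")
data = os.read(fds[0], 5)
print(data)  # b'hello'
```

Descriptors left in-flight (sent but never received) are what the AF_UNIX
garbage collector reclaims; the patched commit moves that GC to run
asynchronously instead of in the sender's context.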






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240315/202403151041.2a9a00df-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockfd/stress-ng/60s

commit: 
  8b90a9f819 ("af_unix: Run GC on only one CPU.")
  d9f21b3613 ("af_unix: Try to run GC async.")

8b90a9f819dc2a06 d9f21b3613337b55cc9d4a6ead4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     25305 ±  4%      +9.7%      27753 ±  2%  perf-c2c.HITM.total
     64392            +1.8%      65544        vmstat.system.cs
   1926720            +1.4%    1954260        proc-vmstat.numa_hit
   1694682            +1.5%    1719926        proc-vmstat.numa_local
   3151070            +3.4%    3257664        proc-vmstat.pgalloc_normal
      0.28 ±  8%     -15.0%       0.24 ±  9%  sched_debug.cfs_rq:/.h_nr_running.stddev
    259.21 ±  7%     -12.9%     225.86 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
     23.78 ± 13%     -20.9%      18.80 ± 27%  sched_debug.cpu.clock.stddev
  50265901            +9.1%   54861338        stress-ng.sockfd.ops
    837446            +9.1%     913917        stress-ng.sockfd.ops_per_sec
   2293458            -2.8%    2230066        stress-ng.time.involuntary_context_switches
   1581490            +8.1%    1709261        stress-ng.time.voluntary_context_switches
  26480342            +4.2%   27595498        perf-stat.i.cache-misses
  90320805            +3.9%   93807170        perf-stat.i.cache-references
      9.86            -1.7%       9.70        perf-stat.i.cpi
     25274            -5.1%      23975        perf-stat.i.cycles-between-cache-misses
 6.498e+10            +1.1%  6.571e+10        perf-stat.i.instructions
      0.11            +1.7%       0.11        perf-stat.i.ipc
     10.00            -1.7%       9.83        perf-stat.overall.cpi
     24733            -4.7%      23575        perf-stat.overall.cycles-between-cache-misses
      0.10            +1.7%       0.10        perf-stat.overall.ipc
 1.438e+10            +1.3%  1.458e+10        perf-stat.ps.branch-instructions
  24920120            +4.9%   26142747        perf-stat.ps.cache-misses
  86987270            +4.5%   90934893        perf-stat.ps.cache-references
 6.162e+10            +1.7%  6.268e+10        perf-stat.ps.instructions
 3.698e+12            +2.2%  3.781e+12        perf-stat.total.instructions
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.calltrace.cycles-pp.stress_sockfd
     33.12 ± 70%     -24.9        8.24 ±223%  perf-profile.calltrace.cycles-pp.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.05 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.calltrace.cycles-pp.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64
     32.67 ± 70%     -24.5        8.16 ±223%  perf-profile.calltrace.cycles-pp.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.64 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg
     32.58 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg
     32.43 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg
     32.37 ± 70%     -24.3        8.06 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg
     32.23 ± 70%     -24.2        8.04 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic
     66.37 ± 70%     -49.8       16.57 ±223%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     66.36 ± 70%     -49.8       16.56 ±223%  perf-profile.children.cycles-pp.do_syscall_64
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.children.cycles-pp.stress_sockfd
     64.86 ± 70%     -48.7       16.17 ±223%  perf-profile.children.cycles-pp._raw_spin_lock
     64.64 ± 70%     -48.5       16.11 ±223%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     33.13 ± 70%     -24.9        8.24 ±223%  perf-profile.children.cycles-pp.sendmsg
     33.06 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.__sys_sendmsg
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.___sys_sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.children.cycles-pp.____sys_sendmsg
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.children.cycles-pp.unix_stream_sendmsg
     32.68 ± 70%     -24.5        8.16 ±223%  perf-profile.children.cycles-pp.recvmsg
     32.64 ± 70%     -24.5        8.15 ±223%  perf-profile.children.cycles-pp.__sys_recvmsg
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.___sys_recvmsg
     32.61 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.____sys_recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.sock_recvmsg
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_read_generic
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_scm_to_skb
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_attach_fds
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.children.cycles-pp.unix_inflight
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_detach_fds
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_notinflight
     64.36 ± 70%     -48.3       16.04 ±223%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

