Message-ID: <202504101443.bc7b7079-lkp@intel.com>
Date: Sat, 12 Apr 2025 15:45:57 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Jakub Kicinski <kuba@...nel.org>, Kuniyuki Iwashima <kuniyu@...zon.com>,
	Jason Xing <kerneljasonxing@...il.com>, <netdev@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linus:master] [inet]  d4438ce68b:  stress-ng.sockmany.ops_per_sec
 40.4% improvement



Hello,

The kernel test robot noticed a 40.4% improvement in stress-ng.sockmany.ops_per_sec on:


commit: d4438ce68bf145aa1d7ed03ebf3b8ece337e3f64 ("inet: call inet6_ehashfn() once from inet6_hash_connect()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockmany
	cpufreq_governor: performance



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250410/202504101443.bc7b7079-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/sockmany/stress-ng/60s

commit: 
  9544d60a26 ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
  d4438ce68b ("inet: call inet6_ehashfn() once from inet6_hash_connect()")

9544d60a2605d150 d4438ce68bf145aa1d7ed03ebf3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     66811 ±  2%     +37.2%      91683 ±  2%  vmstat.system.cs
      2.92 ± 38%      +0.7        3.62 ± 21%  mpstat.cpu.all.idle%
      0.78 ±  6%      +0.3        1.09 ±  2%  mpstat.cpu.all.soft%
     89855 ± 45%     -58.3%      37495 ± 96%  numa-meminfo.node0.Mapped
    181411 ± 29%     +50.9%     273696 ± 11%  numa-meminfo.node1.Shmem
    433609 ±  6%     +25.7%     545181 ±  4%  numa-numastat.node1.local_node
    462621 ±  5%     +24.1%     574183 ±  4%  numa-numastat.node1.numa_hit
    199550 ±  3%     -32.9%     133991 ±  5%  perf-c2c.DRAM.local
    141678 ±  5%     -34.2%      93183 ±  7%  perf-c2c.DRAM.remote
     45462 ± 29%     +50.9%      68581 ± 10%  numa-vmstat.node1.nr_shmem
    462094 ±  5%     +24.1%     573303 ±  4%  numa-vmstat.node1.numa_hit
    433089 ±  6%     +25.7%     544302 ±  4%  numa-vmstat.node1.numa_local
   2029078 ±  2%     +40.4%    2847905 ±  2%  stress-ng.sockmany.ops
     33711 ±  2%     +40.4%      47329 ±  2%  stress-ng.sockmany.ops_per_sec
   2010826 ±  2%     +40.8%    2831055 ±  2%  stress-ng.time.involuntary_context_switches
   2122618 ±  2%     +38.3%    2934839 ±  2%  stress-ng.time.voluntary_context_switches
     87035            +6.7%      92859        proc-vmstat.nr_shmem
    336264            +2.0%     343096        proc-vmstat.nr_slab_reclaimable
    938906           +12.1%    1052474        proc-vmstat.numa_hit
    872687           +13.0%     986265        proc-vmstat.numa_local
   2896245           +19.6%    3464080        proc-vmstat.pgalloc_normal
    305944            +1.9%     311839        proc-vmstat.pgfault
   2751656           +20.4%    3312497        proc-vmstat.pgfree
     26462 ± 10%     -29.0%      18788 ±  9%  sched_debug.cfs_rq:/.avg_vruntime.stddev
     26462 ± 10%     -29.0%      18788 ±  9%  sched_debug.cfs_rq:/.min_vruntime.stddev
      3.49 ± 14%     +28.9%       4.50 ±  9%  sched_debug.cpu.clock.stddev
     34426 ±  2%     +36.3%      46926 ±  2%  sched_debug.cpu.nr_switches.avg
     64283 ±  6%     +28.9%      82835 ±  8%  sched_debug.cpu.nr_switches.max
      9571 ±  9%     +39.4%      13341 ± 10%  sched_debug.cpu.nr_switches.stddev
      4.20 ± 10%     +19.0%       5.00 ± 10%  sched_debug.cpu.nr_uninterruptible.stddev
     18.34           +35.6%      24.87        perf-stat.i.MPKI
 9.043e+09 ±  2%     +28.2%  1.159e+10        perf-stat.i.branch-instructions
      1.91            -0.1        1.82        perf-stat.i.branch-miss-rate%
 1.741e+08           +20.4%  2.095e+08 ±  2%  perf-stat.i.branch-misses
 9.167e+08           +38.9%  1.274e+09        perf-stat.i.cache-misses
 1.555e+09           +38.0%  2.146e+09        perf-stat.i.cache-references
     69950 ±  2%     +37.7%      96349 ±  2%  perf-stat.i.context-switches
    258.20           -26.5%     189.89        perf-stat.i.cycles-between-cache-misses
      0.04 ± 41%     -54.6%       0.02 ± 57%  perf-stat.i.major-faults
      0.60 ±  6%     +83.8%       1.11 ±  5%  perf-stat.i.metric.K/sec
      0.16 ± 78%    +175.0%       0.43 ± 25%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.01 ±  3%     -17.5%       0.01        perf-sched.sch_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      0.29 ± 14%    +132.4%       0.67 ± 21%  perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
     11.50 ± 71%     -76.7%       2.68 ± 69%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.05           -15.8%       0.04        perf-sched.total_sch_delay.average.ms
      7.99           -19.3%       6.45 ±  2%  perf-sched.total_wait_and_delay.average.ms
    162716 ±  2%     +25.0%     203451 ±  2%  perf-sched.total_wait_and_delay.count.ms
      7.94           -19.3%       6.41 ±  2%  perf-sched.total_wait_time.average.ms
      0.09 ±  8%     -13.2%       0.08 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
    149.92 ±  2%     -16.0%     125.93 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      4.23 ±  2%     -21.9%       3.31 ±  2%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      0.58 ± 14%    +131.2%       1.34 ± 20%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
    246.84 ±  8%     +26.7%     312.83 ±  5%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      5361 ±  3%     -13.8%       4621 ±  2%  perf-sched.wait_and_delay.count.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
     68917 ±  3%     +36.0%      93723 ±  2%  perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
     91.00 ±  3%     +25.1%     113.83 ±  2%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
     75346 ±  2%     +28.1%      96486 ±  2%  perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      6441 ± 11%     -56.8%       2780 ± 23%  perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      2275 ±  9%     -20.0%       1820 ±  6%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1771 ±  6%     -41.4%       1038 ± 55%  perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      1.62 ± 77%     -65.0%       0.57 ±123%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.16 ± 78%    +175.0%       0.43 ± 25%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    149.66 ±  2%     -16.0%     125.77 ±  4%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      4.22 ±  2%     -21.9%       3.29 ±  2%  perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      0.29 ± 14%    +129.7%       0.67 ± 20%  perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
    246.83 ±  8%     +26.7%     312.83 ±  5%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1771 ±  6%     -41.4%       1038 ± 55%  perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     11.50 ± 71%     -78.6%       2.46 ± 76%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

