Message-ID: <202503181447.69ed9a01-lkp@intel.com>
Date: Tue, 18 Mar 2025 14:39:56 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Jakub Kicinski
	<kuba@...nel.org>, Kuniyuki Iwashima <kuniyu@...zon.com>, Jason Xing
	<kerneljasonxing@...il.com>, <netdev@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linux-next:master] [inet] 9544d60a26: stress-ng.sockmany.ops_per_sec 4.5% improvement



Hello,

The kernel test robot noticed a 4.5% improvement in stress-ng.sockmany.ops_per_sec on:


commit: 9544d60a2605d1500cf5c3e331a76b9eaf4538c9 ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
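
Context, not part of the robot's report: the commit title suggests the local
port's contribution was moved out of the inner hash mix so it can be added on
top of a port-independent hash. Below is a minimal C sketch of that idea;
mix(), ehashfn_old() and ehashfn_new() are hypothetical stand-ins for the
kernel's __inet_ehashfn() and its jhash-based mixing, not the actual patch.

#include <stdint.h>

static uint32_t ehash_secret;	/* stand-in for inet_ehash_secret */

/* Placeholder mixer standing in for the kernel's jhash-based hash. */
static uint32_t mix(uint32_t laddr, uint32_t faddr, uint32_t ports)
{
	uint32_t h = (laddr ^ ports) + ehash_secret;

	h ^= faddr + 0x9e3779b9u;
	h *= 0x85ebca6bu;
	return h ^ (h >> 16);
}

/* Before (sketch): lport folded into the mixed hash input. */
static uint32_t ehashfn_old(uint32_t laddr, uint16_t lport,
			    uint32_t faddr, uint16_t fport)
{
	return mix(laddr, faddr, ((uint32_t)lport << 16) | fport);
}

/* After (sketch): hash the tuple with lport zeroed, then add lport,
 * so ehashfn_new(lport) == ehashfn_new(0) + lport for a fixed tuple. */
static uint32_t ehashfn_new(uint32_t laddr, uint16_t lport,
			    uint32_t faddr, uint16_t fport)
{
	return (uint32_t)lport + mix(laddr, faddr, fport);
}

If the change has this shape, hashes for successive local ports of the same
4-tuple differ only by the port delta, so a port-searching loop such as
__inet_hash_connect() can derive each candidate's hash with a single addition
instead of a full rehash; that would be consistent with the connect-heavy
sockmany workload gaining throughput here.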


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads, 2 sockets, Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockmany
	cpufreq_governor: performance



Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250318/202503181447.69ed9a01-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockmany/stress-ng/60s

commit: 
  f8ece40786 ("tcp: bring back NUMA dispersion in inet_ehash_locks_alloc()")
  9544d60a26 ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")

f8ece40786c93422 9544d60a2605d1500cf5c3e331a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.03 ± 61%     +75.0%       0.06 ± 13%  vmstat.procs.b
    197669 ±  9%      +7.1%     211706        vmstat.system.cs
   3052932 ±  2%      +4.3%    3183417        proc-vmstat.nr_slab_unreclaimable
   2120009            +2.2%    2166756        proc-vmstat.numa_hit
   1888278            +2.1%    1927323        proc-vmstat.numa_local
    303242 ±  3%     +58.5%     480662 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.17 ±  7%     -16.7%       0.14 ± 11%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
    303242 ±  3%     +58.5%     480662 ±  2%  sched_debug.cfs_rq:/.min_vruntime.stddev
   4336410            +4.5%    4531719        stress-ng.sockmany.ops
     71830            +4.5%      75040        stress-ng.sockmany.ops_per_sec
   7490830 ±  5%      +5.6%    7912072        stress-ng.time.voluntary_context_switches
    688478 ±  2%     -22.0%     537116 ±  3%  perf-c2c.DRAM.local
    612983           -19.5%     493390 ±  3%  perf-c2c.DRAM.remote
     22430 ±  2%    +873.5%     218364 ± 11%  perf-c2c.HITM.local
     23141 ±  2%    +846.6%     219069 ± 11%  perf-c2c.HITM.total
     40.09 ±  4%     -17.0%      33.28 ±  3%  perf-stat.i.MPKI
 1.398e+10 ±  4%     +17.0%  1.636e+10        perf-stat.i.branch-instructions
      2.26            -0.1        2.14        perf-stat.i.branch-miss-rate%
 3.091e+08 ±  4%     +12.1%  3.467e+08        perf-stat.i.branch-misses
     76.11 ±  3%      -9.4       66.74 ±  3%  perf-stat.i.cache-miss-rate%
 3.694e+09 ±  4%     +10.9%  4.096e+09        perf-stat.i.cache-references
      8.50 ±  3%     -11.6%       7.52        perf-stat.i.cpi
  7.47e+10 ±  4%     +16.5%  8.706e+10        perf-stat.i.instructions
     38.96           -18.1%      31.93 ±  3%  perf-stat.overall.MPKI
      2.21            -0.1        2.12        perf-stat.overall.branch-miss-rate%
     78.85           -11.0       67.89 ±  3%  perf-stat.overall.cache-miss-rate%
      8.30           -12.5%       7.27        perf-stat.overall.cpi
    213.10            +6.9%     227.81 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.12           +14.3%       0.14        perf-stat.overall.ipc
 1.375e+10 ±  4%     +17.0%  1.609e+10        perf-stat.ps.branch-instructions
  3.04e+08 ±  4%     +12.2%   3.41e+08        perf-stat.ps.branch-misses
 3.632e+09 ±  4%     +10.9%  4.028e+09        perf-stat.ps.cache-references
    204551 ±  9%      +6.8%     218435        perf-stat.ps.context-switches
 7.349e+10 ±  4%     +16.5%  8.565e+10        perf-stat.ps.instructions
 4.651e+12           +14.6%  5.328e+12        perf-stat.total.instructions
      1.22 ±111%     -98.2%       0.02 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
      0.55 ± 11%     -39.5%       0.33 ± 43%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      1.22 ±111%     -96.5%       0.04 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
      3.87 ± 83%    +389.9%      18.96 ± 87%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      8.70 ± 30%    +271.2%      32.31 ±109%  perf-sched.sch_delay.max.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      3.84 ±  5%    +516.8%      23.70 ± 82%  perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     15.53           -13.3%      13.47 ±  2%  perf-sched.total_wait_and_delay.average.ms
    234871           +16.6%     273899        perf-sched.total_wait_and_delay.count.ms
     15.48           -13.3%      13.42 ±  2%  perf-sched.total_wait_time.average.ms
    808.31 ± 27%     -42.5%     464.90 ± 49%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    135.87 ± 16%     -38.9%      83.00 ±  7%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
     10.11           -14.0%       8.69        perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      4.05 ±  3%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    103599           +16.3%     120485        perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
     93.17 ± 19%     +67.4%     156.00 ±  7%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
    109023 ±  2%     +17.2%     127816        perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      1230 ±  3%    -100.0%       0.00        perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     15.55 ±106%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      9.98           -13.0%       8.68        perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
    808.30 ± 27%     -42.5%     464.89 ± 49%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1.81 ± 67%   +8639.1%     157.80 ±217%  perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
    135.32 ± 16%     -38.9%      82.67 ±  6%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
     10.09           -14.0%       8.68        perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
      0.03 ± 90%  +1.1e+06%     372.59 ±111%  perf-sched.wait_time.max.ms.__cond_resched.ww_mutex_lock.drm_gem_vunmap_unlocked.drm_gem_fb_vunmap.drm_atomic_helper_commit_planes
      5.22 ± 70%  +15512.3%     815.30 ±205%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

