Message-ID: <202503171623.f2e16b60-lkp@intel.com>
Date: Mon, 17 Mar 2025 21:44:54 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <netdev@...r.kernel.org>,
"David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Neal Cardwell <ncardwell@...gle.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Jason Xing <kernelxing@...cent.com>,
Simon Horman <horms@...nel.org>, <eric.dumazet@...il.com>, Eric Dumazet
<edumazet@...gle.com>, <oliver.sang@...el.com>
Subject: Re: [PATCH net-next 1/2] inet: change lport contribution to
inet_ehashfn() and inet6_ehashfn()
Hello,
kernel test robot noticed a 26.0% improvement of stress-ng.sockmany.ops_per_sec on:
commit: 265acc444f8a96246e9d42b54b6931d078034218 ("[PATCH net-next 1/2] inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
url: https://github.com/intel-lab-lkp/linux/commits/Eric-Dumazet/inet-change-lport-contribution-to-inet_ehashfn-and-inet6_ehashfn/20250305-114734
base: https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git f252f23ab657cd224cb8334ba69966396f3f629b
patch link: https://lore.kernel.org/all/20250305034550.879255-2-edumazet@google.com/
patch subject: [PATCH net-next 1/2] inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()
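
For context, established TCP sockets are bucketed by a hash of the connection
4-tuple, and the patch changes how the local port is folded into that hash.
The userspace sketch below is only an illustration of the general idea, not
the kernel code and not the actual patch (inet_ehashfn()/inet6_ehashfn() use
jhash with a per-netns secret; the exact before/after is in the patch linked
above). It contrasts adding the local port to an already-computed hash, which
places sequential local ports in sequential buckets, with mixing the port
into the hashed input, which scatters them:

/* Illustrative only: contrast "hash the remote side, then add the local
 * port" with "mix the local port into the hash input".  This is NOT the
 * kernel's inet_ehashfn() and NOT the patch itself; the mixer is the
 * MurmurHash3 finalizer, picked only because it is short and well known. */
#include <stdint.h>
#include <stdio.h>

static uint32_t fmix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/* Variant A: sequential local ports land in sequential buckets. */
static uint32_t bucket_add(uint32_t faddr, uint16_t fport,
			   uint16_t lport, uint32_t mask)
{
	return (fmix32(faddr ^ fport) + lport) & mask;
}

/* Variant B: sequential local ports are scattered across the table. */
static uint32_t bucket_mix(uint32_t faddr, uint16_t fport,
			   uint16_t lport, uint32_t mask)
{
	return fmix32(faddr ^ fport ^ ((uint32_t)lport << 16)) & mask;
}

int main(void)
{
	const uint32_t mask = 255;          /* 256 buckets for the demo */
	const uint32_t faddr = 0x7f000001;  /* 127.0.0.1 */
	const uint16_t fport = 80;

	for (unsigned int lport = 40000; lport < 40008; lport++)
		printf("lport %u: add -> bucket %3u, mix -> bucket %3u\n",
		       lport,
		       (unsigned int)bucket_add(faddr, fport, (uint16_t)lport, mask),
		       (unsigned int)bucket_mix(faddr, fport, (uint16_t)lport, mask));
	return 0;
}

How evenly freshly connected sockets spread across buckets is what a
connect()-heavy stressor such as sockmany ends up exercising.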
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sockmany
cpufreq_governor: performance
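
The sockmany stressor churns through TCP connections on loopback as fast as
it can, so inserting and looking up sockets in the established hash is on its
hot path. As a rough, self-contained approximation of that churn (this is not
stress-ng's actual stressor code, which forks workers and keeps many sockets
open concurrently), a single-threaded connect/accept/close loop looks like:

/* Rough sketch of a sockmany-style connection churn loop; assumption:
 * the real stress-ng stressor forks workers and keeps many sockets open
 * concurrently, while this only shows the connect/accept/close cycle. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in addr = { .sin_family = AF_INET };
	socklen_t len = sizeof(addr);
	long ops = 0;

	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	int lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 ||
	    bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 128) < 0) {
		perror("listen socket");
		return 1;
	}
	/* Port 0 above let the kernel pick a port; learn which one. */
	if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
		perror("getsockname");
		return 1;
	}

	for (int i = 0; i < 10000; i++) {
		int cfd = socket(AF_INET, SOCK_STREAM, 0);
		if (cfd < 0)
			break;
		/* On loopback the handshake completes in-kernel while the
		 * connection sits in the listen backlog, so a blocking
		 * connect followed by accept in one thread is fine. */
		if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
			int sfd = accept(lfd, NULL, NULL);
			if (sfd >= 0)
				close(sfd);
			ops++;
		}
		close(cfd);
	}
	printf("completed %ld connect/accept cycles\n", ops);
	close(lfd);
	return 0;
}

Each cycle adds and removes established-hash entries on both the connect and
accept side, which is presumably where the hash change above makes its
difference.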
In addition, the commit also has a significant impact on the following test:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.sockmany.ops_per_sec 4.4% improvement |
| test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_threads=100% |
| | test=sockmany |
| | testtime=60s |
+------------------+---------------------------------------------------------------------------------------------+
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250317/202503171623.f2e16b60-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/sockmany/stress-ng/60s
commit:
f252f23ab6 ("net: Prevent use after free in netif_napi_set_irq_locked()")
265acc444f ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
f252f23ab657cd22 265acc444f8a96246e9d42b54b6
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.60 ± 6% +0.2 0.75 ± 6% mpstat.cpu.all.soft%
376850 ± 9% +15.7% 436068 ± 9% numa-numastat.node0.local_node
376612 ± 9% +15.8% 435968 ± 9% numa-vmstat.node0.numa_local
54708 +22.0% 66753 ± 2% vmstat.system.cs
2308 +1167.7% 29267 ± 26% perf-c2c.HITM.local
2499 +1078.3% 29447 ± 26% perf-c2c.HITM.total
1413 ± 8% -13.8% 1218 ± 4% sched_debug.cfs_rq:/.runnable_avg.max
28302 +21.2% 34303 ± 2% sched_debug.cpu.nr_switches.avg
39625 ± 6% +63.4% 64761 ± 6% sched_debug.cpu.nr_switches.max
4170 ± 9% +126.1% 9429 ± 8% sched_debug.cpu.nr_switches.stddev
1606932 +25.9% 2023746 ± 3% stress-ng.sockmany.ops
26687 +26.0% 33624 ± 3% stress-ng.sockmany.ops_per_sec
1561801 +28.1% 2000939 ± 3% stress-ng.time.involuntary_context_switches
1731525 +22.3% 2118259 ± 2% stress-ng.time.voluntary_context_switches
84783 +2.6% 86953 proc-vmstat.nr_shmem
5339 ± 6% -26.4% 3931 ± 16% proc-vmstat.numa_hint_faults_local
878479 +6.8% 937819 proc-vmstat.numa_hit
812262 +7.3% 871615 proc-vmstat.numa_local
2550690 +12.5% 2870404 proc-vmstat.pgalloc_normal
2407108 +13.2% 2724922 proc-vmstat.pgfree
21.96 -17.2% 18.18 ± 2% perf-stat.i.MPKI
7.517e+09 +18.8% 8.933e+09 perf-stat.i.branch-instructions
2.70 -0.7 1.96 perf-stat.i.branch-miss-rate%
2.03e+08 -13.1% 1.765e+08 perf-stat.i.branch-misses
60.22 -2.3 57.89 ± 2% perf-stat.i.cache-miss-rate%
1.472e+09 +4.7% 1.542e+09 perf-stat.i.cache-references
56669 +22.3% 69301 ± 2% perf-stat.i.context-switches
5.56 -18.4% 4.53 ± 2% perf-stat.i.cpi
4.24e+10 +19.2% 5.054e+10 perf-stat.i.instructions
0.20 +20.1% 0.24 ± 4% perf-stat.i.ipc
0.49 +21.0% 0.60 ± 8% perf-stat.i.metric.K/sec
21.03 -15.1% 17.85 perf-stat.overall.MPKI
2.70 -0.7 1.98 perf-stat.overall.branch-miss-rate%
60.56 -2.1 58.49 perf-stat.overall.cache-miss-rate%
5.34 -16.6% 4.45 perf-stat.overall.cpi
253.77 -1.7% 249.50 perf-stat.overall.cycles-between-cache-misses
0.19 +19.9% 0.22 perf-stat.overall.ipc
7.395e+09 +18.9% 8.789e+09 perf-stat.ps.branch-instructions
1.997e+08 -13.0% 1.737e+08 perf-stat.ps.branch-misses
1.448e+09 +4.7% 1.517e+09 perf-stat.ps.cache-references
55820 +22.2% 68204 ± 2% perf-stat.ps.context-switches
4.172e+10 +19.2% 4.972e+10 perf-stat.ps.instructions
2.556e+12 +20.2% 3.072e+12 ± 2% perf-stat.total.instructions
0.35 ± 9% -14.9% 0.29 ± 6% perf-sched.sch_delay.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
0.06 ± 7% -20.5% 0.04 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
0.16 ±218% +798.3% 1.44 ± 40% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
0.25 ±152% +291.3% 0.99 ± 45% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.security_inode_alloc.inode_init_always_gfp.alloc_inode
0.11 ±166% +568.2% 0.75 ± 45% perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.inet_stream_connect.__sys_connect.__x64_sys_connect
0.84 ± 14% +39.2% 1.17 ± 9% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.11 ± 22% +108.5% 0.23 ± 12% perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
0.08 ± 59% -60.0% 0.03 ± 4% perf-sched.sch_delay.max.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
0.16 ±218% +1286.4% 2.22 ± 25% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
0.13 ±153% +910.1% 1.27 ± 34% perf-sched.sch_delay.max.ms.__cond_resched.lock_sock_nested.inet_stream_connect.__sys_connect.__x64_sys_connect
9.23 -12.5% 8.08 perf-sched.total_wait_and_delay.average.ms
139892 +15.3% 161338 perf-sched.total_wait_and_delay.count.ms
9.18 -12.5% 8.03 perf-sched.total_wait_time.average.ms
0.70 ± 8% -14.5% 0.60 ± 6% perf-sched.wait_and_delay.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
0.11 ± 8% -20.1% 0.09 ± 4% perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
429.48 ± 44% +63.6% 702.60 ± 11% perf-sched.wait_and_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
4.97 -14.0% 4.28 perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.23 ± 21% +104.2% 0.46 ± 12% perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
48576 ± 5% +36.3% 66215 ± 2% perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
81.83 +9.8% 89.83 ± 2% perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
64098 +16.3% 74560 perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
15531 ± 17% -46.2% 8355 ± 6% perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
0.36 ± 8% -14.2% 0.31 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
0.06 ± 7% -20.2% 0.04 ± 4% perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
0.04 ±178% -94.4% 0.00 ±130% perf-sched.wait_time.avg.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
0.16 ±218% +798.5% 1.44 ± 40% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
0.11 ±166% +568.6% 0.75 ± 45% perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.inet_stream_connect.__sys_connect.__x64_sys_connect
427.69 ± 45% +63.1% 697.48 ± 10% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
4.95 -14.0% 4.26 perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.12 ± 20% +99.9% 0.23 ± 12% perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
0.16 ±218% +1286.4% 2.22 ± 25% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.alloc_file_pseudo.sock_alloc_file
0.13 ±153% +911.4% 1.27 ± 34% perf-sched.wait_time.max.ms.__cond_resched.lock_sock_nested.inet_stream_connect.__sys_connect.__x64_sys_connect
***************************************************************************************************
lkp-spr-r02: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockmany/stress-ng/60s
commit:
f252f23ab6 ("net: Prevent use after free in netif_napi_set_irq_locked()")
265acc444f ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
f252f23ab657cd22 265acc444f8a96246e9d42b54b6
---------------- ---------------------------
%stddev %change %stddev
\ | \
205766 +3.2% 212279 vmstat.system.cs
309724 ± 5% +63.6% 506684 ± 9% sched_debug.cfs_rq:/.avg_vruntime.stddev
309724 ± 5% +63.6% 506684 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
1307371 ± 8% -14.5% 1117523 ± 7% sched_debug.cpu.avg_idle.max
4333131 +4.4% 4525951 stress-ng.sockmany.ops
71816 +4.4% 74988 stress-ng.sockmany.ops_per_sec
7639150 +3.6% 7910527 stress-ng.time.voluntary_context_switches
693603 -18.6% 564616 ± 3% perf-c2c.DRAM.local
611374 -16.8% 508688 ± 2% perf-c2c.DRAM.remote
19509 +994.2% 213470 ± 7% perf-c2c.HITM.local
20252 +957.6% 214187 ± 7% perf-c2c.HITM.total
204521 +3.1% 210765 proc-vmstat.nr_shmem
938137 +2.9% 965493 proc-vmstat.nr_slab_reclaimable
3102658 +3.0% 3196837 proc-vmstat.nr_slab_unreclaimable
2113801 +1.8% 2151131 proc-vmstat.numa_hit
1881174 +2.0% 1919223 proc-vmstat.numa_local
6186586 +3.6% 6406837 proc-vmstat.pgalloc_normal
0.76 ± 46% -83.0% 0.13 ±144% perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
0.02 ± 2% -6.3% 0.02 ± 2% perf-sched.sch_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
15.43 -12.6% 13.48 perf-sched.total_wait_and_delay.average.ms
234971 +15.6% 271684 perf-sched.total_wait_and_delay.count.ms
15.37 -12.6% 13.43 perf-sched.total_wait_time.average.ms
140.18 ± 5% -37.2% 88.02 ± 11% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
10.17 -14.1% 8.74 perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
4.02 -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
104089 +16.4% 121193 perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
88.17 ± 6% +68.1% 148.17 ± 13% perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
108724 +16.8% 127034 perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
1232 -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
4592 ± 12% +26.1% 5792 ± 14% perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
11.29 ± 68% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
9.99 -13.3% 8.66 perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
139.53 ± 6% -37.2% 87.60 ± 11% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
10.15 -14.1% 8.72 perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
41.10 -17.2% 34.03 perf-stat.i.MPKI
1.424e+10 +14.6% 1.631e+10 perf-stat.i.branch-instructions
2.28 -0.1 2.17 perf-stat.i.branch-miss-rate%
3.193e+08 +9.4% 3.492e+08 perf-stat.i.branch-misses
77.01 -9.5 67.48 perf-stat.i.cache-miss-rate%
2.981e+09 -5.1% 2.83e+09 perf-stat.i.cache-misses
3.806e+09 +8.4% 4.127e+09 perf-stat.i.cache-references
217129 +3.2% 224056 perf-stat.i.context-switches
8.68 -12.7% 7.58 perf-stat.i.cpi
242.24 +4.0% 251.97 perf-stat.i.cycles-between-cache-misses
7.608e+10 +14.1% 8.679e+10 perf-stat.i.instructions
0.13 +13.3% 0.15 perf-stat.i.ipc
39.15 -16.8% 32.58 perf-stat.overall.MPKI
2.24 -0.1 2.14 perf-stat.overall.branch-miss-rate%
78.30 -9.7 68.56 perf-stat.overall.cache-miss-rate%
8.35 -12.4% 7.31 perf-stat.overall.cpi
213.17 +5.3% 224.53 perf-stat.overall.cycles-between-cache-misses
0.12 +14.1% 0.14 perf-stat.overall.ipc
1.401e+10 +14.6% 1.604e+10 perf-stat.ps.branch-instructions
3.139e+08 +9.4% 3.434e+08 perf-stat.ps.branch-misses
2.931e+09 -5.1% 2.782e+09 perf-stat.ps.cache-misses
3.743e+09 +8.4% 4.058e+09 perf-stat.ps.cache-references
213541 +3.3% 220574 perf-stat.ps.context-switches
7.485e+10 +14.1% 8.539e+10 perf-stat.ps.instructions
4.597e+12 +13.9% 5.235e+12 perf-stat.total.instructions
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki