Message-ID: <202504101443.bc7b7079-lkp@intel.com>
Date: Sat, 12 Apr 2025 15:45:57 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>, Kuniyuki Iwashima <kuniyu@...zon.com>,
Jason Xing <kerneljasonxing@...il.com>, <netdev@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: [linus:master] [inet] d4438ce68b: stress-ng.sockmany.ops_per_sec 40.4% improvement
Hello,
kernel test robot noticed a 40.4% improvement of stress-ng.sockmany.ops_per_sec on:
commit: d4438ce68bf145aa1d7ed03ebf3b8ece337e3f64 ("inet: call inet6_ehashfn() once from inet6_hash_connect()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
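For context, the commit title describes the change: inet6_ehashfn() is now
called once from inet6_hash_connect() rather than once per candidate port
inside the ephemeral-port search (the preceding commit 9544d60a26 made the
lport contribution separable to allow this).  The sketch below is a minimal,
self-contained illustration of that hoisting pattern only; all names
(toy_ehashfn, toy_mix_port, try_port) are made up for the example and none
of it is the kernel's actual code.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for inet6_ehashfn(): mixes only the loop-invariant
 * inputs (source address, destination address, destination port). */
static uint32_t toy_ehashfn(uint32_t saddr, uint32_t daddr, uint16_t dport)
{
	return (saddr * 2654435761u) ^ (daddr * 2246822519u) ^ dport;
}

/* Cheap per-port mixing; this is all that remains inside the loop. */
static uint32_t toy_mix_port(uint32_t base, uint16_t lport)
{
	return base ^ ((uint32_t)lport * 2654435761u);
}

/* Pretend exactly one candidate port is free, so the search terminates. */
static int try_port(uint16_t lport)
{
	return lport == 40000;
}

int main(void)
{
	uint32_t saddr = 0x0a000001, daddr = 0x0a000002;
	uint16_t dport = 443;

	/* The loop-invariant part of the hash is computed once ... */
	uint32_t base = toy_ehashfn(saddr, daddr, dport);

	/* ... instead of being recomputed for every candidate port. */
	for (uint16_t lport = 32768; lport < 60999; lport++) {
		uint32_t hash = toy_mix_port(base, lport);

		if (try_port(lport)) {
			printf("bound port %u, hash %08" PRIx32 "\n",
			       (unsigned)lport, hash);
			break;
		}
	}
	return 0;
}

The loop body is then reduced to the cheap per-port mix, which is the
effect the commit title describes.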
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sockmany
cpufreq_governor: performance
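As a rough local approximation of this workload (an assumption about the
harness invocation, not taken verbatim from the job file), nr_threads: 100%
maps to one stressor instance per CPU, e.g.:

	stress-ng --sockmany 0 --timeout 60s --metrics-brief

where --sockmany 0 asks stress-ng to start one worker per online CPU.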
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250410/202504101443.bc7b7079-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/sockmany/stress-ng/60s
commit:
9544d60a26 ("inet: change lport contribution to inet_ehashfn() and inet6_ehashfn()")
d4438ce68b ("inet: call inet6_ehashfn() once from inet6_hash_connect()")
9544d60a2605d150 d4438ce68bf145aa1d7ed03ebf3
---------------- ---------------------------
old value  %stddev     %change   new value  %stddev    metric
66811 ± 2% +37.2% 91683 ± 2% vmstat.system.cs
2.92 ± 38% +0.7 3.62 ± 21% mpstat.cpu.all.idle%
0.78 ± 6% +0.3 1.09 ± 2% mpstat.cpu.all.soft%
89855 ± 45% -58.3% 37495 ± 96% numa-meminfo.node0.Mapped
181411 ± 29% +50.9% 273696 ± 11% numa-meminfo.node1.Shmem
433609 ± 6% +25.7% 545181 ± 4% numa-numastat.node1.local_node
462621 ± 5% +24.1% 574183 ± 4% numa-numastat.node1.numa_hit
199550 ± 3% -32.9% 133991 ± 5% perf-c2c.DRAM.local
141678 ± 5% -34.2% 93183 ± 7% perf-c2c.DRAM.remote
45462 ± 29% +50.9% 68581 ± 10% numa-vmstat.node1.nr_shmem
462094 ± 5% +24.1% 573303 ± 4% numa-vmstat.node1.numa_hit
433089 ± 6% +25.7% 544302 ± 4% numa-vmstat.node1.numa_local
2029078 ± 2% +40.4% 2847905 ± 2% stress-ng.sockmany.ops
33711 ± 2% +40.4% 47329 ± 2% stress-ng.sockmany.ops_per_sec
2010826 ± 2% +40.8% 2831055 ± 2% stress-ng.time.involuntary_context_switches
2122618 ± 2% +38.3% 2934839 ± 2% stress-ng.time.voluntary_context_switches
87035 +6.7% 92859 proc-vmstat.nr_shmem
336264 +2.0% 343096 proc-vmstat.nr_slab_reclaimable
938906 +12.1% 1052474 proc-vmstat.numa_hit
872687 +13.0% 986265 proc-vmstat.numa_local
2896245 +19.6% 3464080 proc-vmstat.pgalloc_normal
305944 +1.9% 311839 proc-vmstat.pgfault
2751656 +20.4% 3312497 proc-vmstat.pgfree
26462 ± 10% -29.0% 18788 ± 9% sched_debug.cfs_rq:/.avg_vruntime.stddev
26462 ± 10% -29.0% 18788 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
3.49 ± 14% +28.9% 4.50 ± 9% sched_debug.cpu.clock.stddev
34426 ± 2% +36.3% 46926 ± 2% sched_debug.cpu.nr_switches.avg
64283 ± 6% +28.9% 82835 ± 8% sched_debug.cpu.nr_switches.max
9571 ± 9% +39.4% 13341 ± 10% sched_debug.cpu.nr_switches.stddev
4.20 ± 10% +19.0% 5.00 ± 10% sched_debug.cpu.nr_uninterruptible.stddev
18.34 +35.6% 24.87 perf-stat.i.MPKI
9.043e+09 ± 2% +28.2% 1.159e+10 perf-stat.i.branch-instructions
1.91 -0.1 1.82 perf-stat.i.branch-miss-rate%
1.741e+08 +20.4% 2.095e+08 ± 2% perf-stat.i.branch-misses
9.167e+08 +38.9% 1.274e+09 perf-stat.i.cache-misses
1.555e+09 +38.0% 2.146e+09 perf-stat.i.cache-references
69950 ± 2% +37.7% 96349 ± 2% perf-stat.i.context-switches
258.20 -26.5% 189.89 perf-stat.i.cycles-between-cache-misses
0.04 ± 41% -54.6% 0.02 ± 57% perf-stat.i.major-faults
0.60 ± 6% +83.8% 1.11 ± 5% perf-stat.i.metric.K/sec
0.16 ± 78% +175.0% 0.43 ± 25% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
0.01 ± 3% -17.5% 0.01 perf-sched.sch_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.29 ± 14% +132.4% 0.67 ± 21% perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
11.50 ± 71% -76.7% 2.68 ± 69% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.05 -15.8% 0.04 perf-sched.total_sch_delay.average.ms
7.99 -19.3% 6.45 ± 2% perf-sched.total_wait_and_delay.average.ms
162716 ± 2% +25.0% 203451 ± 2% perf-sched.total_wait_and_delay.count.ms
7.94 -19.3% 6.41 ± 2% perf-sched.total_wait_time.average.ms
0.09 ± 8% -13.2% 0.08 ± 3% perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
149.92 ± 2% -16.0% 125.93 ± 4% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
4.23 ± 2% -21.9% 3.31 ± 2% perf-sched.wait_and_delay.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.58 ± 14% +131.2% 1.34 ± 20% perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
246.84 ± 8% +26.7% 312.83 ± 5% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
5361 ± 3% -13.8% 4621 ± 2% perf-sched.wait_and_delay.count.__cond_resched.__inet_hash_connect.tcp_v4_connect.__inet_stream_connect.inet_stream_connect
68917 ± 3% +36.0% 93723 ± 2% perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.__inet_stream_connect.inet_stream_connect
91.00 ± 3% +25.1% 113.83 ± 2% perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
75346 ± 2% +28.1% 96486 ± 2% perf-sched.wait_and_delay.count.schedule_timeout.inet_csk_accept.inet_accept.do_accept
6441 ± 11% -56.8% 2780 ± 23% perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
2275 ± 9% -20.0% 1820 ± 6% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1771 ± 6% -41.4% 1038 ± 55% perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
1.62 ± 77% -65.0% 0.57 ±123% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.16 ± 78% +175.0% 0.43 ± 25% perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
149.66 ± 2% -16.0% 125.77 ± 4% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
4.22 ± 2% -21.9% 3.29 ± 2% perf-sched.wait_time.avg.ms.schedule_timeout.inet_csk_accept.inet_accept.do_accept
0.29 ± 14% +129.7% 0.67 ± 20% perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
246.83 ± 8% +26.7% 312.83 ± 5% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
1771 ± 6% -41.4% 1038 ± 55% perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
11.50 ± 71% -78.6% 2.46 ± 76% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki