Message-ID: <202509261609.dec14b91-lkp@intel.com>
Date: Fri, 26 Sep 2025 16:40:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Paolo Abeni
<pabeni@...hat.com>, Willem de Bruijn <willemb@...gle.com>, David Ahern
<dsahern@...nel.org>, Kuniyuki Iwashima <kuniyu@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, <netdev@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linux-next:master] [udp] 6471658dc6: netperf.Throughput_Mbps 200.0% improvement
Hello,
kernel test robot noticed a 200.0% improvement of netperf.Throughput_Mbps on:
commit: 6471658dc66c670580a7616e75f51b52917e7883 ("udp: use skb_attempt_defer_free()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
testcase: netperf
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
ip: ipv4
runtime: 300s
nr_threads: 50%
cluster: cs-localhost
test: UDP_STREAM
cpufreq_governor: performance
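The parameters above translate, roughly, into a loopback netperf run like the following sketch. The actual 0-Day harness (lkp-tests) drives this; the 96-instance loop is my reading of nr_threads=50% on a 192-thread machine, so treat the details as assumptions.

```shell
cpupower frequency-set -g performance     # cpufreq_governor: performance
netserver                                 # loopback server (cluster: cs-localhost)
for i in $(seq 96); do                    # nr_threads: 50% of 192 CPUs
	netperf -H 127.0.0.1 -t UDP_STREAM -l 300 &   # test: UDP_STREAM, runtime: 300s
done
wait
```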
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250926/202509261609.dec14b91-lkp@intel.com
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-14/performance/ipv4/x86_64-rhel-9.4/50%/debian-13-x86_64-20250902.cgz/300s/lkp-srf-2sp3/UDP_STREAM/netperf
commit:
3cd04c8f4a ("udp: make busylock per socket")
6471658dc6 ("udp: use skb_attempt_defer_free()")
3cd04c8f4afed71a 6471658dc66c670580a7616e75f
---------------- ---------------------------
%stddev %change %stddev
\ | \
6.079e+09 ± 4% +47.8% 8.983e+09 cpuidle..time
4.012e+08 +320.9% 1.689e+09 ± 2% cpuidle..usage
9360404 +22.3% 11449805 ± 4% numa-meminfo.node1.Active
9360396 +22.3% 11449799 ± 4% numa-meminfo.node1.Active(anon)
8894257 ± 3% +22.2% 10867440 ± 3% numa-meminfo.node1.Shmem
1.044e+09 ± 3% +206.8% 3.203e+09 numa-numastat.node0.local_node
1.044e+09 ± 3% +206.8% 3.204e+09 numa-numastat.node0.numa_hit
1.013e+09 ± 2% +218.0% 3.221e+09 numa-numastat.node1.local_node
1.013e+09 ± 2% +218.0% 3.221e+09 numa-numastat.node1.numa_hit
9.93 ± 5% +4.4 14.28 mpstat.cpu.all.idle%
0.59 +1.3 1.89 mpstat.cpu.all.irq%
1.77 +12.4 14.15 ± 2% mpstat.cpu.all.soft%
86.78 -18.8 67.94 mpstat.cpu.all.sys%
0.93 +0.8 1.74 mpstat.cpu.all.usr%
1.044e+09 ± 3% +206.9% 3.204e+09 numa-vmstat.node0.numa_hit
1.044e+09 ± 3% +206.9% 3.203e+09 numa-vmstat.node0.numa_local
2339319 +22.4% 2863299 ± 4% numa-vmstat.node1.nr_active_anon
2222754 ± 3% +22.3% 2717697 ± 3% numa-vmstat.node1.nr_shmem
2339318 +22.4% 2863297 ± 4% numa-vmstat.node1.nr_zone_active_anon
1.013e+09 ± 2% +218.1% 3.221e+09 numa-vmstat.node1.numa_hit
1.013e+09 ± 2% +218.1% 3.221e+09 numa-vmstat.node1.numa_local
9763805 ± 3% +21.6% 11869079 ± 3% meminfo.Active
9763788 ± 3% +21.6% 11869062 ± 3% meminfo.Active(anon)
805138 +15.5% 929863 meminfo.AnonPages
12584871 ± 2% +15.7% 14565534 ± 2% meminfo.Cached
9930753 ± 3% +21.2% 12038335 ± 3% meminfo.Committed_AS
16167194 +14.9% 18577687 ± 2% meminfo.Memused
8962742 ± 3% +22.1% 10943457 ± 3% meminfo.Shmem
16392623 +14.4% 18753189 ± 2% meminfo.max_used_kB
38913 +200.8% 117050 netperf.ThroughputBoth_Mbps
3735655 +200.8% 11236826 netperf.ThroughputBoth_total_Mbps
18515 +201.7% 55862 netperf.ThroughputRecv_Mbps
1777441 +201.7% 5362763 netperf.ThroughputRecv_total_Mbps
20398 +200.0% 61188 netperf.Throughput_Mbps
1958214 +200.0% 5874063 netperf.Throughput_total_Mbps
88004812 ± 8% -78.3% 19110782 ± 18% netperf.time.involuntary_context_switches
41333 +18.4% 48917 netperf.time.minor_page_faults
9067 -24.1% 6883 netperf.time.percent_of_cpu_this_job_got
27208 -25.1% 20391 netperf.time.system_time
158.64 +115.0% 341.02 netperf.time.user_time
2.139e+09 +200.8% 6.433e+09 netperf.workload
2441370 ± 3% +21.6% 2967589 ± 3% proc-vmstat.nr_active_anon
201274 +15.5% 232430 proc-vmstat.nr_anon_pages
6049392 -1.0% 5989283 proc-vmstat.nr_dirty_background_threshold
12113577 -1.0% 11993210 proc-vmstat.nr_dirty_threshold
3146655 ± 2% +15.7% 3641743 ± 2% proc-vmstat.nr_file_pages
60865222 -1.0% 60263229 proc-vmstat.nr_free_pages
60728673 -0.9% 60156480 proc-vmstat.nr_free_pages_blocks
2241122 ± 3% +22.1% 2736223 ± 3% proc-vmstat.nr_shmem
43088 +2.6% 44210 proc-vmstat.nr_slab_reclaimable
2441370 ± 3% +21.6% 2967589 ± 3% proc-vmstat.nr_zone_active_anon
46931 ± 26% +1165.5% 593929 ± 21% proc-vmstat.numa_hint_faults
35701 ± 34% +1506.1% 573389 ± 22% proc-vmstat.numa_hint_faults_local
2.057e+09 +212.3% 6.425e+09 proc-vmstat.numa_hit
2.057e+09 +212.3% 6.424e+09 proc-vmstat.numa_local
10954 ± 3% +77.0% 19391 ± 2% proc-vmstat.numa_pages_migrated
95835 ± 35% +588.2% 659561 ± 20% proc-vmstat.numa_pte_updates
1.641e+10 +212.8% 5.132e+10 proc-vmstat.pgalloc_normal
1186751 +45.6% 1727586 ± 7% proc-vmstat.pgfault
1.641e+10 +212.8% 5.132e+10 proc-vmstat.pgfree
10954 ± 3% +77.0% 19391 ± 2% proc-vmstat.pgmigrate_success
48468 +7.4% 52040 proc-vmstat.pgreuse
1.689e+10 +108.1% 3.514e+10 perf-stat.i.branch-instructions
38251661 +111.9% 81050915 perf-stat.i.branch-misses
0.92 ± 45% +3.6 4.57 ± 41% perf-stat.i.cache-miss-rate%
5.1e+09 -84.4% 7.944e+08 ± 13% perf-stat.i.cache-references
3325973 ± 2% +251.7% 11696843 ± 2% perf-stat.i.context-switches
6.70 -56.4% 2.92 perf-stat.i.cpi
5.565e+11 -3.1% 5.393e+11 perf-stat.i.cpu-cycles
2315 ± 14% +1249.5% 31253 ± 9% perf-stat.i.cpu-migrations
8.352e+10 +121.5% 1.85e+11 perf-stat.i.instructions
0.15 +124.3% 0.34 perf-stat.i.ipc
17.32 ± 2% +251.7% 60.92 ± 2% perf-stat.i.metric.K/sec
3553 +50.1% 5334 ± 8% perf-stat.i.minor-faults
3553 +50.1% 5334 ± 8% perf-stat.i.page-faults
0.62 ± 68% +3.8 4.40 ± 41% perf-stat.overall.cache-miss-rate%
6.66 -56.2% 2.92 perf-stat.overall.cpi
0.15 +128.5% 0.34 perf-stat.overall.ipc
11795 -26.6% 8659 perf-stat.overall.path-length
1.683e+10 +108.1% 3.502e+10 perf-stat.ps.branch-instructions
38128826 +111.9% 80783216 perf-stat.ps.branch-misses
5.083e+09 -84.4% 7.918e+08 ± 13% perf-stat.ps.cache-references
3315367 ± 2% +251.7% 11658619 ± 2% perf-stat.ps.context-switches
5.547e+11 -3.1% 5.375e+11 perf-stat.ps.cpu-cycles
2310 ± 14% +1247.8% 31146 ± 9% perf-stat.ps.cpu-migrations
8.326e+10 +121.4% 1.844e+11 perf-stat.ps.instructions
3534 +50.0% 5302 ± 8% perf-stat.ps.minor-faults
3534 +50.0% 5302 ± 8% perf-stat.ps.page-faults
2.523e+13 +120.8% 5.57e+13 perf-stat.total.instructions
26243720 -28.6% 18749089 sched_debug.cfs_rq:/.avg_vruntime.avg
28003592 -25.9% 20752934 ± 4% sched_debug.cfs_rq:/.avg_vruntime.max
25141866 -32.7% 16920920 sched_debug.cfs_rq:/.avg_vruntime.min
0.30 ± 4% +26.3% 0.38 ± 2% sched_debug.cfs_rq:/.h_nr_queued.stddev
0.30 ± 4% +28.8% 0.38 ± 2% sched_debug.cfs_rq:/.h_nr_runnable.stddev
209424 ± 34% +169.4% 564215 ± 12% sched_debug.cfs_rq:/.left_deadline.avg
2071516 ± 26% +50.9% 3126519 ± 6% sched_debug.cfs_rq:/.left_deadline.stddev
209420 ± 34% +169.4% 564200 ± 12% sched_debug.cfs_rq:/.left_vruntime.avg
2071477 ± 26% +50.9% 3126440 ± 6% sched_debug.cfs_rq:/.left_vruntime.stddev
26243720 -28.6% 18749089 sched_debug.cfs_rq:/.min_vruntime.avg
28003592 -25.9% 20752934 ± 4% sched_debug.cfs_rq:/.min_vruntime.max
25141866 -32.7% 16920920 sched_debug.cfs_rq:/.min_vruntime.min
0.27 ± 3% +24.1% 0.33 ± 2% sched_debug.cfs_rq:/.nr_queued.stddev
209420 ± 34% +169.4% 564200 ± 12% sched_debug.cfs_rq:/.right_vruntime.avg
2071477 ± 26% +50.9% 3126440 ± 6% sched_debug.cfs_rq:/.right_vruntime.stddev
209.56 ± 6% +32.0% 276.63 ± 2% sched_debug.cfs_rq:/.runnable_avg.stddev
192.89 ± 5% +31.7% 253.94 ± 2% sched_debug.cfs_rq:/.util_avg.stddev
819031 ± 3% -65.5% 282924 ± 8% sched_debug.cpu.avg_idle.avg
4756 ± 2% -37.0% 2995 ± 2% sched_debug.cpu.avg_idle.min
1084177 -55.0% 487973 ± 11% sched_debug.cpu.avg_idle.stddev
740.98 ± 24% +55.0% 1148 ± 24% sched_debug.cpu.clock_task.stddev
2338 ± 5% +22.8% 2872 sched_debug.cpu.curr->pid.stddev
742501 ± 19% +46.3% 1085919 ± 5% sched_debug.cpu.max_idle_balance_cost.min
106154 ± 12% -44.8% 58635 ± 9% sched_debug.cpu.max_idle_balance_cost.stddev
0.00 ± 17% -20.6% 0.00 ± 3% sched_debug.cpu.next_balance.stddev
0.31 ± 5% +24.8% 0.39 sched_debug.cpu.nr_running.stddev
2579918 ± 2% +252.5% 9093241 ± 2% sched_debug.cpu.nr_switches.avg
3594985 +181.0% 10103415 ± 4% sched_debug.cpu.nr_switches.max
1042225 ± 52% +546.5% 6737845 ± 14% sched_debug.cpu.nr_switches.min
135.08 ± 23% -24.8% 101.56 ± 18% sched_debug.cpu.nr_uninterruptible.max
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki