Message-ID: <202509261609.dec14b91-lkp@intel.com>
Date: Fri, 26 Sep 2025 16:40:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Paolo Abeni
	<pabeni@...hat.com>, Willem de Bruijn <willemb@...gle.com>, David Ahern
	<dsahern@...nel.org>, Kuniyuki Iwashima <kuniyu@...gle.com>, Jakub Kicinski
	<kuba@...nel.org>, <netdev@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linux-next:master] [udp]  6471658dc6:  netperf.Throughput_Mbps
 200.0% improvement



Hello,

kernel test robot noticed a 200.0% improvement of netperf.Throughput_Mbps on:


commit: 6471658dc66c670580a7616e75f51b52917e7883 ("udp: use skb_attempt_defer_free()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
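
For context on the technique this commit names: skb_attempt_defer_free() hands a
to-be-freed skb back to the CPU that allocated it, so buffers get recycled where
their cache lines are still hot rather than being freed on the remote consuming
CPU. The snippet below is a minimal userspace sketch of that deferred-free
pattern under stated assumptions -- it is not the kernel implementation and all
names in it are hypothetical.

/*
 * Sketch of the deferred-free idea behind skb_attempt_defer_free():
 * the consumer does not free() a buffer itself, it pushes it onto a
 * lock-free list owned by the allocating thread, which later drains
 * the list and frees in batch. Illustrative only, not kernel code.
 */
#include <stdatomic.h>
#include <stdlib.h>

struct buf {
	struct buf *next;
	char payload[64];
};

/* One defer list per allocating CPU/thread (a single one here). */
static _Atomic(struct buf *) defer_list;

/* Consumer side: hand the buffer back instead of freeing it. */
static void defer_free(struct buf *b)
{
	struct buf *old = atomic_load(&defer_list);

	do {
		b->next = old;
	} while (!atomic_compare_exchange_weak(&defer_list, &old, b));
}

/* Allocator side: periodically drain what was deferred to us. */
static void drain_defer_list(void)
{
	struct buf *b = atomic_exchange(&defer_list, NULL);

	while (b) {
		struct buf *next = b->next;

		free(b);	/* freed by the thread that malloc()ed it */
		b = next;
	}
}

int main(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return 1;
	defer_free(b);		/* consumer path */
	drain_defer_list();	/* allocator path, batched */
	return 0;
}

Read this way, the profile shift in the perf-stat section below (cache-references
-84.4%, cpi -56.4%) would be consistent with frees no longer bouncing buffer
cache lines between sender and receiver CPUs, though that attribution is an
inference from the numbers, not something the report states.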


testcase: netperf
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 50%
	cluster: cs-localhost
	test: UDP_STREAM
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250926/202509261609.dec14b91-lkp@intel.com

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-14/performance/ipv4/x86_64-rhel-9.4/50%/debian-13-x86_64-20250902.cgz/300s/lkp-srf-2sp3/UDP_STREAM/netperf

commit: 
  3cd04c8f4a ("udp: make busylock per socket")
  6471658dc6 ("udp: use skb_attempt_defer_free()")

3cd04c8f4afed71a 6471658dc66c670580a7616e75f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 6.079e+09 ±  4%     +47.8%  8.983e+09        cpuidle..time
 4.012e+08          +320.9%  1.689e+09 ±  2%  cpuidle..usage
   9360404           +22.3%   11449805 ±  4%  numa-meminfo.node1.Active
   9360396           +22.3%   11449799 ±  4%  numa-meminfo.node1.Active(anon)
   8894257 ±  3%     +22.2%   10867440 ±  3%  numa-meminfo.node1.Shmem
 1.044e+09 ±  3%    +206.8%  3.203e+09        numa-numastat.node0.local_node
 1.044e+09 ±  3%    +206.8%  3.204e+09        numa-numastat.node0.numa_hit
 1.013e+09 ±  2%    +218.0%  3.221e+09        numa-numastat.node1.local_node
 1.013e+09 ±  2%    +218.0%  3.221e+09        numa-numastat.node1.numa_hit
      9.93 ±  5%      +4.4       14.28        mpstat.cpu.all.idle%
      0.59            +1.3        1.89        mpstat.cpu.all.irq%
      1.77           +12.4       14.15 ±  2%  mpstat.cpu.all.soft%
     86.78           -18.8       67.94        mpstat.cpu.all.sys%
      0.93            +0.8        1.74        mpstat.cpu.all.usr%
 1.044e+09 ±  3%    +206.9%  3.204e+09        numa-vmstat.node0.numa_hit
 1.044e+09 ±  3%    +206.9%  3.203e+09        numa-vmstat.node0.numa_local
   2339319           +22.4%    2863299 ±  4%  numa-vmstat.node1.nr_active_anon
   2222754 ±  3%     +22.3%    2717697 ±  3%  numa-vmstat.node1.nr_shmem
   2339318           +22.4%    2863297 ±  4%  numa-vmstat.node1.nr_zone_active_anon
 1.013e+09 ±  2%    +218.1%  3.221e+09        numa-vmstat.node1.numa_hit
 1.013e+09 ±  2%    +218.1%  3.221e+09        numa-vmstat.node1.numa_local
   9763805 ±  3%     +21.6%   11869079 ±  3%  meminfo.Active
   9763788 ±  3%     +21.6%   11869062 ±  3%  meminfo.Active(anon)
    805138           +15.5%     929863        meminfo.AnonPages
  12584871 ±  2%     +15.7%   14565534 ±  2%  meminfo.Cached
   9930753 ±  3%     +21.2%   12038335 ±  3%  meminfo.Committed_AS
  16167194           +14.9%   18577687 ±  2%  meminfo.Memused
   8962742 ±  3%     +22.1%   10943457 ±  3%  meminfo.Shmem
  16392623           +14.4%   18753189 ±  2%  meminfo.max_used_kB
     38913          +200.8%     117050        netperf.ThroughputBoth_Mbps
   3735655          +200.8%   11236826        netperf.ThroughputBoth_total_Mbps
     18515          +201.7%      55862        netperf.ThroughputRecv_Mbps
   1777441          +201.7%    5362763        netperf.ThroughputRecv_total_Mbps
     20398          +200.0%      61188        netperf.Throughput_Mbps
   1958214          +200.0%    5874063        netperf.Throughput_total_Mbps
  88004812 ±  8%     -78.3%   19110782 ± 18%  netperf.time.involuntary_context_switches
     41333           +18.4%      48917        netperf.time.minor_page_faults
      9067           -24.1%       6883        netperf.time.percent_of_cpu_this_job_got
     27208           -25.1%      20391        netperf.time.system_time
    158.64          +115.0%     341.02        netperf.time.user_time
 2.139e+09          +200.8%  6.433e+09        netperf.workload
   2441370 ±  3%     +21.6%    2967589 ±  3%  proc-vmstat.nr_active_anon
    201274           +15.5%     232430        proc-vmstat.nr_anon_pages
   6049392            -1.0%    5989283        proc-vmstat.nr_dirty_background_threshold
  12113577            -1.0%   11993210        proc-vmstat.nr_dirty_threshold
   3146655 ±  2%     +15.7%    3641743 ±  2%  proc-vmstat.nr_file_pages
  60865222            -1.0%   60263229        proc-vmstat.nr_free_pages
  60728673            -0.9%   60156480        proc-vmstat.nr_free_pages_blocks
   2241122 ±  3%     +22.1%    2736223 ±  3%  proc-vmstat.nr_shmem
     43088            +2.6%      44210        proc-vmstat.nr_slab_reclaimable
   2441370 ±  3%     +21.6%    2967589 ±  3%  proc-vmstat.nr_zone_active_anon
     46931 ± 26%   +1165.5%     593929 ± 21%  proc-vmstat.numa_hint_faults
     35701 ± 34%   +1506.1%     573389 ± 22%  proc-vmstat.numa_hint_faults_local
 2.057e+09          +212.3%  6.425e+09        proc-vmstat.numa_hit
 2.057e+09          +212.3%  6.424e+09        proc-vmstat.numa_local
     10954 ±  3%     +77.0%      19391 ±  2%  proc-vmstat.numa_pages_migrated
     95835 ± 35%    +588.2%     659561 ± 20%  proc-vmstat.numa_pte_updates
 1.641e+10          +212.8%  5.132e+10        proc-vmstat.pgalloc_normal
   1186751           +45.6%    1727586 ±  7%  proc-vmstat.pgfault
 1.641e+10          +212.8%  5.132e+10        proc-vmstat.pgfree
     10954 ±  3%     +77.0%      19391 ±  2%  proc-vmstat.pgmigrate_success
     48468            +7.4%      52040        proc-vmstat.pgreuse
 1.689e+10          +108.1%  3.514e+10        perf-stat.i.branch-instructions
  38251661          +111.9%   81050915        perf-stat.i.branch-misses
      0.92 ± 45%      +3.6        4.57 ± 41%  perf-stat.i.cache-miss-rate%
   5.1e+09           -84.4%  7.944e+08 ± 13%  perf-stat.i.cache-references
   3325973 ±  2%    +251.7%   11696843 ±  2%  perf-stat.i.context-switches
      6.70           -56.4%       2.92        perf-stat.i.cpi
 5.565e+11            -3.1%  5.393e+11        perf-stat.i.cpu-cycles
      2315 ± 14%   +1249.5%      31253 ±  9%  perf-stat.i.cpu-migrations
 8.352e+10          +121.5%   1.85e+11        perf-stat.i.instructions
      0.15          +124.3%       0.34        perf-stat.i.ipc
     17.32 ±  2%    +251.7%      60.92 ±  2%  perf-stat.i.metric.K/sec
      3553           +50.1%       5334 ±  8%  perf-stat.i.minor-faults
      3553           +50.1%       5334 ±  8%  perf-stat.i.page-faults
      0.62 ± 68%      +3.8        4.40 ± 41%  perf-stat.overall.cache-miss-rate%
      6.66           -56.2%       2.92        perf-stat.overall.cpi
      0.15          +128.5%       0.34        perf-stat.overall.ipc
     11795           -26.6%       8659        perf-stat.overall.path-length
 1.683e+10          +108.1%  3.502e+10        perf-stat.ps.branch-instructions
  38128826          +111.9%   80783216        perf-stat.ps.branch-misses
 5.083e+09           -84.4%  7.918e+08 ± 13%  perf-stat.ps.cache-references
   3315367 ±  2%    +251.7%   11658619 ±  2%  perf-stat.ps.context-switches
 5.547e+11            -3.1%  5.375e+11        perf-stat.ps.cpu-cycles
      2310 ± 14%   +1247.8%      31146 ±  9%  perf-stat.ps.cpu-migrations
 8.326e+10          +121.4%  1.844e+11        perf-stat.ps.instructions
      3534           +50.0%       5302 ±  8%  perf-stat.ps.minor-faults
      3534           +50.0%       5302 ±  8%  perf-stat.ps.page-faults
 2.523e+13          +120.8%   5.57e+13        perf-stat.total.instructions
  26243720           -28.6%   18749089        sched_debug.cfs_rq:/.avg_vruntime.avg
  28003592           -25.9%   20752934 ±  4%  sched_debug.cfs_rq:/.avg_vruntime.max
  25141866           -32.7%   16920920        sched_debug.cfs_rq:/.avg_vruntime.min
      0.30 ±  4%     +26.3%       0.38 ±  2%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      0.30 ±  4%     +28.8%       0.38 ±  2%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
    209424 ± 34%    +169.4%     564215 ± 12%  sched_debug.cfs_rq:/.left_deadline.avg
   2071516 ± 26%     +50.9%    3126519 ±  6%  sched_debug.cfs_rq:/.left_deadline.stddev
    209420 ± 34%    +169.4%     564200 ± 12%  sched_debug.cfs_rq:/.left_vruntime.avg
   2071477 ± 26%     +50.9%    3126440 ±  6%  sched_debug.cfs_rq:/.left_vruntime.stddev
  26243720           -28.6%   18749089        sched_debug.cfs_rq:/.min_vruntime.avg
  28003592           -25.9%   20752934 ±  4%  sched_debug.cfs_rq:/.min_vruntime.max
  25141866           -32.7%   16920920        sched_debug.cfs_rq:/.min_vruntime.min
      0.27 ±  3%     +24.1%       0.33 ±  2%  sched_debug.cfs_rq:/.nr_queued.stddev
    209420 ± 34%    +169.4%     564200 ± 12%  sched_debug.cfs_rq:/.right_vruntime.avg
   2071477 ± 26%     +50.9%    3126440 ±  6%  sched_debug.cfs_rq:/.right_vruntime.stddev
    209.56 ±  6%     +32.0%     276.63 ±  2%  sched_debug.cfs_rq:/.runnable_avg.stddev
    192.89 ±  5%     +31.7%     253.94 ±  2%  sched_debug.cfs_rq:/.util_avg.stddev
    819031 ±  3%     -65.5%     282924 ±  8%  sched_debug.cpu.avg_idle.avg
      4756 ±  2%     -37.0%       2995 ±  2%  sched_debug.cpu.avg_idle.min
   1084177           -55.0%     487973 ± 11%  sched_debug.cpu.avg_idle.stddev
    740.98 ± 24%     +55.0%       1148 ± 24%  sched_debug.cpu.clock_task.stddev
      2338 ±  5%     +22.8%       2872        sched_debug.cpu.curr->pid.stddev
    742501 ± 19%     +46.3%    1085919 ±  5%  sched_debug.cpu.max_idle_balance_cost.min
    106154 ± 12%     -44.8%      58635 ±  9%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.00 ± 17%     -20.6%       0.00 ±  3%  sched_debug.cpu.next_balance.stddev
      0.31 ±  5%     +24.8%       0.39        sched_debug.cpu.nr_running.stddev
   2579918 ±  2%    +252.5%    9093241 ±  2%  sched_debug.cpu.nr_switches.avg
   3594985          +181.0%   10103415 ±  4%  sched_debug.cpu.nr_switches.max
   1042225 ± 52%    +546.5%    6737845 ± 14%  sched_debug.cpu.nr_switches.min
    135.08 ± 23%     -24.8%     101.56 ± 18%  sched_debug.cpu.nr_uninterruptible.max




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

