Message-ID: <202508180406.dbf438fc-lkp@intel.com>
Date: Mon, 18 Aug 2025 12:48:06 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Jakub Kicinski <kuba@...nel.org>, Kuniyuki Iwashima <kuniyu@...gle.com>,
	<netdev@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [tcp] 1d2fbaad7c: stress-ng.sigurg.ops_per_sec 12.2% regression


Hello,

kernel test robot noticed a 12.2% regression of stress-ng.sigurg.ops_per_sec on:


commit: 1d2fbaad7cd8cc96899179f9898ad2787a15f0a0 ("tcp: stronger sk_rcvbuf checks")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on      linus/master d7ee5bdce7892643409dea7266c34977e651b479]
[still regression on linux-next/master 1357b2649c026b51353c84ddd32bc963e8999603]
[still regression on        fix commit 972ca7a3bc9a136b15ba698713b056a4900e2634]

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sigurg
	cpufreq_governor: performance
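The parameters above can be approximated outside the lkp harness with a direct stress-ng invocation; this is a hedged sketch (lkp-tests drives stress-ng through its own job files, so the flags below are an approximation of the reported configuration, not the exact reproduction command):

```shell
# Approximate the reported job: sigurg stressor, one worker per CPU
# (nr_threads: 100%), 60 second run, with per-stressor metrics.
command -v stress-ng >/dev/null || { echo "stress-ng not installed"; exit 0; }
stress-ng --sigurg "$(nproc)" --timeout 60s --metrics-brief
```

For the full reproduction (including the cpufreq governor setting and rootfs), use the lkp-tests materials linked below.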


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202508180406.dbf438fc-lkp@intel.com


Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250818/202508180406.dbf438fc-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sigurg/stress-ng/60s

commit: 
  75dff0584c ("tcp: add const to tcp_try_rmem_schedule() and sk_rmem_schedule() skb")
  1d2fbaad7c ("tcp: stronger sk_rcvbuf checks")

75dff0584cce7920 1d2fbaad7cd8cc96899179f9898 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     36434            +7.6%      39205        vmstat.system.cs
   5683321           -13.3%    4926200 ±  2%  vmstat.system.in
    530991 ±  2%      -9.3%     481619 ±  3%  meminfo.Mapped
   1132865           -13.5%     979753        meminfo.SUnreclaim
   1292406           -11.9%    1138096        meminfo.Slab
      0.62 ±  2%      +0.1        0.70        mpstat.cpu.all.irq%
     24.14            -8.3       15.83 ±  2%  mpstat.cpu.all.soft%
     10.95            +2.3       13.22        mpstat.cpu.all.usr%
    627541 ±  4%     -15.4%     530831 ±  5%  numa-meminfo.node0.SUnreclaim
    721419 ±  3%     -14.0%     620592 ±  8%  numa-meminfo.node0.Slab
    513808 ±  6%     -13.1%     446297 ±  4%  numa-meminfo.node1.SUnreclaim
   6100681           -23.2%    4686698 ±  2%  numa-numastat.node0.local_node
   6205260           -22.6%    4802561 ±  2%  numa-numastat.node0.numa_hit
   5548582           -18.0%    4547552        numa-numastat.node1.local_node
   5676020           -17.8%    4663456        numa-numastat.node1.numa_hit
     22382 ±  2%     -37.0%      14107 ±  4%  perf-c2c.DRAM.local
     28565 ± 14%     -28.5%      20433 ± 19%  perf-c2c.DRAM.remote
     61612 ±  4%     -28.7%      43958 ± 10%  perf-c2c.HITM.local
     18329 ± 14%     -27.0%      13378 ± 19%  perf-c2c.HITM.remote
     79941           -28.3%      57336 ±  6%  perf-c2c.HITM.total
    155304 ±  4%     -14.4%     132870 ±  5%  numa-vmstat.node0.nr_slab_unreclaimable
   6217921           -22.8%    4801413 ±  2%  numa-vmstat.node0.numa_hit
   6113343           -23.4%    4685551 ±  2%  numa-vmstat.node0.numa_local
    127106 ±  6%     -12.0%     111885 ±  4%  numa-vmstat.node1.nr_slab_unreclaimable
   5686635           -18.0%    4662431        numa-vmstat.node1.numa_hit
   5559197           -18.2%    4546532        numa-vmstat.node1.numa_local
  3.39e+08           -12.2%  2.977e+08        stress-ng.sigurg.ops
   5652273           -12.2%    4963242        stress-ng.sigurg.ops_per_sec
   1885719           +11.0%    2092671        stress-ng.time.involuntary_context_switches
     16523           +11.2%      18365        stress-ng.time.percent_of_cpu_this_job_got
      8500            +9.2%       9278        stress-ng.time.system_time
      1438           +23.0%       1769        stress-ng.time.user_time
    195971            -6.0%     184305        stress-ng.time.voluntary_context_switches
    487113 ±  7%      -5.8%     459038        proc-vmstat.nr_active_anon
    134039            -9.5%     121269 ±  4%  proc-vmstat.nr_mapped
    186858 ± 20%     -15.3%     158269 ±  2%  proc-vmstat.nr_shmem
    284955 ±  2%     -13.8%     245616        proc-vmstat.nr_slab_unreclaimable
    487113 ±  7%      -5.8%     459038        proc-vmstat.nr_zone_active_anon
  11891822           -20.5%    9456122        proc-vmstat.numa_hit
  11659806           -20.9%    9224357        proc-vmstat.numa_local
  86214365           -22.0%   67254297        proc-vmstat.pgalloc_normal
  85564410           -21.8%   66883184        proc-vmstat.pgfree
   6156738           +13.9%    7012286        sched_debug.cfs_rq:/.avg_vruntime.avg
   7693151           +10.1%    8468818        sched_debug.cfs_rq:/.avg_vruntime.max
   4636464 ±  5%     +14.2%    5295369 ±  4%  sched_debug.cfs_rq:/.avg_vruntime.min
    238.39 ± 92%    +228.2%     782.32 ± 46%  sched_debug.cfs_rq:/.load_avg.avg
   6156739           +13.9%    7012287        sched_debug.cfs_rq:/.min_vruntime.avg
   7693151           +10.1%    8468818        sched_debug.cfs_rq:/.min_vruntime.max
   4636464 ±  5%     +14.2%    5295369 ±  4%  sched_debug.cfs_rq:/.min_vruntime.min
      2580 ±  3%     -13.3%       2236 ±  8%  sched_debug.cfs_rq:/.runnable_avg.max
    104496 ± 28%     -64.4%      37246 ± 38%  sched_debug.cpu.avg_idle.min
      1405 ±  3%     +12.7%       1583 ±  2%  sched_debug.cpu.nr_switches.stddev
      0.68 ±  3%     -40.9%       0.40 ±  3%  perf-stat.i.MPKI
 9.475e+10           +26.6%  1.199e+11        perf-stat.i.branch-instructions
      0.13 ±  5%      -0.0        0.09 ±  2%  perf-stat.i.branch-miss-rate%
 1.178e+08 ±  3%     -14.9%  1.003e+08        perf-stat.i.branch-misses
     40.25            -3.2       37.02        perf-stat.i.cache-miss-rate%
 3.325e+08 ±  2%     -25.9%  2.465e+08 ±  3%  perf-stat.i.cache-misses
 8.258e+08           -19.2%  6.672e+08        perf-stat.i.cache-references
     37598            +8.1%      40642        perf-stat.i.context-switches
      1.31           -21.4%       1.03        perf-stat.i.cpi
      2327           -15.3%       1970 ±  2%  perf-stat.i.cpu-migrations
      1927 ±  2%     +33.7%       2577 ±  3%  perf-stat.i.cycles-between-cache-misses
 4.888e+11           +26.3%  6.174e+11        perf-stat.i.instructions
      0.77           +26.9%       0.98        perf-stat.i.ipc
      0.68 ±  3%     -41.3%       0.40 ±  3%  perf-stat.overall.MPKI
      0.12 ±  4%      -0.0        0.08 ±  2%  perf-stat.overall.branch-miss-rate%
     40.27            -3.3       36.95        perf-stat.overall.cache-miss-rate%
      1.31           -21.5%       1.03        perf-stat.overall.cpi
      1928 ±  2%     +33.8%       2581 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.76           +27.4%       0.97        perf-stat.overall.ipc
 9.264e+10           +26.7%  1.173e+11        perf-stat.ps.branch-instructions
 1.148e+08 ±  3%     -14.8%   97834009        perf-stat.ps.branch-misses
 3.253e+08 ±  2%     -25.8%  2.413e+08 ±  3%  perf-stat.ps.cache-misses
 8.077e+08           -19.2%   6.53e+08        perf-stat.ps.cache-references
     36742            +8.2%      39741        perf-stat.ps.context-switches
      2273           -15.4%       1922 ±  2%  perf-stat.ps.cpu-migrations
 4.779e+11           +26.4%  6.041e+11        perf-stat.ps.instructions
 2.914e+13           +27.4%  3.711e+13        perf-stat.total.instructions
      4.41 ±  4%     -17.7%       3.63 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      6.74           -12.8%       5.87 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      5.30           -19.6%       4.26 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      5.22           -23.4%       4.00 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.kmalloc_reserve.__alloc_skb.tcp_stream_alloc_skb
      4.83 ±  4%     -15.7%       4.08 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      5.20 ±  2%     -17.9%       4.27        perf-sched.sch_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      4.92 ±  2%     -16.5%       4.11 ±  2%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5.75 ± 18%     -88.1%       0.69 ±115%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      5.00 ±  3%     -16.2%       4.19 ±  2%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.35 ± 15%     -37.1%       0.22 ± 12%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      3.47           -16.7%       2.89 ±  3%  perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      0.13 ±  7%     -14.8%       0.11 ±  8%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     33.82 ± 55%     -45.0%      18.60 ± 16%  perf-sched.sch_delay.max.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
     36.83 ± 10%     -32.7%      24.80 ± 10%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
     10.05 ± 49%     -58.0%       4.22 ± 33%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
     55.68 ±  9%     -16.5%      46.49 ± 17%  perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      6.12 ± 18%     -81.2%       1.15 ±116%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      7.91 ± 27%     -39.8%       4.77 ± 24%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      1.73 ±104%     -99.1%       0.02 ± 44%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      4.56 ±  2%     -15.3%       3.86 ±  2%  perf-sched.total_sch_delay.average.ms
     26.64 ±  2%      -9.6%      24.08        perf-sched.total_wait_and_delay.average.ms
     22.08 ±  2%      -8.5%      20.21 ±  2%  perf-sched.total_wait_time.average.ms
     13.50           -12.5%      11.82 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
     15.61          -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      9.72 ±  4%     -15.7%       8.20        perf-sched.wait_and_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
     15.17 ±  6%     -21.9%      11.85 ±  3%  perf-sched.wait_and_delay.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     11.74 ±  2%     -12.5%      10.27 ±  2%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.39 ±  4%     -13.0%       9.04 ±  2%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      7.06           -15.7%       5.95 ±  3%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      2317           -49.6%       1169 ±  9%  perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      1488 ±  5%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
      1953 ±  8%    +347.2%       8733 ±  4%  perf-sched.wait_and_delay.count.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      2360 ±  5%    +251.5%       8296 ±  6%  perf-sched.wait_and_delay.count.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     22781 ±  3%     +16.7%      26578 ±  3%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     13753 ±  4%     -14.8%      11717 ±  3%  perf-sched.wait_and_delay.count.schedule_timeout.wait_woken.sk_stream_wait_memory.tcp_sendmsg_locked
      6038 ±  2%     -12.6%       5275 ±  7%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     71.60 ±  8%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.__alloc_skb.tcp_stream_alloc_skb.tcp_sendmsg_locked
     53.03 ±  7%    +140.7%     127.64 ± 45%  perf-sched.wait_and_delay.max.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
    435.94 ±122%    +263.3%       1583 ± 27%  perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
    987.24 ± 22%     +59.8%       1577 ±  6%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      4.49 ±  4%     -16.2%       3.76 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet_recvmsg
      6.77           -12.1%       5.95 ±  4%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_sendmsg.__sys_sendto
      4.89 ±  4%     -15.7%       4.12        perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      9.97 ±  9%     -24.0%       7.58 ±  5%  perf-sched.wait_time.avg.ms.__cond_resched.lock_sock_nested.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      6.82 ±  2%      -9.6%       6.17        perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5.75 ± 18%     -87.3%       0.73 ±113%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      2.01 ± 14%     +38.1%       2.78 ±  7%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      2.38 ±  7%     -12.0%       2.09 ±  8%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      5.50 ±  3%     +24.4%       6.84 ±  7%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      3.59           -14.7%       3.06 ±  3%  perf-sched.wait_time.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
     26.85 ±  6%    +311.9%     110.60 ± 63%  perf-sched.wait_time.max.ms.__cond_resched.lock_sock_nested.tcp_recvmsg.inet_recvmsg.sock_recvmsg
      6.12 ± 18%     -81.2%       1.15 ±116%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
    985.54 ± 22%     +59.9%       1576 ±  6%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      2411 ± 57%     -74.1%     623.65 ± 38%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     17.94 ± 19%     -38.0%      11.12 ± 19%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    249.22 ± 28%     +48.9%     371.19 ± 16%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

