Message-ID: <202511251654.c89d08f7-lkp@intel.com>
Date: Tue, 25 Nov 2025 16:27:52 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>, Ingo Molnar <mingo@...nel.org>, Peter Zijlstra
	<peterz@...radead.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	<oliver.sang@...el.com>
Subject: [tip:core/rseq] [rseq]  e2d4f42271:
 stress-ng.sem.sem_wait_calls_per_sec 2.3% improvement



Hello,

The kernel test robot noticed a 2.3% improvement in stress-ng.sem.sem_wait_calls_per_sec on:


commit: e2d4f42271155045a49b89530f2c06ad8e9f1a1e ("rseq: Rework the TIF_NOTIFY handler")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git core/rseq


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads, 2 sockets, Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sem
	cpufreq_governor: performance

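For reference, the parameters above map roughly onto a plain stress-ng invocation; this is a hand-written sketch of that mapping, not the exact lkp-tests command line, which wraps stress-ng in its own harness:

```shell
# Approximate stress-ng equivalent of the job parameters above
# (exact lkp-tests invocation may differ):
#   nr_threads: 100%  -> one sem worker per online CPU
#   testtime: 60s     -> --timeout 60s
#   test: sem         -> the --sem stressor (sem_wait/sem_post loop)
# cpufreq_governor is set separately via sysfs by the harness.
nproc=$(getconf _NPROCESSORS_ONLN)
echo "stress-ng --sem ${nproc} --timeout 60s --metrics-brief"
```

Printing the command rather than running it keeps the sketch safe on machines without stress-ng installed; drop the `echo` to actually run the stressor.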


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251125/202511251654.c89d08f7-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sem/stress-ng/60s

commit: 
  9f6ffd4ceb ("rseq: Separate the signal delivery path")
  e2d4f42271 ("rseq: Rework the TIF_NOTIFY handler")

9f6ffd4cebda8684 e2d4f42271155045a49b89530f2 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    796.49 ±  5%     +15.3%     918.13 ± 12%  sched_debug.cpu.curr->pid.stddev
     19498 ±  3%     +18.9%      23177 ±  5%  numa-meminfo.node0.KernelStack
     22517 ±  2%     -15.5%      19034 ±  7%  numa-meminfo.node1.KernelStack
     41569 ± 58%    +192.9%     121756 ± 44%  numa-numastat.node0.other_node
    156472 ± 15%     -51.3%      76228 ± 70%  numa-numastat.node1.other_node
  17079533 ±  7%     -23.7%   13033056 ± 45%  perf-sched.total_wait_and_delay.count.ms
  17079533 ±  7%     -23.7%   13033056 ± 45%  perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
    385459            +2.3%     394214        stress-ng.sem.sem_timedwait_calls_per_sec
    385418            +2.3%     394172        stress-ng.sem.sem_wait_calls_per_sec
      3947            +2.2%       4035        stress-ng.time.user_time
     19505 ±  3%     +18.8%      23176 ±  5%  numa-vmstat.node0.nr_kernel_stack
     41569 ± 58%    +192.9%     121756 ± 44%  numa-vmstat.node0.numa_other
     22534 ±  2%     -15.5%      19034 ±  7%  numa-vmstat.node1.nr_kernel_stack
    156472 ± 15%     -51.3%      76228 ± 70%  numa-vmstat.node1.numa_other
  1.51e+11            +1.9%  1.538e+11        perf-stat.i.branch-instructions
      0.88            -1.9%       0.87        perf-stat.i.cpi
     81367 ± 10%     +58.9%     129281 ± 30%  perf-stat.i.cycles-between-cache-misses
 6.983e+11            +1.7%  7.105e+11        perf-stat.i.instructions
      1.15            +1.7%       1.17        perf-stat.i.ipc
      0.87            -1.7%       0.85        perf-stat.overall.cpi
      1.16            +1.8%       1.18        perf-stat.overall.ipc
 1.484e+11            +1.9%  1.513e+11        perf-stat.ps.branch-instructions
 6.864e+11            +1.8%  6.986e+11        perf-stat.ps.instructions
 4.197e+13            +2.1%  4.284e+13        perf-stat.total.instructions
      5.09            -2.0        3.14 ±  2%  perf-profile.calltrace.cycles-pp.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      6.22            -1.9        4.29 ±  2%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     58.07            -0.9       57.21        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     58.59            -0.9       57.74        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      2.45            +0.2        2.67 ±  2%  perf-profile.calltrace.cycles-pp.rseq_set_ids_get_csaddr.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.50            +0.3        9.76        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
      9.71            +0.3        9.98        perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
     18.18            +0.4       18.57        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
     18.37            +0.4       18.77        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     25.20            +0.5       25.71        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      5.12            -2.0        3.16 ±  2%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      6.24            -1.9        4.31 ±  2%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
     58.27            -0.8       57.44        perf-profile.children.cycles-pp.do_syscall_64
     58.78            -0.8       57.96        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.05            +0.0        0.06 ±  6%  perf-profile.children.cycles-pp.sem_wait@plt
      1.16 ±  2%      +0.0        1.20 ±  2%  perf-profile.children.cycles-pp.__pick_eevdf
      2.45            +0.2        2.68 ±  2%  perf-profile.children.cycles-pp.rseq_set_ids_get_csaddr
      9.54            +0.3        9.80        perf-profile.children.cycles-pp.pick_next_task_fair
      9.73            +0.3       10.00        perf-profile.children.cycles-pp.__pick_next_task
     18.26            +0.4       18.66        perf-profile.children.cycles-pp.__schedule
     18.39            +0.4       18.80        perf-profile.children.cycles-pp.schedule
     25.22            +0.5       25.73        perf-profile.children.cycles-pp.__x64_sys_sched_yield
      0.58            -0.1        0.48 ±  3%  perf-profile.self.cycles-pp.__rseq_handle_notify_resume
      0.96            +0.0        0.98        perf-profile.self.cycles-pp.update_curr
      2.43            +0.2        2.67 ±  2%  perf-profile.self.cycles-pp.rseq_set_ids_get_csaddr
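The headline 2.3% figure follows the percent-change convention of the comparison table, (after - before) / before, applied to the stress-ng rows above; a minimal check using the reported sem_timedwait and sem_wait values:

```python
# %change convention used in the comparison table:
#   %change = (after - before) / before * 100
def pct_change(before: float, after: float) -> float:
    return (after - before) / before * 100.0

# (before, after) values taken from the stress-ng rows above
# (9f6ffd4ceb -> e2d4f42271).
rows = {
    "sem_timedwait_calls_per_sec": (385459, 394214),
    "sem_wait_calls_per_sec": (385418, 394172),
}

for name, (before, after) in rows.items():
    print(f"{name}: {pct_change(before, after):+.1f}%")  # both rows: +2.3%
```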




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

