Message-ID: <202512092342.3ee2de77-lkp@intel.com>
Date: Tue, 9 Dec 2025 23:41:37 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, <oliver.sang@...el.com>
Subject: [linus:master] [rseq]  abc850e761:
 stress-ng.sem.sem_wait_calls_per_sec 3.1% improvement



Hello,

kernel test robot noticed a 3.1% improvement of stress-ng.sem.sem_wait_calls_per_sec on:


commit: abc850e7616c91ebaa3f5ba3617ab0a104d45039 ("rseq: Provide and use rseq_update_user_cs()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sem
	cpufreq_governor: performance
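
As a rough sketch of what these parameters imply: nr_threads: 100% on the
192-thread test machine corresponds to 192 workers. The helper below
(build_stress_ng_cmd is a hypothetical name, not part of LKP) only assembles
a comparable stress-ng command line. The exact invocation used by the test
harness is not shown in this report, so the flag selection is an assumption.

```python
def build_stress_ng_cmd(cpu_threads, nr_threads_pct, test, testtime):
    """Assemble a stress-ng argument list (hypothetical helper, not LKP's own code)."""
    # nr_threads is given as a percentage of the machine's hardware threads.
    workers = max(1, cpu_threads * nr_threads_pct // 100)
    # --<stressor> N starts N workers; --timeout bounds the run time.
    return ["stress-ng", f"--{test}", str(workers),
            "--timeout", testtime, "--metrics-brief"]


cmd = build_stress_ng_cmd(192, 100, "sem", "60s")
print(" ".join(cmd))
```

The cpufreq governor is a system setting rather than a stress-ng option, so
it is not part of the command line sketched here.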



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251209/202512092342.3ee2de77-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sem/stress-ng/60s

commit: 
  9c37cb6e80 ("rseq: Provide static branch for runtime debugging")
  abc850e761 ("rseq: Provide and use rseq_update_user_cs()")

9c37cb6e80b8fcdd abc850e7616c91ebaa3f5ba3617 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    713480 ± 29%     -24.9%     536114 ± 28%  meminfo.Mapped
  19261235 ± 14%     -28.3%   13815751 ± 45%  perf-sched.total_wait_and_delay.count.ms
  19261235 ± 14%     -28.3%   13815751 ± 45%  perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
    209285 ±  4%      -3.8%     201417 ±  3%  proc-vmstat.nr_anon_pages
    179839 ± 29%     -25.3%     134393 ± 28%  proc-vmstat.nr_mapped
      0.21            +0.0        0.22        perf-stat.i.branch-miss-rate%
 3.044e+08            +3.6%  3.154e+08        perf-stat.i.branch-misses
 1.933e+08            +3.5%  2.001e+08        perf-stat.i.context-switches
      0.93 ±  3%      +6.7%       0.99        perf-stat.i.metric.M/sec
      0.20            +0.0        0.21        perf-stat.overall.branch-miss-rate%
 2.996e+08            +3.6%  3.104e+08        perf-stat.ps.branch-misses
 1.903e+08            +3.5%   1.97e+08        perf-stat.ps.context-switches
 1.341e+10            +2.6%  1.377e+10        stress-ng.sem.ops
 2.235e+08            +2.6%  2.294e+08        stress-ng.sem.ops_per_sec
    374680            +3.1%     386364        stress-ng.sem.sem_timedwait_calls_per_sec
    374638            +3.2%     386525        stress-ng.sem.sem_trywait_calls_per_sec
    374649            +3.1%     386331        stress-ng.sem.sem_wait_calls_per_sec
 1.178e+10            +3.5%  1.219e+10        stress-ng.time.involuntary_context_switches
      7623            -1.2%       7530        stress-ng.time.system_time
      3803            +2.8%       3908        stress-ng.time.user_time
      8.44            -2.8        5.65        perf-profile.calltrace.cycles-pp.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      9.62            -2.6        7.02        perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     59.80            -1.0       58.80        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     60.34            -1.0       59.38        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.83            +0.0        0.86        perf-profile.calltrace.cycles-pp._raw_spin_lock.raw_spin_rq_lock_nested.__schedule.schedule.__x64_sys_sched_yield
      0.65            +0.0        0.68        perf-profile.calltrace.cycles-pp.__update_load_avg_se.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task
      1.00            +0.0        1.03        perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.11            +0.0        1.15        perf-profile.calltrace.cycles-pp.__enqueue_entity.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.56            +0.1        1.62        perf-profile.calltrace.cycles-pp.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
      1.73            +0.1        1.79        perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair
      1.77            +0.1        1.84        perf-profile.calltrace.cycles-pp.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield
      2.13            +0.1        2.20        perf-profile.calltrace.cycles-pp.__rdgsbase_inactive.__sched_yield
      1.84            +0.1        1.90        perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield
      1.95            +0.1        2.02        perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
      2.06            +0.1        2.14        perf-profile.calltrace.cycles-pp.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64
      2.38            +0.1        2.46        perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
      2.98            +0.1        3.08        perf-profile.calltrace.cycles-pp.__wrgsbase_inactive.__sched_yield
      3.26            +0.1        3.38        perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
      5.15            +0.2        5.32        perf-profile.calltrace.cycles-pp.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.45            +0.2        6.66        perf-profile.calltrace.cycles-pp.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      3.28            +0.2        3.50        perf-profile.calltrace.cycles-pp.rseq_update_cpu_node_id.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.21            +0.3        9.50        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
      9.41            +0.3        9.70        perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
     17.41            +0.5       17.88        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
     17.59            +0.5       18.07        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     16.02            +0.6       16.60        perf-profile.calltrace.cycles-pp.os_xsave.__sched_yield
     24.16            +0.7       24.86        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     20.98            +0.7       21.71        perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.42            +0.8       24.26        perf-profile.calltrace.cycles-pp.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
     25.22            +0.9       26.11        perf-profile.calltrace.cycles-pp.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      8.52            -2.6        5.88        perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      9.64            -2.6        7.04        perf-profile.children.cycles-pp.exit_to_user_mode_loop
     59.97            -1.0       58.99        perf-profile.children.cycles-pp.do_syscall_64
     60.46            -1.0       59.50        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.propagate_entity_load_avg
      0.12 ±  4%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.clock_gettime
      1.12            +0.0        1.16        perf-profile.children.cycles-pp.__enqueue_entity
      1.14            +0.0        1.18        perf-profile.children.cycles-pp.__update_load_avg_se
      1.97            +0.1        2.04        perf-profile.children.cycles-pp.set_next_entity
      2.38            +0.1        2.46        perf-profile.children.cycles-pp.prepare_task_switch
      2.41            +0.1        2.49        perf-profile.children.cycles-pp.__rdgsbase_inactive
      2.67            +0.1        2.77        perf-profile.children.cycles-pp.update_load_avg
      3.26            +0.1        3.37        perf-profile.children.cycles-pp.__wrgsbase_inactive
      3.30            +0.1        3.41        perf-profile.children.cycles-pp.put_prev_entity
      5.17            +0.2        5.34        perf-profile.children.cycles-pp.yield_task_fair
      6.49            +0.2        6.70        perf-profile.children.cycles-pp.do_sched_yield
      3.45            +0.2        3.68        perf-profile.children.cycles-pp.rseq_update_cpu_node_id
      9.24            +0.3        9.54        perf-profile.children.cycles-pp.pick_next_task_fair
      9.43            +0.3        9.73        perf-profile.children.cycles-pp.__pick_next_task
     17.48            +0.5       17.96        perf-profile.children.cycles-pp.__schedule
     17.61            +0.5       18.09        perf-profile.children.cycles-pp.schedule
     16.04            +0.6       16.61        perf-profile.children.cycles-pp.os_xsave
     24.18            +0.7       24.88        perf-profile.children.cycles-pp.__x64_sys_sched_yield
     21.01            +0.7       21.75        perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
     23.44            +0.8       24.28        perf-profile.children.cycles-pp.switch_fpu_return
     25.24            +0.9       26.15        perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
      0.67            +0.0        0.70        perf-profile.self.cycles-pp.___perf_sw_event
      0.92            +0.0        0.95        perf-profile.self.cycles-pp.update_curr
      1.11            +0.0        1.15        perf-profile.self.cycles-pp.__enqueue_entity
      0.80            +0.0        0.84        perf-profile.self.cycles-pp.update_load_avg
      0.70            +0.0        0.73        perf-profile.self.cycles-pp.pick_next_task_fair
      1.04            +0.0        1.08        perf-profile.self.cycles-pp.exit_to_user_mode_loop
      1.12            +0.1        1.17        perf-profile.self.cycles-pp.__update_load_avg_se
      1.78            +0.1        1.84        perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
      1.48            +0.1        1.54        perf-profile.self.cycles-pp.prepare_task_switch
      0.69            +0.1        0.75        perf-profile.self.cycles-pp.do_syscall_64
      2.40            +0.1        2.48        perf-profile.self.cycles-pp.__rdgsbase_inactive
      3.12            +0.1        3.23        perf-profile.self.cycles-pp.__wrgsbase_inactive
     16.02            +0.6       16.61        perf-profile.self.cycles-pp.os_xsave
     21.00            +0.7       21.74        perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
      0.37            +2.0        2.40        perf-profile.self.cycles-pp.__rseq_handle_notify_resume
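
The %change column above mixes two conventions: rows shown with a percent
sign are relative changes against the base commit, while percentage-valued
metrics such as the perf-profile rows are shown without a percent sign and
read as absolute differences in percentage points. A minimal sketch that
recomputes one row of each kind, using values copied from the table (the
function names are illustrative only):

```python
def relative_change(base, new):
    """Relative change in percent, as used for counter-style metrics."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, as used for metrics that are already percentages."""
    return new - base


# stress-ng.sem.sem_wait_calls_per_sec: 374649 -> 386331
sem_wait = relative_change(374649, 386331)
# perf-profile.children.cycles-pp.__rseq_handle_notify_resume: 8.52 -> 5.88
rseq_children = point_delta(8.52, 5.88)

print(f"sem_wait_calls_per_sec: {sem_wait:+.1f}%")                         # +3.1%
print(f"__rseq_handle_notify_resume (children): {rseq_children:+.1f}")    # -2.6
```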




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

