Message-ID: <202512092342.3ee2de77-lkp@intel.com>
Date: Tue, 9 Dec 2025 23:41:37 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, <oliver.sang@...el.com>
Subject: [linus:master] [rseq] abc850e761:
stress-ng.sem.sem_wait_calls_per_sec 3.1% improvement
Hello,
kernel test robot noticed a 3.1% improvement of stress-ng.sem.sem_wait_calls_per_sec on:
commit: abc850e7616c91ebaa3f5ba3617ab0a104d45039 ("rseq: Provide and use rseq_update_user_cs()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sem
cpufreq_governor: performance
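Translated into a direct invocation, these parameters correspond roughly to the following; the exact stressor flags are an assumption on my part, and the LKP job materials linked below are authoritative:

```shell
# Approximate equivalent of the job above: one sem stressor worker per
# CPU (nr_threads: 100%), running for 60s, reporting per-call rates.
stress-ng --sem "$(nproc)" --timeout 60s --metrics-brief
```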
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251209/202512092342.3ee2de77-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sem/stress-ng/60s
commit:
9c37cb6e80 ("rseq: Provide static branch for runtime debugging")
abc850e761 ("rseq: Provide and use rseq_update_user_cs()")
9c37cb6e80b8fcdd abc850e7616c91ebaa3f5ba3617
---------------- ---------------------------
%stddev %change %stddev
\ | \
713480 ± 29% -24.9% 536114 ± 28% meminfo.Mapped
19261235 ± 14% -28.3% 13815751 ± 45% perf-sched.total_wait_and_delay.count.ms
19261235 ± 14% -28.3% 13815751 ± 45% perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
209285 ± 4% -3.8% 201417 ± 3% proc-vmstat.nr_anon_pages
179839 ± 29% -25.3% 134393 ± 28% proc-vmstat.nr_mapped
0.21 +0.0 0.22 perf-stat.i.branch-miss-rate%
3.044e+08 +3.6% 3.154e+08 perf-stat.i.branch-misses
1.933e+08 +3.5% 2.001e+08 perf-stat.i.context-switches
0.93 ± 3% +6.7% 0.99 perf-stat.i.metric.M/sec
0.20 +0.0 0.21 perf-stat.overall.branch-miss-rate%
2.996e+08 +3.6% 3.104e+08 perf-stat.ps.branch-misses
1.903e+08 +3.5% 1.97e+08 perf-stat.ps.context-switches
1.341e+10 +2.6% 1.377e+10 stress-ng.sem.ops
2.235e+08 +2.6% 2.294e+08 stress-ng.sem.ops_per_sec
374680 +3.1% 386364 stress-ng.sem.sem_timedwait_calls_per_sec
374638 +3.2% 386525 stress-ng.sem.sem_trywait_calls_per_sec
374649 +3.1% 386331 stress-ng.sem.sem_wait_calls_per_sec
1.178e+10 +3.5% 1.219e+10 stress-ng.time.involuntary_context_switches
7623 -1.2% 7530 stress-ng.time.system_time
3803 +2.8% 3908 stress-ng.time.user_time
8.44 -2.8 5.65 perf-profile.calltrace.cycles-pp.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
9.62 -2.6 7.02 perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
59.80 -1.0 58.80 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
60.34 -1.0 59.38 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
0.83 +0.0 0.86 perf-profile.calltrace.cycles-pp._raw_spin_lock.raw_spin_rq_lock_nested.__schedule.schedule.__x64_sys_sched_yield
0.65 +0.0 0.68 perf-profile.calltrace.cycles-pp.__update_load_avg_se.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task
1.00 +0.0 1.03 perf-profile.calltrace.cycles-pp.update_load_avg.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule
1.11 +0.0 1.15 perf-profile.calltrace.cycles-pp.__enqueue_entity.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
1.56 +0.1 1.62 perf-profile.calltrace.cycles-pp.update_load_avg.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule
1.73 +0.1 1.79 perf-profile.calltrace.cycles-pp.native_sched_clock.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair
1.77 +0.1 1.84 perf-profile.calltrace.cycles-pp.sched_clock.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield
2.13 +0.1 2.20 perf-profile.calltrace.cycles-pp.__rdgsbase_inactive.__sched_yield
1.84 +0.1 1.90 perf-profile.calltrace.cycles-pp.sched_clock_cpu.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield
1.95 +0.1 2.02 perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
2.06 +0.1 2.14 perf-profile.calltrace.cycles-pp.update_rq_clock.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64
2.38 +0.1 2.46 perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
2.98 +0.1 3.08 perf-profile.calltrace.cycles-pp.__wrgsbase_inactive.__sched_yield
3.26 +0.1 3.38 perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__pick_next_task.__schedule.schedule
5.15 +0.2 5.32 perf-profile.calltrace.cycles-pp.yield_task_fair.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.45 +0.2 6.66 perf-profile.calltrace.cycles-pp.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
3.28 +0.2 3.50 perf-profile.calltrace.cycles-pp.rseq_update_cpu_node_id.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.21 +0.3 9.50 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
9.41 +0.3 9.70 perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
17.41 +0.5 17.88 perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
17.59 +0.5 18.07 perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
16.02 +0.6 16.60 perf-profile.calltrace.cycles-pp.os_xsave.__sched_yield
24.16 +0.7 24.86 perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
20.98 +0.7 21.71 perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe
23.42 +0.8 24.26 perf-profile.calltrace.cycles-pp.switch_fpu_return.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
25.22 +0.9 26.11 perf-profile.calltrace.cycles-pp.arch_exit_to_user_mode_prepare.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
8.52 -2.6 5.88 perf-profile.children.cycles-pp.__rseq_handle_notify_resume
9.64 -2.6 7.04 perf-profile.children.cycles-pp.exit_to_user_mode_loop
59.97 -1.0 58.99 perf-profile.children.cycles-pp.do_syscall_64
60.46 -1.0 59.50 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.09 +0.0 0.10 perf-profile.children.cycles-pp.propagate_entity_load_avg
0.12 ± 4% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.clock_gettime
1.12 +0.0 1.16 perf-profile.children.cycles-pp.__enqueue_entity
1.14 +0.0 1.18 perf-profile.children.cycles-pp.__update_load_avg_se
1.97 +0.1 2.04 perf-profile.children.cycles-pp.set_next_entity
2.38 +0.1 2.46 perf-profile.children.cycles-pp.prepare_task_switch
2.41 +0.1 2.49 perf-profile.children.cycles-pp.__rdgsbase_inactive
2.67 +0.1 2.77 perf-profile.children.cycles-pp.update_load_avg
3.26 +0.1 3.37 perf-profile.children.cycles-pp.__wrgsbase_inactive
3.30 +0.1 3.41 perf-profile.children.cycles-pp.put_prev_entity
5.17 +0.2 5.34 perf-profile.children.cycles-pp.yield_task_fair
6.49 +0.2 6.70 perf-profile.children.cycles-pp.do_sched_yield
3.45 +0.2 3.68 perf-profile.children.cycles-pp.rseq_update_cpu_node_id
9.24 +0.3 9.54 perf-profile.children.cycles-pp.pick_next_task_fair
9.43 +0.3 9.73 perf-profile.children.cycles-pp.__pick_next_task
17.48 +0.5 17.96 perf-profile.children.cycles-pp.__schedule
17.61 +0.5 18.09 perf-profile.children.cycles-pp.schedule
16.04 +0.6 16.61 perf-profile.children.cycles-pp.os_xsave
24.18 +0.7 24.88 perf-profile.children.cycles-pp.__x64_sys_sched_yield
21.01 +0.7 21.75 perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
23.44 +0.8 24.28 perf-profile.children.cycles-pp.switch_fpu_return
25.24 +0.9 26.15 perf-profile.children.cycles-pp.arch_exit_to_user_mode_prepare
0.67 +0.0 0.70 perf-profile.self.cycles-pp.___perf_sw_event
0.92 +0.0 0.95 perf-profile.self.cycles-pp.update_curr
1.11 +0.0 1.15 perf-profile.self.cycles-pp.__enqueue_entity
0.80 +0.0 0.84 perf-profile.self.cycles-pp.update_load_avg
0.70 +0.0 0.73 perf-profile.self.cycles-pp.pick_next_task_fair
1.04 +0.0 1.08 perf-profile.self.cycles-pp.exit_to_user_mode_loop
1.12 +0.1 1.17 perf-profile.self.cycles-pp.__update_load_avg_se
1.78 +0.1 1.84 perf-profile.self.cycles-pp.arch_exit_to_user_mode_prepare
1.48 +0.1 1.54 perf-profile.self.cycles-pp.prepare_task_switch
0.69 +0.1 0.75 perf-profile.self.cycles-pp.do_syscall_64
2.40 +0.1 2.48 perf-profile.self.cycles-pp.__rdgsbase_inactive
3.12 +0.1 3.23 perf-profile.self.cycles-pp.__wrgsbase_inactive
16.02 +0.6 16.61 perf-profile.self.cycles-pp.os_xsave
21.00 +0.7 21.74 perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
0.37 +2.0 2.40 perf-profile.self.cycles-pp.__rseq_handle_notify_resume
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki