Message-ID: <202511251654.c89d08f7-lkp@intel.com>
Date: Tue, 25 Nov 2025 16:27:52 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<x86@...nel.org>, Ingo Molnar <mingo@...nel.org>, Peter Zijlstra
<peterz@...radead.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
<oliver.sang@...el.com>
Subject: [tip:core/rseq] [rseq] e2d4f42271:
stress-ng.sem.sem_wait_calls_per_sec 2.3% improvement
Hello,
kernel test robot noticed a 2.3% improvement of stress-ng.sem.sem_wait_calls_per_sec on:
commit: e2d4f42271155045a49b89530f2c06ad8e9f1a1e ("rseq: Rework the TIF_NOTIFY handler")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git core/rseq
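For context on what the reworked path services: when a task returns to user space with TIF_NOTIFY_RESUME set, __rseq_handle_notify_resume() refreshes the struct rseq area the thread has registered with the kernel. The sketch below only illustrates that user-facing ABI, not the commit under test; it assumes a thread whose rseq area has not already been registered by glibc (glibc 2.35 and later register one themselves, in which case the syscall fails with EBUSY), and RSEQ_SIG is simply the signature value conventionally used by librseq/glibc.

  #define _GNU_SOURCE
  #include <linux/rseq.h>         /* UAPI struct rseq */
  #include <sys/syscall.h>
  #include <unistd.h>
  #include <stdio.h>

  #define RSEQ_SIG 0x53053053     /* conventional librseq/glibc signature */

  /* UAPI header already declares the required 32-byte alignment */
  static __thread struct rseq rs;

  int main(void)
  {
          /* Register rs for this thread; returns EBUSY if glibc has
           * already registered its own area for us. */
          if (syscall(__NR_rseq, &rs, sizeof(rs), 0, RSEQ_SIG)) {
                  perror("rseq");
                  return 1;
          }

          /* From here on the kernel keeps rs.cpu_id current for this
           * thread; that update happens on the TIF_NOTIFY_RESUME exit
           * path profiled as __rseq_handle_notify_resume below. */
          printf("running on cpu %u\n", rs.cpu_id);
          return 0;
  }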
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: sem
cpufreq_governor: performance
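For readers less familiar with the metric: stress-ng's sem stressor exercises a POSIX semaphore from its workers, and sem_wait_calls_per_sec / sem_timedwait_calls_per_sec report how many of those calls complete per second. The fragment below is only a rough, single-threaded sketch of that pattern, not stress-ng's implementation; the reproduce materials linked below carry the exact job file.

  /* build: cc -O2 -pthread sem_sketch.c */
  #include <semaphore.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <time.h>

  int main(void)
  {
          sem_t sem;
          uint64_t calls = 0;
          struct timespec start, now;

          sem_init(&sem, 0, 1);           /* process-private, initial value 1 */
          clock_gettime(CLOCK_MONOTONIC, &start);

          do {
                  sem_wait(&sem);         /* the call the metric counts   */
                  sem_post(&sem);         /* release for next iteration   */
                  calls++;
                  clock_gettime(CLOCK_MONOTONIC, &now);
          } while (now.tv_sec - start.tv_sec < 60);   /* testtime: 60s */

          printf("sem_wait calls/sec: %llu\n",
                 (unsigned long long)(calls / 60));
          return 0;
  }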
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251125/202511251654.c89d08f7-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sem/stress-ng/60s
commit:
9f6ffd4ceb ("rseq: Separate the signal delivery path")
e2d4f42271 ("rseq: Rework the TIF_NOTIFY handler")
9f6ffd4cebda8684            e2d4f42271155045a49b89530f2
----------------            ---------------------------
 value ± %stddev   %change   value ± %stddev   metric
796.49 ± 5% +15.3% 918.13 ± 12% sched_debug.cpu.curr->pid.stddev
19498 ± 3% +18.9% 23177 ± 5% numa-meminfo.node0.KernelStack
22517 ± 2% -15.5% 19034 ± 7% numa-meminfo.node1.KernelStack
41569 ± 58% +192.9% 121756 ± 44% numa-numastat.node0.other_node
156472 ± 15% -51.3% 76228 ± 70% numa-numastat.node1.other_node
17079533 ± 7% -23.7% 13033056 ± 45% perf-sched.total_wait_and_delay.count.ms
17079533 ± 7% -23.7% 13033056 ± 45% perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
385459 +2.3% 394214 stress-ng.sem.sem_timedwait_calls_per_sec
385418 +2.3% 394172 stress-ng.sem.sem_wait_calls_per_sec
3947 +2.2% 4035 stress-ng.time.user_time
19505 ± 3% +18.8% 23176 ± 5% numa-vmstat.node0.nr_kernel_stack
41569 ± 58% +192.9% 121756 ± 44% numa-vmstat.node0.numa_other
22534 ± 2% -15.5% 19034 ± 7% numa-vmstat.node1.nr_kernel_stack
156472 ± 15% -51.3% 76228 ± 70% numa-vmstat.node1.numa_other
1.51e+11 +1.9% 1.538e+11 perf-stat.i.branch-instructions
0.88 -1.9% 0.87 perf-stat.i.cpi
81367 ± 10% +58.9% 129281 ± 30% perf-stat.i.cycles-between-cache-misses
6.983e+11 +1.7% 7.105e+11 perf-stat.i.instructions
1.15 +1.7% 1.17 perf-stat.i.ipc
0.87 -1.7% 0.85 perf-stat.overall.cpi
1.16 +1.8% 1.18 perf-stat.overall.ipc
1.484e+11 +1.9% 1.513e+11 perf-stat.ps.branch-instructions
6.864e+11 +1.8% 6.986e+11 perf-stat.ps.instructions
4.197e+13 +2.1% 4.284e+13 perf-stat.total.instructions
5.09 -2.0 3.14 ± 2% perf-profile.calltrace.cycles-pp.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
6.22 -1.9 4.29 ± 2% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
58.07 -0.9 57.21 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
58.59 -0.9 57.74 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
2.45 +0.2 2.67 ± 2% perf-profile.calltrace.cycles-pp.rseq_set_ids_get_csaddr.__rseq_handle_notify_resume.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
9.50 +0.3 9.76 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
9.71 +0.3 9.98 perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
18.18 +0.4 18.57 perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
18.37 +0.4 18.77 perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
25.20 +0.5 25.71 perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
5.12 -2.0 3.16 ± 2% perf-profile.children.cycles-pp.__rseq_handle_notify_resume
6.24 -1.9 4.31 ± 2% perf-profile.children.cycles-pp.exit_to_user_mode_loop
58.27 -0.8 57.44 perf-profile.children.cycles-pp.do_syscall_64
58.78 -0.8 57.96 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.05 +0.0 0.06 ± 6% perf-profile.children.cycles-pp.sem_wait@plt
1.16 ± 2% +0.0 1.20 ± 2% perf-profile.children.cycles-pp.__pick_eevdf
2.45 +0.2 2.68 ± 2% perf-profile.children.cycles-pp.rseq_set_ids_get_csaddr
9.54 +0.3 9.80 perf-profile.children.cycles-pp.pick_next_task_fair
9.73 +0.3 10.00 perf-profile.children.cycles-pp.__pick_next_task
18.26 +0.4 18.66 perf-profile.children.cycles-pp.__schedule
18.39 +0.4 18.80 perf-profile.children.cycles-pp.schedule
25.22 +0.5 25.73 perf-profile.children.cycles-pp.__x64_sys_sched_yield
0.58 -0.1 0.48 ± 3% perf-profile.self.cycles-pp.__rseq_handle_notify_resume
0.96 +0.0 0.98 perf-profile.self.cycles-pp.update_curr
2.43 +0.2 2.67 ± 2% perf-profile.self.cycles-pp.rseq_set_ids_get_csaddr
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki