Message-ID: <202511071439.d081322d-lkp@intel.com>
Date: Fri, 7 Nov 2025 15:01:08 +0800
From: kernel test robot <oliver.sang@...el.com>
To: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Doug Smythies <dsmythies@...us.net>, All applicable <stable@...r.kernel.org>,
Christian Loehle <christian.loehle@....com>, <linux-pm@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: [linus:master] [cpuidle] db86f55bf8: lmbench3.PIPE.latency.us 11.5% improvement
Hello,
kernel test robot noticed an 11.5% improvement of lmbench3.PIPE.latency.us on:
commit: db86f55bf81a3a297be05ee8775ae9a8c6e3a599 ("cpuidle: governors: menu: Select polling state in some more cases")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: lmbench3
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:
test_memory_size: 50%
nr_threads: 20%
mode: development
test: PIPE
cpufreq_governor: performance
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 13.4% improvement |
| test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters | cpufreq_governor=performance |
| | test=context_switch1 |
+------------------+---------------------------------------------------------------------------------------------+
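Both improvements come from workloads dominated by very short sleep/wake cycles (a pipe ping-pong and a context-switch ping-pong). For context, the commit changes the conditions under which the menu cpuidle governor falls back to the polling state instead of a hardware C-state. The sketch below is a simplified, standalone illustration of that trade-off, not the kernel implementation; the state fields loosely mirror struct cpuidle_state, and the numbers in main() are made up for the example:

        /*
         * Simplified, standalone illustration (not the kernel code): when the
         * predicted idle time is shorter than the shallowest hardware C-state
         * is worth, busy-polling wins because wakeup is immediate and there is
         * no entry/exit latency to pay.
         */
        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        struct idle_state {
                const char *name;
                uint64_t target_residency_ns;   /* minimum idle time that pays off */
                uint64_t exit_latency_ns;       /* wakeup cost of the state */
        };

        /*
         * Decide whether the polling state is preferable to entering the
         * shallowest real C-state, given the predicted idle duration and the
         * PM QoS latency limit.  The upstream menu governor applies this kind
         * of check in more situations after the commit; the condition here is
         * condensed for illustration only.
         */
        static bool prefer_polling(uint64_t predicted_ns, uint64_t latency_req_ns,
                                   const struct idle_state *shallowest)
        {
                if (shallowest->exit_latency_ns > latency_req_ns)
                        return true;            /* no C-state fits the latency budget */
                return predicted_ns < shallowest->target_residency_ns;
        }

        int main(void)
        {
                const struct idle_state c1 = { "C1", 2000, 2000 };

                /* A pipe ping-pong partner typically answers within a few us. */
                printf("predicted  1500 ns -> %s\n",
                       prefer_polling(1500, UINT64_MAX, &c1) ? "poll" : c1.name);
                printf("predicted 50000 ns -> %s\n",
                       prefer_polling(50000, UINT64_MAX, &c1) ? "poll" : c1.name);
                return 0;
        }

With a predicted idle time of around 1.5 us, even a ~2 us C1 target residency is not met, so spinning until the next pipe wakeup avoids the C-state exit latency entirely; that is the regime both benchmarks below exercise.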
Details are as follows:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251107/202511071439.d081322d-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
gcc-14/performance/x86_64-rhel-9.4/development/20%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/PIPE/50%/lmbench3
commit:
v6.18-rc3
db86f55bf8 ("cpuidle: governors: menu: Select polling state in some more cases")
v6.18-rc3 db86f55bf81a3a297be05ee8775
---------------- ---------------------------
%stddev %change %stddev
\ | \
2.984e+08 ± 3% +13.0% 3.373e+08 ± 2% cpuidle..usage
3870548 ± 3% +8.9% 4215418 ± 3% vmstat.system.cs
5.11 -11.5% 4.52 lmbench3.PIPE.latency.us
2.949e+08 ± 3% +13.0% 3.334e+08 ± 2% lmbench3.time.voluntary_context_switches
1474808 ± 2% +13.7% 1676175 ± 2% sched_debug.cpu.nr_switches.avg
908098 ± 4% +11.5% 1012241 ± 5% sched_debug.cpu.nr_switches.stddev
35.76 ± 6% +70.7% 61.02 ± 36% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
438.60 ± 6% -42.6% 251.80 ± 28% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
35.75 ± 6% +70.7% 61.01 ± 36% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
6.834e+09 ± 2% +8.8% 7.438e+09 ± 2% perf-stat.i.branch-instructions
4003246 ± 3% +9.0% 4365130 ± 4% perf-stat.i.context-switches
14211 ± 7% +30.7% 18567 ± 6% perf-stat.i.cycles-between-cache-misses
3.305e+10 +7.8% 3.563e+10 ± 2% perf-stat.i.instructions
17.81 ± 3% +9.1% 19.42 ± 4% perf-stat.i.metric.K/sec
6.738e+09 ± 2% +8.8% 7.328e+09 ± 2% perf-stat.ps.branch-instructions
3917194 ± 3% +9.0% 4267905 ± 4% perf-stat.ps.context-switches
3.257e+10 +7.8% 3.51e+10 ± 2% perf-stat.ps.instructions
4.907e+12 ± 5% +11.7% 5.481e+12 ± 3% perf-stat.total.instructions
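As a reference for what the headline metric measures: lmbench3's PIPE latency is the time for a small token to make a round trip between two processes over pipes, so every iteration is a sleep/wake pair on each side. A minimal sketch of that kind of measurement follows; the real lat_pipe harness handles warm-up, timing and reporting differently, and ROUNDS here is arbitrary:

        /*
         * Minimal pipe ping-pong latency sketch, similar in spirit to the
         * lmbench3 PIPE test: a parent and a child bounce one byte back and
         * forth over a pair of pipes and the average round trip is reported.
         */
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <unistd.h>
        #include <sys/wait.h>

        #define ROUNDS 100000

        static double now_us(void)
        {
                struct timespec ts;
                clock_gettime(CLOCK_MONOTONIC, &ts);
                return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
        }

        int main(void)
        {
                int p2c[2], c2p[2];     /* parent->child and child->parent pipes */
                char token = 'x';

                if (pipe(p2c) || pipe(c2p)) {
                        perror("pipe");
                        return 1;
                }

                pid_t pid = fork();
                if (pid < 0) {
                        perror("fork");
                        return 1;
                }
                if (pid == 0) {         /* child: echo the token back until EOF */
                        close(p2c[1]);
                        close(c2p[0]);
                        while (read(p2c[0], &token, 1) == 1)
                                if (write(c2p[1], &token, 1) != 1)
                                        break;
                        _exit(0);
                }

                double start = now_us();
                for (int i = 0; i < ROUNDS; i++) {
                        if (write(p2c[1], &token, 1) != 1 ||    /* wake the child ...   */
                            read(c2p[0], &token, 1) != 1) {     /* ... sleep until reply */
                                perror("ping-pong");
                                break;
                        }
                }
                double elapsed = now_us() - start;

                printf("pipe round-trip latency: %.2f us\n", elapsed / ROUNDS);

                close(p2c[1]);          /* EOF lets the child exit */
                wait(NULL);
                return 0;
        }

Each read() blocks and the CPU idles for roughly a microsecond per hop, which is exactly where the choice between polling and entering C1/C1E shows up in the measured latency above.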
***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
gcc-14/performance/x86_64-rhel-9.4/debian-13-x86_64-20250902.cgz/lkp-spr-2sp4/context_switch1/will-it-scale
commit:
v6.18-rc3
db86f55bf8 ("cpuidle: governors: menu: Select polling state in some more cases")
v6.18-rc3 db86f55bf81a3a297be05ee8775
---------------- ---------------------------
%stddev %change %stddev
\ | \
8360618 ± 14% +37.6% 11502592 ± 9% meminfo.DirectMap2M
0.52 ± 4% +0.2 0.68 ± 2% mpstat.cpu.all.irq%
0.07 ± 2% +0.0 0.08 ± 2% mpstat.cpu.all.soft%
24410019 +11.1% 27123411 sched_debug.cpu.nr_switches.avg
56464105 ± 3% +17.9% 66574674 ± 5% sched_debug.cpu.nr_switches.max
473411 +2.6% 485915 proc-vmstat.nr_active_anon
1205948 +1.0% 1218407 proc-vmstat.nr_file_pages
292446 +4.3% 304940 proc-vmstat.nr_shmem
473411 +2.6% 485915 proc-vmstat.nr_zone_active_anon
4.03 ± 3% -2.5 1.49 ± 21% turbostat.C1%
4.83 ± 9% -1.0 3.86 ± 13% turbostat.C1E%
1.087e+08 ± 3% +59.1% 1.729e+08 ± 3% turbostat.IRQ
0.03 ± 13% +2.0 2.03 turbostat.POLL%
4.745e+10 +8.5% 5.147e+10 perf-stat.i.branch-instructions
0.94 -0.0 0.90 ± 3% perf-stat.i.branch-miss-rate%
2.567e+08 +9.1% 2.8e+08 perf-stat.i.branch-misses
48551481 +8.1% 52493759 perf-stat.i.context-switches
2.76 -4.8% 2.63 ± 2% perf-stat.i.cpi
593.92 +5.6% 627.34 perf-stat.i.cpu-migrations
2.367e+11 +8.2% 2.561e+11 perf-stat.i.instructions
0.62 +9.8% 0.68 perf-stat.i.ipc
216.75 +8.1% 234.36 perf-stat.i.metric.K/sec
1.85 -7.0% 1.73 perf-stat.overall.cpi
0.54 +7.5% 0.58 perf-stat.overall.ipc
204618 +2.0% 208750 perf-stat.overall.path-length
4.674e+10 +8.4% 5.066e+10 perf-stat.ps.branch-instructions
2.532e+08 +9.0% 2.76e+08 perf-stat.ps.branch-misses
47800355 +8.1% 51652366 perf-stat.ps.context-switches
591.27 +5.5% 623.50 perf-stat.ps.cpu-migrations
2.332e+11 +8.1% 2.521e+11 perf-stat.ps.instructions
7.417e+13 +8.1% 8.019e+13 perf-stat.total.instructions
534510 ± 9% +24.8% 666973 ± 8% will-it-scale.1.linear
471854 ± 2% +26.9% 598959 ± 13% will-it-scale.1.processes
534510 ± 9% +24.8% 666973 ± 8% will-it-scale.1.threads
59865194 ± 9% +24.8% 74700994 ± 8% will-it-scale.112.linear
52446572 ± 3% +17.2% 61467049 will-it-scale.112.processes
52.12 ± 2% -11.2% 46.28 will-it-scale.112.processes_idle
89797792 ± 9% +24.8% 1.121e+08 ± 8% will-it-scale.168.linear
90159107 +5.3% 94951409 will-it-scale.168.processes
23.53 -8.4% 21.56 will-it-scale.168.processes_idle
1.197e+08 ± 9% +24.8% 1.494e+08 ± 8% will-it-scale.224.linear
29932597 ± 9% +24.8% 37350497 ± 8% will-it-scale.56.linear
24228097 ± 2% +22.6% 29712218 ± 2% will-it-scale.56.processes
80.80 -3.6% 77.89 will-it-scale.56.processes_idle
495994 +13.5% 562996 ± 2% will-it-scale.per_process_ops
211008 ± 5% +13.4% 239359 ± 4% will-it-scale.per_thread_ops
5748 +1.7% 5848 will-it-scale.time.percent_of_cpu_this_job_got
16823 +1.5% 17082 will-it-scale.time.system_time
1410 +4.2% 1469 will-it-scale.time.user_time
3.625e+08 +6.0% 3.842e+08 will-it-scale.workload
6.75 ± 9% -3.4 3.33 ± 4% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
8.01 ± 8% -1.7 6.26 ± 6% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
7.89 ± 8% -1.7 6.14 ± 6% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
10.30 ± 6% -1.7 8.60 ± 4% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
4.36 ± 2% -0.4 3.99 ± 5% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.59 ± 2% -0.4 3.23 ± 5% perf-profile.calltrace.cycles-pp.anon_pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.91 ± 2% -0.4 3.55 ± 5% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.75 ± 2% -0.4 2.39 ± 6% perf-profile.calltrace.cycles-pp.__wake_up_sync_key.anon_pipe_write.vfs_write.ksys_write.do_syscall_64
2.50 ± 2% -0.4 2.14 ± 6% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.anon_pipe_write.vfs_write
2.44 ± 2% -0.4 2.09 ± 6% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.anon_pipe_write
2.56 ± 2% -0.4 2.21 ± 6% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_sync_key.anon_pipe_write.vfs_write.ksys_write
1.73 ± 2% -0.3 1.39 ± 8% perf-profile.calltrace.cycles-pp.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
1.48 ± 2% -0.3 1.14 ± 10% perf-profile.calltrace.cycles-pp.__smp_call_single_queue.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function.__wake_up_common
1.35 ± 2% -0.3 1.02 ± 11% perf-profile.calltrace.cycles-pp.call_function_single_prep_ipi.__smp_call_single_queue.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function
61.64 +1.2 62.81 perf-profile.calltrace.cycles-pp.__schedule.schedule.anon_pipe_read.vfs_read.ksys_read
62.24 +1.2 63.45 perf-profile.calltrace.cycles-pp.schedule.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.00 +1.7 1.72 ± 23% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
6.82 ± 9% -3.5 3.36 ± 4% perf-profile.children.cycles-pp.intel_idle
8.10 ± 8% -1.8 6.34 ± 6% perf-profile.children.cycles-pp.cpuidle_enter
8.04 ± 8% -1.8 6.28 ± 6% perf-profile.children.cycles-pp.cpuidle_enter_state
10.45 ± 6% -1.7 8.72 ± 4% perf-profile.children.cycles-pp.cpuidle_idle_call
4.41 ± 2% -0.4 4.04 ± 5% perf-profile.children.cycles-pp.ksys_write
2.76 ± 2% -0.4 2.40 ± 5% perf-profile.children.cycles-pp.__wake_up_sync_key
3.63 ± 2% -0.4 3.27 ± 5% perf-profile.children.cycles-pp.anon_pipe_write
2.51 ± 2% -0.4 2.15 ± 6% perf-profile.children.cycles-pp.autoremove_wake_function
2.47 ± 2% -0.4 2.11 ± 6% perf-profile.children.cycles-pp.try_to_wake_up
3.94 ± 2% -0.4 3.58 ± 5% perf-profile.children.cycles-pp.vfs_write
2.57 ± 2% -0.4 2.22 ± 6% perf-profile.children.cycles-pp.__wake_up_common
1.74 ± 2% -0.3 1.40 ± 8% perf-profile.children.cycles-pp.ttwu_queue_wakelist
1.49 ± 2% -0.3 1.15 ± 10% perf-profile.children.cycles-pp.__smp_call_single_queue
1.36 ± 3% -0.3 1.03 ± 11% perf-profile.children.cycles-pp.call_function_single_prep_ipi
0.22 ± 2% +0.0 0.25 ± 8% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.23 ± 4% +0.0 0.28 ± 3% perf-profile.children.cycles-pp.local_clock_noinstr
62.26 +1.2 63.47 perf-profile.children.cycles-pp.schedule
66.06 +1.4 67.43 perf-profile.children.cycles-pp.__schedule
0.00 +1.8 1.75 ± 23% perf-profile.children.cycles-pp.poll_idle
6.82 ± 9% -3.5 3.36 ± 4% perf-profile.self.cycles-pp.intel_idle
1.35 ± 3% -0.3 1.02 ± 11% perf-profile.self.cycles-pp.call_function_single_prep_ipi
0.26 ± 4% -0.1 0.16 ± 4% perf-profile.self.cycles-pp.flush_smp_call_function_queue
0.24 ± 3% -0.0 0.20 ± 3% perf-profile.self.cycles-pp.set_next_entity
0.05 +0.0 0.06 perf-profile.self.cycles-pp.local_clock_noinstr
0.00 +1.7 1.66 ± 23% perf-profile.self.cycles-pp.poll_idle
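The profile shift above (cycles moving out of intel_idle and into poll_idle, plus turbostat's POLL% rising from 0.03% to 2.03%) is consistent with the governor now choosing the polling state for these short sleeps. As a rough, self-contained illustration of why that shortens the wake path, a polling idle loop in the style of poll_idle() just spins until a wakeup is pending or a time cap expires; the flag, pause intrinsic and limit below stand in for need_resched(), cpu_relax() and the kernel's polling cap, and do not reproduce drivers/cpuidle/poll_state.c:

        /*
         * Rough sketch of a polling idle state: spin until a wakeup is pending
         * or a time limit expires, instead of entering a hardware C-state.
         * x86-only because of the pause intrinsic standing in for cpu_relax().
         */
        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdint.h>
        #include <time.h>

        static atomic_bool wakeup_pending;      /* stand-in for need_resched() */

        static uint64_t now_ns(void)
        {
                struct timespec ts;
                clock_gettime(CLOCK_MONOTONIC, &ts);
                return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
        }

        /* Returns true if we left because of a wakeup, false on timeout. */
        static bool poll_idle_sketch(uint64_t limit_ns)
        {
                uint64_t start = now_ns();
                unsigned int loops = 0;

                while (!atomic_load_explicit(&wakeup_pending, memory_order_relaxed)) {
                        __builtin_ia32_pause();         /* cpu_relax() equivalent */

                        /* Check the clock only occasionally; reading it is not free. */
                        if (++loops % 200 == 0 && now_ns() - start > limit_ns)
                                return false;   /* give up, fall back to a C-state */
                }
                return true;    /* woken with no C-state exit latency */
        }

        int main(void)
        {
                /* No wakeup ever arrives here, so the ~50 us cap is hit. */
                return poll_idle_sketch(50000) ? 0 : 1;
        }

A task woken while the CPU is in such a state never pays a hardware C-state exit latency, which matches the lower pipe latency and higher context-switch rate reported above.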
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki