Message-ID: <202501072048.6f381b3a-lkp@intel.com>
Date: Tue, 7 Jan 2025 21:01:27 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, <aubrey.li@...ux.intel.com>,
<yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [tip:sched/core] [sched/fair] 61b82dfb6b: aim7.jobs-per-min 2.1% improvement
Hello,
The kernel test robot noticed a 2.1% improvement in aim7.jobs-per-min on:
commit: 61b82dfb6b7e1f951fd1e95198a2aee2ccf6a167 ("sched/fair: Do not try to migrate delayed dequeue task")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
testcase: aim7
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:
disk: 1BRD_48G
fs: xfs
test: sync_disk_rw
load: 600
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250107/202501072048.6f381b3a-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
gcc-12/performance/1BRD_48G/xfs/x86_64-rhel-9.4/600/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/sync_disk_rw/aim7
commit:
736c55a02c ("sched/fair: Rename cfs_rq.nr_running into nr_queued")
61b82dfb6b ("sched/fair: Do not try to migrate delayed dequeue task")
736c55a02c477ad3 61b82dfb6b7e1f951fd1e95198a
---------------- ---------------------------
value [± %stddev]  %change  value [± %stddev]  metric
0.05 ± 31% +0.2 0.20 ±140% mpstat.cpu.all.iowait%
21.83 ± 18% -51.5% 10.58 ± 79% sched_debug.cfs_rq:/.load_avg.min
38.33 +1.9% 39.07 iostat.cpu.idle
60.71 -1.5% 59.80 iostat.cpu.system
47551 +2.1% 48556 aim7.jobs-per-min
7398 -2.0% 7250 aim7.time.percent_of_cpu_this_job_got
5583 -4.1% 5357 aim7.time.system_time
34860959 -1.5% 34349280 aim7.time.voluntary_context_switches
1.91 +2.6% 1.96 perf-stat.i.MPKI
1.319e+08 +2.1% 1.347e+08 perf-stat.i.cache-misses
837578 +1.7% 851427 perf-stat.i.context-switches
2.132e+11 -1.6% 2.099e+11 perf-stat.i.cpu-cycles
177061 -3.6% 170730 perf-stat.i.cpu-migrations
0.45 ± 2% +2.4% 0.46 perf-stat.i.ipc
2.13 +2.8% 2.19 perf-stat.overall.MPKI
1618 -3.6% 1560 perf-stat.overall.cycles-between-cache-misses
1.301e+08 +2.2% 1.331e+08 perf-stat.ps.cache-misses
826542 +1.7% 840923 perf-stat.ps.context-switches
2.106e+11 -1.4% 2.077e+11 perf-stat.ps.cpu-cycles
174640 -3.5% 168471 perf-stat.ps.cpu-migrations
4.691e+12 -2.2% 4.587e+12 perf-stat.total.instructions
0.27 ± 6% -15.9% 0.22 ± 7% perf-sched.sch_delay.avg.ms.__cond_resched.down_write.xfs_ilock_for_iomap.xfs_buffered_write_iomap_begin.iomap_iter
0.14 ± 76% -99.8% 0.00 ±223% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open
0.08 ± 2% -14.7% 0.07 ± 2% perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
0.08 ± 4% -14.3% 0.07 ± 3% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.__flush_workqueue.xlog_cil_push_now.isra
0.01 ± 52% -52.7% 0.01 ± 26% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.16 ± 85% -99.8% 0.00 ±223% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open
1.62 ± 45% -53.6% 0.75 ± 14% perf-sched.sch_delay.max.ms.__cond_resched.writeback_get_folio.writeback_iter.iomap_writepages.xfs_vm_writepages
0.02 ± 15% +130.5% 0.04 ± 58% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
0.21 ±105% -88.3% 0.02 ± 64% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
0.07 ± 4% -12.0% 0.06 ± 2% perf-sched.total_sch_delay.average.ms
750.06 ± 6% -11.1% 666.52 ± 18% perf-sched.wait_and_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
168.85 ±141% +205.4% 515.61 perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
31.35 ± 7% -10.6% 28.01 perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
832.00 ± 17% +20.6% 1003 ± 4% perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
2113 ± 10% +15.7% 2445 ± 2% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
75.84 ± 18% -14.9% 64.53 ± 4% perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.__flush_workqueue.xlog_cil_push_now.isra
168.85 ±141% +205.4% 515.61 perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
1.15 ± 9% -33.4% 0.77 ± 24% perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
0.96 ± 17% -25.9% 0.71 ± 7% perf-sched.wait_time.avg.ms.__cond_resched.writeback_get_folio.writeback_iter.iomap_writepages.xfs_vm_writepages
168.85 ±141% +205.4% 515.61 perf-sched.wait_time.avg.ms.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
31.35 ± 7% -10.6% 28.01 perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
9.88 ±118% -76.9% 2.28 ± 55% perf-sched.wait_time.max.ms.__cond_resched.writeback_get_folio.writeback_iter.iomap_writepages.xfs_vm_writepages
168.85 ±141% +205.4% 515.61 perf-sched.wait_time.max.ms.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
226.05 ±153% -70.8% 65.98 ± 4% perf-sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.__flush_workqueue
333.54 ±141% -100.0% 0.02 ± 64% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
70.43 -1.1 69.31 perf-profile.calltrace.cycles-pp.__mutex_lock.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq
68.32 -1.1 67.21 perf-profile.calltrace.cycles-pp.osq_lock.__mutex_lock.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq
74.65 -1.0 73.63 perf-profile.calltrace.cycles-pp.__flush_workqueue.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync
81.36 -1.0 80.34 perf-profile.calltrace.cycles-pp.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write
75.60 -1.0 74.61 perf-profile.calltrace.cycles-pp.xlog_cil_push_now.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write
77.63 -1.0 76.66 perf-profile.calltrace.cycles-pp.xlog_cil_force_seq.xfs_log_force_seq.xfs_file_fsync.xfs_file_buffered_write.vfs_write
90.80 -0.4 90.38 perf-profile.calltrace.cycles-pp.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64
93.24 -0.3 92.92 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
93.52 -0.3 93.20 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
93.27 -0.3 92.95 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
93.52 -0.3 93.21 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
93.12 -0.3 92.81 perf-profile.calltrace.cycles-pp.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
93.75 -0.3 93.44 perf-profile.calltrace.cycles-pp.write
0.52 +0.0 0.56 perf-profile.calltrace.cycles-pp.__sysvec_call_function_single.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_do_entry
0.64 +0.0 0.68 perf-profile.calltrace.cycles-pp.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter
0.96 +0.0 1.00 perf-profile.calltrace.cycles-pp.iomap_write_iter.iomap_file_buffered_write.xfs_file_buffered_write.vfs_write.ksys_write
1.24 +0.0 1.29 perf-profile.calltrace.cycles-pp.iomap_file_buffered_write.xfs_file_buffered_write.vfs_write.ksys_write.do_syscall_64
0.88 +0.1 0.93 perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
0.85 +0.1 0.92 ± 4% perf-profile.calltrace.cycles-pp.xfs_iomap_write_unwritten.xfs_end_ioend.xfs_end_io.process_one_work.worker_thread
1.10 ± 5% +0.1 1.17 ± 3% perf-profile.calltrace.cycles-pp.__folio_end_writeback.folio_end_writeback.iomap_finish_ioend.iomap_finish_ioends.xfs_end_ioend
1.31 ± 3% +0.1 1.40 ± 2% perf-profile.calltrace.cycles-pp.iomap_finish_ioends.xfs_end_ioend.xfs_end_io.process_one_work.worker_thread
1.29 ± 3% +0.1 1.38 ± 2% perf-profile.calltrace.cycles-pp.folio_end_writeback.iomap_finish_ioend.iomap_finish_ioends.xfs_end_ioend.xfs_end_io
1.31 ± 3% +0.1 1.40 ± 2% perf-profile.calltrace.cycles-pp.iomap_finish_ioend.iomap_finish_ioends.xfs_end_ioend.xfs_end_io.process_one_work
1.54 +0.1 1.63 perf-profile.calltrace.cycles-pp.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
1.55 +0.1 1.64 perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
1.60 +0.1 1.70 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
1.62 +0.1 1.71 perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
2.06 +0.1 2.16 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
1.94 +0.1 2.05 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_do_entry.acpi_idle_enter.cpuidle_enter_state
0.42 ± 44% +0.1 0.54 perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function_single.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt
2.46 +0.1 2.58 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
2.46 +0.1 2.59 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
2.46 +0.1 2.59 perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
2.49 +0.1 2.62 perf-profile.calltrace.cycles-pp.common_startup_64
2.18 ± 2% +0.2 2.33 ± 2% perf-profile.calltrace.cycles-pp.xfs_end_io.process_one_work.worker_thread.kthread.ret_from_fork
2.17 ± 2% +0.2 2.32 ± 2% perf-profile.calltrace.cycles-pp.xfs_end_ioend.xfs_end_io.process_one_work.worker_thread.kthread
2.92 +0.2 3.08 ± 2% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
3.22 +0.2 3.38 perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
3.23 +0.2 3.40 perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
3.23 +0.2 3.40 perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
3.23 +0.2 3.40 perf-profile.calltrace.cycles-pp.ret_from_fork_asm
3.58 ± 5% +0.2 3.83 ± 3% perf-profile.calltrace.cycles-pp.__folio_end_writeback.folio_end_writeback.iomap_finish_ioend.__submit_bio.__submit_bio_noacct
3.71 ± 5% +0.3 3.97 ± 3% perf-profile.calltrace.cycles-pp.folio_end_writeback.iomap_finish_ioend.__submit_bio.__submit_bio_noacct.iomap_submit_ioend
3.74 ± 5% +0.3 4.00 ± 3% perf-profile.calltrace.cycles-pp.iomap_finish_ioend.__submit_bio.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages
4.46 ± 4% +0.3 4.75 ± 2% perf-profile.calltrace.cycles-pp.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc
4.41 ± 4% +0.3 4.70 ± 2% perf-profile.calltrace.cycles-pp.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages.do_writepages
4.40 ± 4% +0.3 4.69 ± 2% perf-profile.calltrace.cycles-pp.__submit_bio.__submit_bio_noacct.iomap_submit_ioend.iomap_writepages.xfs_vm_writepages
8.96 ± 5% +0.6 9.55 ± 3% perf-profile.calltrace.cycles-pp.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync
9.01 ± 5% +0.6 9.60 ± 3% perf-profile.calltrace.cycles-pp.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write
8.92 ± 5% +0.6 9.51 ± 3% perf-profile.calltrace.cycles-pp.iomap_writepages.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range
9.02 ± 5% +0.6 9.61 ± 3% perf-profile.calltrace.cycles-pp.__filemap_fdatawrite_range.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write.vfs_write
8.94 ± 5% +0.6 9.53 ± 3% perf-profile.calltrace.cycles-pp.xfs_vm_writepages.do_writepages.filemap_fdatawrite_wbc.__filemap_fdatawrite_range.file_write_and_wait_range
9.28 ± 5% +0.6 9.88 ± 3% perf-profile.calltrace.cycles-pp.file_write_and_wait_range.xfs_file_fsync.xfs_file_buffered_write.vfs_write.ksys_write
70.43 -1.1 69.31 perf-profile.children.cycles-pp.__mutex_lock
68.34 -1.1 67.23 perf-profile.children.cycles-pp.osq_lock
74.66 -1.0 73.63 perf-profile.children.cycles-pp.__flush_workqueue
81.36 -1.0 80.34 perf-profile.children.cycles-pp.xfs_log_force_seq
75.60 -1.0 74.61 perf-profile.children.cycles-pp.xlog_cil_push_now
77.64 -1.0 76.66 perf-profile.children.cycles-pp.xlog_cil_force_seq
90.80 -0.4 90.38 perf-profile.children.cycles-pp.xfs_file_fsync
93.30 -0.3 92.98 perf-profile.children.cycles-pp.ksys_write
93.27 -0.3 92.95 perf-profile.children.cycles-pp.vfs_write
93.13 -0.3 92.81 perf-profile.children.cycles-pp.xfs_file_buffered_write
93.82 -0.3 93.51 perf-profile.children.cycles-pp.write
93.81 -0.3 93.51 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
93.78 -0.3 93.49 perf-profile.children.cycles-pp.do_syscall_64
0.20 +0.0 0.21 perf-profile.children.cycles-pp.writeback_iter
0.31 +0.0 0.33 ± 2% perf-profile.children.cycles-pp.iomap_write_begin
0.69 +0.0 0.71 perf-profile.children.cycles-pp.ttwu_do_activate
0.35 +0.0 0.37 perf-profile.children.cycles-pp.iomap_write_end
0.55 +0.0 0.58 perf-profile.children.cycles-pp.sched_ttwu_pending
0.64 +0.0 0.66 perf-profile.children.cycles-pp.__flush_smp_call_function_queue
0.65 +0.0 0.68 perf-profile.children.cycles-pp.__sysvec_call_function_single
0.79 +0.0 0.82 perf-profile.children.cycles-pp.sysvec_call_function_single
0.96 +0.0 1.00 perf-profile.children.cycles-pp.iomap_write_iter
1.24 +0.0 1.29 perf-profile.children.cycles-pp.iomap_file_buffered_write
1.18 ± 2% +0.1 1.23 ± 2% perf-profile.children.cycles-pp.__xfs_trans_commit
1.48 +0.1 1.54 perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.85 +0.1 0.92 ± 4% perf-profile.children.cycles-pp.xfs_iomap_write_unwritten
1.31 ± 3% +0.1 1.40 ± 2% perf-profile.children.cycles-pp.iomap_finish_ioends
1.57 +0.1 1.66 perf-profile.children.cycles-pp.acpi_idle_enter
1.57 +0.1 1.66 perf-profile.children.cycles-pp.acpi_safe_halt
1.63 +0.1 1.72 perf-profile.children.cycles-pp.cpuidle_enter_state
1.64 +0.1 1.73 perf-profile.children.cycles-pp.cpuidle_enter
1.57 +0.1 1.66 perf-profile.children.cycles-pp.acpi_idle_do_entry
2.08 +0.1 2.19 perf-profile.children.cycles-pp.cpuidle_idle_call
2.49 +0.1 2.62 perf-profile.children.cycles-pp.do_idle
2.46 +0.1 2.59 perf-profile.children.cycles-pp.start_secondary
2.49 +0.1 2.62 perf-profile.children.cycles-pp.common_startup_64
2.49 +0.1 2.62 perf-profile.children.cycles-pp.cpu_startup_entry
2.17 ± 2% +0.2 2.32 ± 2% perf-profile.children.cycles-pp.xfs_end_ioend
2.18 ± 2% +0.2 2.34 ± 2% perf-profile.children.cycles-pp.xfs_end_io
2.92 +0.2 3.08 ± 2% perf-profile.children.cycles-pp.process_one_work
3.24 +0.2 3.40 perf-profile.children.cycles-pp.ret_from_fork
3.24 +0.2 3.40 perf-profile.children.cycles-pp.ret_from_fork_asm
3.23 +0.2 3.40 perf-profile.children.cycles-pp.kthread
3.22 +0.2 3.39 perf-profile.children.cycles-pp.worker_thread
4.46 ± 4% +0.3 4.75 ± 2% perf-profile.children.cycles-pp.iomap_submit_ioend
4.46 ± 4% +0.3 4.75 ± 2% perf-profile.children.cycles-pp.__submit_bio_noacct
4.46 ± 4% +0.3 4.74 ± 2% perf-profile.children.cycles-pp.__submit_bio
4.68 ± 5% +0.3 5.01 ± 3% perf-profile.children.cycles-pp.__folio_end_writeback
5.00 ± 5% +0.3 5.34 ± 3% perf-profile.children.cycles-pp.folio_end_writeback
5.05 ± 5% +0.3 5.40 ± 3% perf-profile.children.cycles-pp.iomap_finish_ioend
8.96 ± 5% +0.6 9.55 ± 3% perf-profile.children.cycles-pp.do_writepages
9.01 ± 5% +0.6 9.60 ± 3% perf-profile.children.cycles-pp.filemap_fdatawrite_wbc
9.02 ± 5% +0.6 9.61 ± 3% perf-profile.children.cycles-pp.__filemap_fdatawrite_range
8.94 ± 5% +0.6 9.53 ± 3% perf-profile.children.cycles-pp.xfs_vm_writepages
8.92 ± 5% +0.6 9.52 ± 3% perf-profile.children.cycles-pp.iomap_writepages
9.28 ± 5% +0.6 9.88 ± 3% perf-profile.children.cycles-pp.file_write_and_wait_range
67.44 -1.1 66.35 perf-profile.self.cycles-pp.osq_lock
0.21 +0.0 0.22 perf-profile.self.cycles-pp.iomap_set_range_uptodate
0.64 +0.0 0.66 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.75 +0.0 0.79 perf-profile.self.cycles-pp.acpi_safe_halt
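The %change column above mixes two kinds of values: rows such as aim7.jobs-per-min carry a relative change with a trailing "%", while the perf-profile rows (and mpstat.cpu.all.iowait%) carry a bare delta between two values that are already percentages. The sketch below reproduces a few rows under that reading. The helper names `relative_change` and `point_delta` are illustrative only and are not part of lkp-tests. Because the table prints rounded values, recomputing from them can differ from the reported figure in the last digit for some rows.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference between two values that are already percentages."""
    return new - base


# aim7.jobs-per-min: 47551 -> 48556 (reported as +2.1%)
print(f"aim7.jobs-per-min          {relative_change(47551, 48556):+.1f}%")

# perf-stat.i.cpu-migrations: 177061 -> 170730 (reported as -3.6%)
print(f"perf-stat.i.cpu-migrations {relative_change(177061, 170730):+.1f}%")

# perf-profile.children.cycles-pp.__mutex_lock: 70.43 -> 69.31 (reported as -1.1)
print(f"__mutex_lock cycles-pp     {point_delta(70.43, 69.31):+.1f}")
```

Read this way, the -1.1 entries for __mutex_lock and osq_lock mean about 1.1 percentage points fewer sampled cycles in that path, not a 1.1% relative drop.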
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki