[<prev] [next>] [day] [month] [year] [list]
Message-ID: <202504241314.fe89a536-lkp@intel.com>
Date: Thu, 24 Apr 2025 13:49:23 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Christian Loehle <christian.loehle@....com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>, <linux-pm@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: [linus:master] [cpuidle] 38f83090f5: fsmark.files_per_sec 5.1%
regression
Hello,
back in last Oct, we reported
"[linux-next:master] [cpuidle] 38f83090f5: fsmark.app_overhead 51.9% regression"
(https://lore.kernel.org/all/202410072214.11d18a3c-oliver.sang@intel.com/)
but there is no obvious fsmark.files_per_sec difference at that time.
now on a different platform and with different fsmark parameters, we notice
a small regression of fsmark.files_per_sec. but no obvious fsmark.app_overhead
difference this time (so does not show in below detail table).
last Oct report seems cause some confusion. so for this one, we try to rebuild
kernels and run more times to confirm the configs are same for parent/38f83090f5
and data is stable.
just FYI what we observed in our tests.
kernel test robot noticed a 5.1% regression of fsmark.files_per_sec on:
commit: 38f83090f515b4b5d59382dfada1e7457f19aa47 ("cpuidle: menu: Remove iowait influence")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master 6fea5fabd3323cd27b2ab5143263f37ff29550cb]
[test failed on linux-next/master bc8aa6cdadcc00862f2b5720e5de2e17f696a081]
testcase: fsmark
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:
iterations: 8
disk: 1SSD
nr_threads: 4
fs: btrfs
filesize: 9B
test_size: 16G
sync_method: fsyncBeforeClose
nr_directories: 16d
nr_files_per_directory: 256fpd
cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202504241314.fe89a536-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250424/202504241314.fe89a536-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
gcc-12/performance/1SSD/9B/btrfs/8/x86_64-rhel-9.4/16d/256fpd/4/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-spr-2sp4/16G/fsmark
commit:
v6.12-rc1
38f83090f5 ("cpuidle: menu: Remove iowait influence")
v6.12-rc1 38f83090f515b4b5d59382dfada
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.12 ± 4% +0.0 0.14 ± 2% mpstat.cpu.all.iowait%
940771 +2.8% 967141 proc-vmstat.pgfault
120077 -5.7% 113267 vmstat.system.cs
72197 ± 2% -14.2% 61965 ± 2% vmstat.system.in
20496 -5.1% 19456 fsmark.files_per_sec
219.58 +5.0% 230.50 fsmark.time.elapsed_time
219.58 +5.0% 230.50 fsmark.time.elapsed_time.max
0.02 ± 9% +22.5% 0.03 ± 10% perf-sched.wait_and_delay.avg.ms.wait_log_commit.btrfs_sync_log.btrfs_sync_file.do_fsync
15676 ± 9% -25.2% 11732 ± 14% perf-sched.wait_and_delay.count.btrfs_sync_log.btrfs_sync_file.do_fsync.__x64_sys_fsync
130219 ± 7% -14.8% 110942 ± 8% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
0.02 ± 10% +27.8% 0.02 ± 12% perf-sched.wait_time.avg.ms.wait_log_commit.btrfs_sync_log.btrfs_sync_file.do_fsync
1.457e+09 -4.4% 1.393e+09 perf-stat.i.branch-instructions
14767792 -4.6% 14093766 perf-stat.i.branch-misses
71872875 -3.4% 69406875 perf-stat.i.cache-references
121769 -5.8% 114744 perf-stat.i.context-switches
8.574e+09 -7.2% 7.96e+09 perf-stat.i.cpu-cycles
7.979e+09 -3.6% 7.691e+09 perf-stat.i.instructions
3773 -2.0% 3697 perf-stat.i.minor-faults
3773 -2.0% 3697 perf-stat.i.page-faults
1.45e+09 -4.4% 1.386e+09 perf-stat.ps.branch-instructions
14670086 -4.5% 14003599 perf-stat.ps.branch-misses
71521482 -3.4% 69080547 perf-stat.ps.cache-references
121170 -5.7% 114204 perf-stat.ps.context-switches
8.537e+09 -7.2% 7.925e+09 perf-stat.ps.cpu-cycles
7.938e+09 -3.6% 7.654e+09 perf-stat.ps.instructions
3717 -1.9% 3645 perf-stat.ps.minor-faults
3717 -1.9% 3645 perf-stat.ps.page-faults
9.65 ± 14% -9.4 0.22 ±123% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
18.61 ± 6% -4.7 13.95 ± 6% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
18.05 ± 7% -4.5 13.51 ± 7% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
18.73 ± 7% -4.5 14.22 ± 7% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
22.60 ± 4% -4.5 18.11 ± 3% perf-profile.calltrace.cycles-pp.common_startup_64
21.55 ± 5% -4.5 17.08 ± 4% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
21.57 ± 5% -4.5 17.11 ± 4% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
21.58 ± 5% -4.5 17.12 ± 4% perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
2.39 ± 6% -0.6 1.78 ± 7% perf-profile.calltrace.cycles-pp.btrfs_clone_write_end_io.blk_mq_end_request_batch.nvme_irq.__handle_irq_event_percpu.handle_irq_event
2.32 ± 7% -0.6 1.72 ± 7% perf-profile.calltrace.cycles-pp.btrfs_orig_write_end_io.btrfs_clone_write_end_io.blk_mq_end_request_batch.nvme_irq.__handle_irq_event_percpu
2.21 ± 7% -0.6 1.64 ± 7% perf-profile.calltrace.cycles-pp.end_bbio_meta_write.btrfs_orig_write_end_io.btrfs_clone_write_end_io.blk_mq_end_request_batch.nvme_irq
0.52 ± 27% +0.1 0.63 ± 6% perf-profile.calltrace.cycles-pp.__blk_flush_plug.blk_finish_plug.btrfs_sync_log.btrfs_sync_file.do_fsync
0.52 ± 27% +0.1 0.63 ± 6% perf-profile.calltrace.cycles-pp.blk_finish_plug.btrfs_sync_log.btrfs_sync_file.do_fsync.__x64_sys_fsync
0.95 ± 6% +0.2 1.10 ± 7% perf-profile.calltrace.cycles-pp.__btrfs_wait_marked_extents.btrfs_wait_tree_log_extents.btrfs_sync_log.btrfs_sync_file.do_fsync
1.04 ± 6% +0.2 1.21 ± 7% perf-profile.calltrace.cycles-pp.btrfs_wait_tree_log_extents.btrfs_sync_log.btrfs_sync_file.do_fsync.__x64_sys_fsync
0.51 ± 27% +0.2 0.68 ± 5% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
0.36 ± 70% +0.3 0.62 ± 6% perf-profile.calltrace.cycles-pp.btrfs_free_tree_block.btrfs_force_cow_block.btrfs_cow_block.btrfs_search_slot.btrfs_insert_empty_items
0.36 ± 70% +0.3 0.65 ± 5% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
2.20 ± 8% +0.6 2.77 ± 11% perf-profile.calltrace.cycles-pp.copy_extent_buffer_full.btrfs_force_cow_block.btrfs_cow_block.btrfs_search_slot.btrfs_insert_empty_items
0.00 +0.8 0.76 ± 10% perf-profile.calltrace.cycles-pp.folio_end_writeback.end_bbio_meta_write.btrfs_orig_write_end_io.blk_mq_end_request_batch.nvme_irq
0.00 +0.8 0.85 ± 9% perf-profile.calltrace.cycles-pp.end_bbio_meta_write.btrfs_orig_write_end_io.blk_mq_end_request_batch.nvme_irq.__handle_irq_event_percpu
0.00 +0.9 0.89 ± 9% perf-profile.calltrace.cycles-pp.btrfs_orig_write_end_io.blk_mq_end_request_batch.nvme_irq.__handle_irq_event_percpu.handle_irq_event
7.54 ± 5% +1.1 8.69 ± 4% perf-profile.calltrace.cycles-pp.btrfs_force_cow_block.btrfs_cow_block.btrfs_search_slot.btrfs_insert_empty_items.copy_items
7.56 ± 5% +1.1 8.71 ± 4% perf-profile.calltrace.cycles-pp.btrfs_cow_block.btrfs_search_slot.btrfs_insert_empty_items.copy_items.copy_inode_items_to_log
3.57 ± 5% +1.3 4.87 ± 7% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
1.99 ± 13% +2.8 4.76 ± 10% perf-profile.calltrace.cycles-pp.handle_edge_irq.__sysvec_posted_msi_notification.sysvec_posted_msi_notification.asm_sysvec_posted_msi_notification.cpuidle_enter_state
2.25 ± 14% +2.9 5.16 ± 11% perf-profile.calltrace.cycles-pp.__sysvec_posted_msi_notification.sysvec_posted_msi_notification.asm_sysvec_posted_msi_notification.cpuidle_enter_state.cpuidle_enter
2.35 ± 14% +3.0 5.32 ± 11% perf-profile.calltrace.cycles-pp.sysvec_posted_msi_notification.asm_sysvec_posted_msi_notification.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
2.51 ± 15% +3.1 5.60 ± 11% perf-profile.calltrace.cycles-pp.asm_sysvec_posted_msi_notification.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
50.39 ± 2% +4.4 54.84 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
50.38 ± 2% +4.4 54.82 ± 2% perf-profile.calltrace.cycles-pp.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
50.35 ± 2% +4.4 54.80 ± 2% perf-profile.calltrace.cycles-pp.btrfs_sync_file.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
50.84 ± 2% +4.5 55.30 perf-profile.calltrace.cycles-pp.fsync
50.57 ± 2% +4.5 55.03 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fsync
50.57 ± 2% +4.5 55.03 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fsync
9.92 ± 13% -9.4 0.52 ± 12% perf-profile.children.cycles-pp.poll_idle
18.84 ± 7% -4.6 14.26 ± 6% perf-profile.children.cycles-pp.cpuidle_enter_state
18.85 ± 7% -4.6 14.27 ± 6% perf-profile.children.cycles-pp.cpuidle_enter
19.59 ± 6% -4.5 15.04 ± 6% perf-profile.children.cycles-pp.cpuidle_idle_call
22.60 ± 4% -4.5 18.11 ± 3% perf-profile.children.cycles-pp.common_startup_64
22.60 ± 4% -4.5 18.11 ± 3% perf-profile.children.cycles-pp.cpu_startup_entry
22.58 ± 4% -4.5 18.09 ± 3% perf-profile.children.cycles-pp.do_idle
21.58 ± 5% -4.5 17.12 ± 4% perf-profile.children.cycles-pp.start_secondary
2.46 ± 6% -0.6 1.86 ± 7% perf-profile.children.cycles-pp.btrfs_clone_write_end_io
0.12 ± 13% -0.1 0.06 ± 15% perf-profile.children.cycles-pp.local_clock_noinstr
0.22 ± 8% +0.1 0.28 ± 6% perf-profile.children.cycles-pp.__xa_set_mark
0.57 ± 5% +0.1 0.64 ± 5% perf-profile.children.cycles-pp.blk_finish_plug
0.59 ± 5% +0.1 0.67 ± 5% perf-profile.children.cycles-pp.__blk_flush_plug
0.77 ± 5% +0.1 0.87 ± 4% perf-profile.children.cycles-pp.__folio_start_writeback
0.60 ± 6% +0.1 0.70 ± 6% perf-profile.children.cycles-pp.pin_down_extent
0.75 ± 6% +0.1 0.88 ± 7% perf-profile.children.cycles-pp.btrfs_free_tree_block
0.96 ± 5% +0.1 1.11 ± 7% perf-profile.children.cycles-pp.__btrfs_wait_marked_extents
1.11 ± 6% +0.2 1.29 ± 6% perf-profile.children.cycles-pp.set_extent_bit
1.21 ± 5% +0.2 1.38 ± 6% perf-profile.children.cycles-pp.__set_extent_bit
0.83 ± 5% +0.2 1.01 ± 6% perf-profile.children.cycles-pp.__folio_mark_dirty
1.25 ± 6% +0.2 1.47 ± 6% perf-profile.children.cycles-pp.__folio_end_writeback
2.28 ± 4% +0.3 2.56 ± 4% perf-profile.children.cycles-pp.__write_extent_buffer
3.81 ± 5% +0.5 4.34 ± 6% perf-profile.children.cycles-pp.btrfs_alloc_tree_block
3.73 ± 6% +0.7 4.42 ± 8% perf-profile.children.cycles-pp.copy_extent_buffer_full
5.21 ± 4% +0.8 5.97 ± 6% perf-profile.children.cycles-pp.__memcpy
3.62 ± 4% +1.3 4.92 ± 7% perf-profile.children.cycles-pp.intel_idle
10.48 ± 4% +1.5 11.94 ± 4% perf-profile.children.cycles-pp.btrfs_force_cow_block
10.51 ± 4% +1.5 11.97 ± 4% perf-profile.children.cycles-pp.btrfs_cow_block
50.40 ± 2% +4.4 54.84 ± 2% perf-profile.children.cycles-pp.__x64_sys_fsync
50.38 ± 2% +4.4 54.83 ± 2% perf-profile.children.cycles-pp.do_fsync
50.36 ± 2% +4.4 54.81 ± 2% perf-profile.children.cycles-pp.btrfs_sync_file
50.86 ± 2% +4.5 55.32 perf-profile.children.cycles-pp.fsync
69.02 ± 2% +4.6 73.66 perf-profile.children.cycles-pp.do_syscall_64
69.05 ± 2% +4.6 73.69 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
6.48 ± 13% -6.0 0.48 ± 12% perf-profile.self.cycles-pp.poll_idle
0.32 ± 8% +0.1 0.39 ± 11% perf-profile.self.cycles-pp.folio_mark_accessed
5.10 ± 4% +0.7 5.84 ± 6% perf-profile.self.cycles-pp.__memcpy
3.62 ± 4% +1.3 4.92 ± 7% perf-profile.self.cycles-pp.intel_idle
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Powered by blists - more mailing lists