Message-ID: <CAL3q7H5XJbK-mscTqKbUJwrbKNR0mCjCOHjTigNWVCFupKt=Vw@mail.gmail.com>
Date: Tue, 10 Jun 2025 11:37:10 +0100
From: Filipe Manana <fdmanana@...nel.org>
To: kernel test robot <oliver.sang@...el.com>
Cc: Filipe Manana <fdmanana@...e.com>, oe-lkp@...ts.linux.dev, lkp@...el.com,
linux-kernel@...r.kernel.org, David Sterba <dsterba@...e.com>,
Boris Burkov <boris@....io>, linux-btrfs@...r.kernel.org
Subject: Re: [linus:master] [btrfs] 5e85262e54: stress-ng.fallocate.ops_per_sec 40.7% regression
On Tue, Jun 10, 2025 at 7:05 AM kernel test robot <oliver.sang@...el.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a 40.7% regression of stress-ng.fallocate.ops_per_sec on:
>
>
> commit: 5e85262e542d6da8898bb8563a724ad98f6fc936 ("btrfs: fix fsync of files with no hard links not persisting deletion")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
This is expected. Before, an fsync of a file with 0 links did nothing;
now it logs the inode and syncs the log.

Back in 2019 I made the fsync of files with 0 links do nothing, and
you reported a +12461.6% gain for this test:

https://lore.kernel.org/all/20191027045312.GE29418@shao2-debian/

That change was correct for O_TMPFILE files, but incorrect for
non-O_TMPFILE files.

Note that even the case recently fixed by 5e85262e542d ("btrfs: fix
fsync of files with no hard links not persisting deletion") did not
work back then, because the flushing of the delayed inode was missing.

So nothing unexpected here.
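
To make the behaviour difference concrete, here is a minimal userspace
sketch of the two cases (my illustration, not code taken from the
benchmark; the file name and the use of the current directory are
arbitrary):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd, tmpfd;

	/* Case 1: a regular file that loses its last hard link. */
	fd = open("file", O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0 || write(fd, "data", 4) != 4) {
		perror("open/write");
		return 1;
	}
	unlink("file");	/* nlink drops to 0, inode still open */

	/*
	 * This fsync must persist the deletion so that "file" does not
	 * reappear after a power failure. Since the 2019 change it was
	 * a no-op on btrfs; after 5e85262e542d it logs the inode and
	 * syncs the log, which is the extra work the benchmark sees.
	 */
	if (fsync(fd) < 0)
		perror("fsync");
	close(fd);

	/*
	 * Case 2: an O_TMPFILE file. It never had a hard link and was
	 * never visible in the namespace, so there is no deletion to
	 * persist and skipping the work on fsync is correct.
	 */
	tmpfd = open(".", O_TMPFILE | O_RDWR, 0600);
	if (tmpfd >= 0) {
		if (write(tmpfd, "data", 4) < 0)
			perror("write");
		fsync(tmpfd);
		close(tmpfd);
	}
	return 0;
}

With the fix, every fsync in case 1 costs a log commit, which is
presumably where the extra system time and context switches in the
numbers below come from.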
>
> [still regression on linus/master 4cb6c8af8591135ec000fbe4bb474139ceec595d]
> [still regression on linux-next/master 3a83b350b5be4b4f6bd895eecf9a92080200ee5d]
>
> testcase: stress-ng
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> parameters:
>
> nr_threads: 100%
> disk: 1HDD
> testtime: 60s
> fs: btrfs
> test: fallocate
> cpufreq_governor: performance
>
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+------------------------------------------------------------------------------------------------+
> | testcase: change | stress-ng: stress-ng.copy-file.ops_per_sec 33.6% regression |
> | test machine | 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory |
> | test parameters | cpufreq_governor=performance |
> | | disk=1HDD |
> | | fs=btrfs |
> | | nr_threads=100% |
> | | test=copy-file |
> | | testtime=60s |
> +------------------+------------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202506101357.7ada85f6-lkp@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250610/202506101357.7ada85f6-lkp@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-12/performance/1HDD/btrfs/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp4/fallocate/stress-ng/60s
>
> commit:
> 846b534075 ("btrfs: fix typo in space info explanation")
> 5e85262e54 ("btrfs: fix fsync of files with no hard links not persisting deletion")
>
> 846b534075f45d5b 5e85262e542d6da8898bb8563a7
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 6816 ± 3% +54.4% 10526 ± 4% uptime.idle
> 8.564e+09 -8.8% 7.813e+09 cpuidle..time
> 358855 ± 2% +40.2% 503142 cpuidle..usage
> 36.53 +3.3% 37.75 boot-time.boot
> 27.16 +3.6% 28.15 boot-time.dhcp
> 4346 +3.3% 4489 boot-time.idle
> 26.69 ± 8% +173.5% 72.99 ± 8% iostat.cpu.idle
> 72.51 ± 3% -65.0% 25.36 ± 22% iostat.cpu.iowait
> 0.47 ± 4% +170.8% 1.27 ± 17% iostat.cpu.system
> 1260573 ± 8% -36.7% 798065 ± 4% numa-numastat.node0.local_node
> 1333347 ± 7% -35.6% 859241 ± 5% numa-numastat.node0.numa_hit
> 1310625 ± 8% -43.1% 745116 ± 4% numa-numastat.node1.local_node
> 1370277 ± 7% -40.4% 816368 ± 5% numa-numastat.node1.numa_hit
> 72.60 ± 10% -36.6% 46.00 ± 22% perf-c2c.DRAM.local
> 76.80 ± 11% +328.5% 329.10 ± 30% perf-c2c.DRAM.remote
> 36.50 ± 19% +864.1% 351.90 ± 29% perf-c2c.HITM.local
> 31.00 ± 15% +522.3% 192.90 ± 31% perf-c2c.HITM.remote
> 4718134 -11.1% 4195454 meminfo.Cached
> 325934 -84.9% 49200 ± 24% meminfo.Dirty
> 977794 -54.8% 442223 ± 15% meminfo.Inactive
> 977794 -54.8% 442223 ± 15% meminfo.Inactive(file)
> 114582 +16.1% 133034 meminfo.Shmem
> 414480 -53.3% 193720 ± 23% meminfo.Writeback
> 26.65 ± 8% +173.9% 72.98 ± 8% vmstat.cpu.id
> 72.54 ± 3% -64.9% 25.47 ± 23% vmstat.cpu.wa
> 107799 -37.1% 67824 vmstat.io.bo
> 117.81 -72.1% 32.92 ± 23% vmstat.procs.b
> 3650 ± 2% +78.2% 6506 vmstat.system.cs
> 6900 ± 3% +70.7% 11777 ± 7% vmstat.system.in
> 24.55 ± 9% +47.7 72.22 ± 8% mpstat.cpu.all.idle%
> 74.66 ± 3% -48.5 26.12 ± 22% mpstat.cpu.all.iowait%
> 0.02 ± 5% +0.0 0.03 ± 4% mpstat.cpu.all.irq%
> 0.02 ± 2% +0.0 0.03 ± 5% mpstat.cpu.all.soft%
> 0.41 ± 5% +0.8 1.22 ± 18% mpstat.cpu.all.sys%
> 0.34 +0.0 0.38 ± 2% mpstat.cpu.all.usr%
> 2.00 +445.0% 10.90 ± 59% mpstat.max_utilization.seconds
> 158860 ± 7% -85.3% 23387 ± 26% numa-meminfo.node0.Dirty
> 474882 ± 5% -52.9% 223821 ± 14% numa-meminfo.node0.Inactive
> 474882 ± 5% -52.9% 223821 ± 14% numa-meminfo.node0.Inactive(file)
> 201670 ± 5% -52.3% 96112 ± 20% numa-meminfo.node0.Writeback
> 166889 ± 6% -84.6% 25642 ± 28% numa-meminfo.node1.Dirty
> 502047 ± 5% -56.4% 218867 ± 17% numa-meminfo.node1.Inactive
> 502047 ± 5% -56.4% 218867 ± 17% numa-meminfo.node1.Inactive(file)
> 212352 ± 5% -54.0% 97712 ± 29% numa-meminfo.node1.Writeback
> 906.60 -43.5% 512.00 stress-ng.fallocate.ops
> 14.30 -40.7% 8.48 stress-ng.fallocate.ops_per_sec
> 65.75 -8.0% 60.48 stress-ng.time.elapsed_time
> 65.75 -8.0% 60.48 stress-ng.time.elapsed_time.max
> 14854172 -42.8% 8502297 stress-ng.time.file_system_outputs
> 750.40 ± 5% +130.1% 1727 ± 9% stress-ng.time.involuntary_context_switches
> 40.40 ± 6% +263.4% 146.80 ± 20% stress-ng.time.percent_of_cpu_this_job_got
> 26.79 ± 6% +231.8% 88.89 ± 20% stress-ng.time.system_time
> 46451 ± 4% +190.9% 135122 ± 2% stress-ng.time.voluntary_context_switches
> 197148 +2.7% 202432 proc-vmstat.nr_active_anon
> 1859510 -42.8% 1064284 proc-vmstat.nr_dirtied
> 81387 -84.9% 12287 ± 26% proc-vmstat.nr_dirty
> 1179517 -11.0% 1049203 proc-vmstat.nr_file_pages
> 244166 -54.7% 110697 ± 15% proc-vmstat.nr_inactive_file
> 43062 +1.4% 43664 proc-vmstat.nr_mapped
> 28656 +15.9% 33198 proc-vmstat.nr_shmem
> 103481 ± 2% -53.2% 48448 ± 23% proc-vmstat.nr_writeback
> 1855961 -42.8% 1062406 proc-vmstat.nr_written
> 197148 +2.7% 202432 proc-vmstat.nr_zone_active_anon
> 244166 -54.7% 110697 ± 15% proc-vmstat.nr_zone_inactive_file
> 184834 -67.1% 60733 ± 23% proc-vmstat.nr_zone_write_pending
> 2705223 -38.0% 1677261 proc-vmstat.numa_hit
> 2572798 -40.0% 1544833 proc-vmstat.numa_local
> 2754538 -37.3% 1726600 proc-vmstat.pgalloc_normal
> 2722111 -39.5% 1645913 ± 3% proc-vmstat.pgfree
> 7427330 -42.0% 4306316 proc-vmstat.pgpgout
> 908534 ± 5% -40.5% 540322 numa-vmstat.node0.nr_dirtied
> 39714 ± 7% -85.3% 5835 ± 26% numa-vmstat.node0.nr_dirty
> 118740 ± 5% -52.7% 56112 ± 14% numa-vmstat.node0.nr_inactive_file
> 50433 ± 5% -52.3% 24033 ± 20% numa-vmstat.node0.nr_writeback
> 906619 ± 5% -40.5% 539353 numa-vmstat.node0.nr_written
> 118740 ± 5% -52.7% 56112 ± 14% numa-vmstat.node0.nr_zone_inactive_file
> 90132 ± 5% -66.9% 29866 ± 21% numa-vmstat.node0.nr_zone_write_pending
> 1333077 ± 7% -35.5% 859486 ± 5% numa-vmstat.node0.numa_hit
> 1260303 ± 8% -36.7% 798310 ± 4% numa-vmstat.node0.numa_local
> 950974 ± 5% -44.9% 523966 numa-vmstat.node1.nr_dirtied
> 41787 ± 6% -84.6% 6454 ± 29% numa-vmstat.node1.nr_dirty
> 125464 ± 5% -56.2% 55001 ± 17% numa-vmstat.node1.nr_inactive_file
> 52957 ± 5% -53.9% 24413 ± 29% numa-vmstat.node1.nr_writeback
> 948799 ± 5% -44.9% 523041 numa-vmstat.node1.nr_written
> 125464 ± 5% -56.2% 55001 ± 17% numa-vmstat.node1.nr_zone_inactive_file
> 94729 ± 5% -67.4% 30867 ± 28% numa-vmstat.node1.nr_zone_write_pending
> 1369690 ± 7% -40.4% 815898 ± 5% numa-vmstat.node1.numa_hit
> 1310039 ± 8% -43.2% 744646 ± 4% numa-vmstat.node1.numa_local
> 2.21 -8.8% 2.02 ± 4% perf-stat.i.MPKI
> 6.4e+08 +15.0% 7.358e+08 ± 3% perf-stat.i.branch-instructions
> 2.82 +0.9 3.76 perf-stat.i.branch-miss-rate%
> 28372748 +15.8% 32852598 perf-stat.i.branch-misses
> 13.29 -3.9 9.40 ± 2% perf-stat.i.cache-miss-rate%
> 22292540 +32.9% 29636364 perf-stat.i.cache-references
> 3470 +84.5% 6402 perf-stat.i.context-switches
> 0.96 +106.4% 1.99 ± 4% perf-stat.i.cpi
> 3.473e+09 ± 2% +100.3% 6.957e+09 ± 14% perf-stat.i.cpu-cycles
> 176.57 +32.0% 233.00 ± 2% perf-stat.i.cpu-migrations
> 885.32 ± 3% +109.0% 1850 ± 8% perf-stat.i.cycles-between-cache-misses
> 3.127e+09 +13.2% 3.541e+09 ± 3% perf-stat.i.instructions
> 1.17 -32.0% 0.79 perf-stat.i.ipc
> 4118 +6.8% 4398 perf-stat.i.minor-faults
> 4118 +6.8% 4398 perf-stat.i.page-faults
> 0.96 -7.9% 0.88 ± 3% perf-stat.overall.MPKI
> 13.47 -2.9 10.56 ± 2% perf-stat.overall.cache-miss-rate%
> 1.11 ± 2% +77.0% 1.96 ± 11% perf-stat.overall.cpi
> 1153 ± 2% +92.8% 2225 ± 13% perf-stat.overall.cycles-between-cache-misses
> 0.90 ± 2% -42.8% 0.52 ± 11% perf-stat.overall.ipc
> 6.293e+08 +14.9% 7.231e+08 ± 3% perf-stat.ps.branch-instructions
> 27913642 +15.6% 32280223 perf-stat.ps.branch-misses
> 21928606 +32.8% 29130512 perf-stat.ps.cache-references
> 3413 +84.3% 6292 perf-stat.ps.context-switches
> 3.407e+09 ± 2% +100.9% 6.847e+09 ± 14% perf-stat.ps.cpu-cycles
> 173.70 +31.8% 229.01 ± 2% perf-stat.ps.cpu-migrations
> 3.075e+09 +13.2% 3.48e+09 ± 3% perf-stat.ps.instructions
> 4028 +6.6% 4295 perf-stat.ps.minor-faults
> 4028 +6.6% 4295 perf-stat.ps.page-faults
> 0.04 ± 56% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__clear_extent_bit.btrfs_dirty_folio
> 0.09 ± 10% -90.6% 0.01 ±299% perf-sched.sch_delay.avg.ms.__cond_resched.submit_bio_noacct.btrfs_submit_chunk.btrfs_submit_bbio.submit_one_bio
> 0.17 ± 33% -100.0% 0.00 perf-sched.sch_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 0.03 ± 53% +311.8% 0.11 ± 3% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 0.01 ±286% +1043.8% 0.11 ± 2% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 0.12 ± 15% -91.4% 0.01 ±300% perf-sched.sch_delay.max.ms.__cond_resched.extent_write_cache_pages.btrfs_writepages.do_writepages.filemap_fdatawrite_wbc
> 0.07 ± 51% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__clear_extent_bit.btrfs_dirty_folio
> 0.14 ± 24% -93.7% 0.01 ±299% perf-sched.sch_delay.max.ms.__cond_resched.submit_bio_noacct.btrfs_submit_chunk.btrfs_submit_bbio.submit_one_bio
> 0.91 ± 29% -100.0% 0.00 perf-sched.sch_delay.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 0.09 ± 46% +3267.9% 3.02 ±230% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 0.01 ±277% +6419.2% 0.65 ±123% perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 0.18 ± 11% +40.1% 0.25 ± 13% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 311.05 ± 4% -79.9% 62.61 ± 2% perf-sched.total_wait_and_delay.average.ms
> 6694 ± 3% +331.4% 28884 ± 2% perf-sched.total_wait_and_delay.count.ms
> 310.98 ± 4% -79.9% 62.51 ± 2% perf-sched.total_wait_time.average.ms
> 258.61 ± 12% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 31.59 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 0.25 ± 8% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 517.55 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.io_schedule.blk_mq_get_tag.__blk_mq_alloc_requests.blk_mq_submit_bio
> 61.81 ± 21% -96.4% 2.24 ± 62% perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 4.06 -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
> 0.85 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 16.81 ± 7% -40.1% 10.06 ± 5% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 0.08 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 71.10 ± 7% -100.0% 0.00 perf-sched.wait_and_delay.count.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 124.00 -100.0% 0.00 perf-sched.wait_and_delay.count.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 93.80 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 816.00 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.io_schedule.blk_mq_get_tag.__blk_mq_alloc_requests.blk_mq_submit_bio
> 896.30 ± 20% -53.8% 414.40 ± 31% perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 44.40 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
> 87.00 -100.0% 0.00 perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 283.50 ± 6% +59.7% 452.70 ± 4% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 36.00 -100.0% 0.00 perf-sched.wait_and_delay.count.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 2195 ± 2% -42.3% 1266 ± 6% perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 1312 ± 25% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 1503 ± 32% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
> 0.93 ± 21% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 1775 ± 14% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.io_schedule.blk_mq_get_tag.__blk_mq_alloc_requests.blk_mq_submit_bio
> 1699 ± 13% -92.7% 123.85 ± 56% perf-sched.wait_and_delay.max.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 5.24 ± 32% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
> 1.94 ± 9% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 0.18 ± 18% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
> 520.63 ± 38% -99.6% 2.04 ±300% perf-sched.wait_time.avg.ms.__cond_resched.extent_write_cache_pages.btrfs_writepages.do_writepages.filemap_fdatawrite_wbc
> 12.11 ±297% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__clear_extent_bit.btrfs_dirty_folio
> 550.61 ± 26% -99.8% 0.86 ±300% perf-sched.wait_time.avg.ms.__cond_resched.submit_bio_noacct.btrfs_submit_chunk.btrfs_submit_bbio.submit_one_bio
> 258.43 ± 12% -100.0% 0.00 perf-sched.wait_time.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 517.46 ± 4% -98.8% 6.46 ±116% perf-sched.wait_time.avg.ms.io_schedule.blk_mq_get_tag.__blk_mq_alloc_requests.blk_mq_submit_bio
> 61.80 ± 21% -96.4% 2.22 ± 62% perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 6.98 ±198% +503.0% 42.09 ± 4% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 0.11 ±298% +5798.7% 6.59 ± 16% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 16.72 ± 7% -40.3% 9.98 ± 5% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 1160 ± 36% -99.8% 2.04 ±300% perf-sched.wait_time.max.ms.__cond_resched.extent_write_cache_pages.btrfs_writepages.do_writepages.filemap_fdatawrite_wbc
> 36.11 ±298% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__clear_extent_bit.btrfs_dirty_folio
> 1344 ± 25% -99.9% 0.86 ±300% perf-sched.wait_time.max.ms.__cond_resched.submit_bio_noacct.btrfs_submit_chunk.btrfs_submit_bbio.submit_one_bio
> 1312 ± 25% -100.0% 0.00 perf-sched.wait_time.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 1775 ± 14% -96.5% 62.99 ±153% perf-sched.wait_time.max.ms.io_schedule.blk_mq_get_tag.__blk_mq_alloc_requests.blk_mq_submit_bio
> 1699 ± 13% -92.7% 123.76 ± 56% perf-sched.wait_time.max.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 87.05 ±203% +562.4% 576.63 ± 58% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 0.11 ±297% +2.5e+05% 276.17 ± 46% perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
>
>
> ***************************************************************************************************
> lkp-icl-2sp4: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> =========================================================================================
> compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-12/performance/1HDD/btrfs/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp4/copy-file/stress-ng/60s
>
> commit:
> 846b534075 ("btrfs: fix typo in space info explanation")
> 5e85262e54 ("btrfs: fix fsync of files with no hard links not persisting deletion")
>
> 846b534075f45d5b 5e85262e542d6da8898bb8563a7
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 823946 -46.9% 437583 ± 2% cpuidle..usage
> 8144 ± 5% +24.5% 10135 ± 2% uptime.idle
> 2739200 ± 13% -35.9% 1754761 ± 4% numa-numastat.node0.local_node
> 2802155 ± 12% -35.3% 1811817 ± 4% numa-numastat.node0.numa_hit
> 45.55 ± 11% +52.7% 69.54 ± 3% iostat.cpu.idle
> 53.51 ± 9% -47.1% 28.33 ± 8% iostat.cpu.iowait
> 0.58 ± 2% +205.6% 1.76 ± 8% iostat.cpu.system
> 214.90 ± 10% -33.1% 143.70 ± 9% perf-c2c.DRAM.local
> 309.50 ± 17% +44.9% 448.40 ± 17% perf-c2c.DRAM.remote
> 141.70 ± 19% +79.3% 254.00 ± 21% perf-c2c.HITM.remote
> 4909242 ± 16% -38.3% 3030044 ± 5% numa-meminfo.node0.Inactive
> 4909242 ± 16% -38.3% 3030044 ± 5% numa-meminfo.node0.Inactive(file)
> 14849 ± 14% -77.3% 3372 ± 19% numa-meminfo.node0.Writeback
> 4951 ± 24% +55.6% 7704 ± 5% numa-meminfo.node1.Dirty
> 8065 ± 24% -58.9% 3316 ± 21% numa-meminfo.node1.Writeback
> 45.59 ± 11% +52.6% 69.57 ± 3% vmstat.cpu.id
> 53.47 ± 9% -47.1% 28.30 ± 8% vmstat.cpu.wa
> 112259 -32.6% 75701 vmstat.io.bo
> 119.44 -68.2% 37.98 ± 15% vmstat.procs.b
> 6317 -13.9% 5436 ± 2% vmstat.system.cs
> 14212 -14.9% 12092 ± 3% vmstat.system.in
> 11915567 -21.0% 9413763 meminfo.Cached
> 13350 ± 2% +22.6% 16366 ± 3% meminfo.Dirty
> 8170490 -30.6% 5668142 meminfo.Inactive
> 8170490 -30.6% 5668142 meminfo.Inactive(file)
> 14292973 -15.2% 12125438 meminfo.Memused
> 22993 -70.2% 6859 ± 19% meminfo.Writeback
> 14412570 -14.6% 12307953 meminfo.max_used_kB
> 43.84 ± 12% +24.8 68.62 ± 3% mpstat.cpu.all.idle%
> 55.23 ± 9% -26.0 29.23 ± 8% mpstat.cpu.all.iowait%
> 0.05 ± 3% -0.0 0.03 ± 2% mpstat.cpu.all.irq%
> 0.05 ± 6% -0.0 0.03 ± 5% mpstat.cpu.all.soft%
> 0.45 ± 2% +1.3 1.72 ± 9% mpstat.cpu.all.sys%
> 1.00 +6100.0% 62.00 mpstat.max_utilization.seconds
> 3.66 ± 2% +94.2% 7.10 ± 30% mpstat.max_utilization_pct
> 0.59 +217.1% 1.87 ± 7% stress-ng.copy-file.MB_per_sec_copy_rate
> 18179 -33.4% 12113 stress-ng.copy-file.ops
> 302.23 -33.6% 200.60 stress-ng.copy-file.ops_per_sec
> 14350748 -33.0% 9617412 stress-ng.time.file_system_outputs
> 1420 ± 8% +484.5% 8300 ± 6% stress-ng.time.involuntary_context_switches
> 41.30 ± 2% +407.0% 209.40 ± 9% stress-ng.time.percent_of_cpu_this_job_got
> 24.57 ± 2% +415.4% 126.64 ± 9% stress-ng.time.system_time
> 89786 ± 2% -17.3% 74281 ± 5% stress-ng.time.voluntary_context_switches
> 1160907 ± 13% -45.0% 638139 ± 3% numa-vmstat.node0.nr_dirtied
> 1213548 ± 16% -37.3% 761371 ± 5% numa-vmstat.node0.nr_inactive_file
> 3666 ± 14% -76.8% 852.16 ± 15% numa-vmstat.node0.nr_writeback
> 1156044 ± 13% -45.0% 635340 ± 3% numa-vmstat.node0.nr_written
> 1213551 ± 16% -37.3% 761371 ± 5% numa-vmstat.node0.nr_zone_inactive_file
> 5755 ± 13% -47.1% 3041 ± 5% numa-vmstat.node0.nr_zone_write_pending
> 2803191 ± 12% -35.4% 1810577 ± 4% numa-vmstat.node0.numa_hit
> 2740236 ± 13% -36.0% 1753521 ± 4% numa-vmstat.node0.numa_local
> 1244 ± 24% +56.9% 1953 ± 5% numa-vmstat.node1.nr_dirty
> 1981 ± 24% -59.4% 804.96 ± 14% numa-vmstat.node1.nr_writeback
> 1799013 -32.9% 1206900 proc-vmstat.nr_dirtied
> 3360 ± 2% +21.6% 4086 ± 3% proc-vmstat.nr_dirty
> 2971433 -20.8% 2353137 proc-vmstat.nr_file_pages
> 29348580 +1.8% 29883631 proc-vmstat.nr_free_pages
> 29282199 +1.9% 29836480 proc-vmstat.nr_free_pages_blocks
> 2034863 -30.4% 1416458 proc-vmstat.nr_inactive_file
> 38408 -6.3% 36004 proc-vmstat.nr_slab_reclaimable
> 5706 -70.1% 1708 ą 15% proc-vmstat.nr_writeback
> 1790881 -32.9% 1201249 proc-vmstat.nr_written
> 2034863 -30.4% 1416458 proc-vmstat.nr_zone_inactive_file
> 8998 -35.6% 5795 ą 2% proc-vmstat.nr_zone_write_pending
> 4444162 -23.1% 3416163 proc-vmstat.numa_hit
> 4311404 -23.8% 3283749 proc-vmstat.numa_local
> 566.30 ± 89% +1255.8% 7678 ±247% proc-vmstat.numa_pte_updates
> 4505726 -22.8% 3477149 proc-vmstat.pgalloc_normal
> 7184379 -32.4% 4859319 proc-vmstat.pgpgout
> 2.68 ± 2% -25.7% 1.99 ± 4% perf-stat.i.MPKI
> 7.342e+08 +6.6% 7.826e+08 perf-stat.i.branch-instructions
> 3.45 -0.7 2.77 ± 2% perf-stat.i.branch-miss-rate%
> 33462192 -4.2% 32070836 perf-stat.i.branch-misses
> 10.22 +3.1 13.35 perf-stat.i.cache-miss-rate%
> 5043813 ± 2% -14.9% 4291664 perf-stat.i.cache-misses
> 50234628 -34.5% 32914395 perf-stat.i.cache-references
> 5523 ± 3% -8.1% 5077 perf-stat.i.context-switches
> 1.15 +236.5% 3.88 ± 6% perf-stat.i.cpi
> 3.417e+09 +149.6% 8.527e+09 ± 6% perf-stat.i.cpu-cycles
> 196.17 +12.1% 219.94 ± 2% perf-stat.i.cpu-migrations
> 615.76 +229.7% 2030 ± 7% perf-stat.i.cycles-between-cache-misses
> 0.97 -58.0% 0.41 ± 5% perf-stat.i.ipc
> 1.39 ± 2% -18.2% 1.14 perf-stat.overall.MPKI
> 4.56 -0.5 4.10 ± 2% perf-stat.overall.branch-miss-rate%
> 10.03 +3.0 13.04 perf-stat.overall.cache-miss-rate%
> 0.94 +140.0% 2.27 ± 6% perf-stat.overall.cpi
> 677.89 +193.3% 1988 ± 6% perf-stat.overall.cycles-between-cache-misses
> 1.06 -58.2% 0.44 ± 6% perf-stat.overall.ipc
> 7.214e+08 +6.6% 7.691e+08 perf-stat.ps.branch-instructions
> 32880299 -4.1% 31516354 perf-stat.ps.branch-misses
> 4956935 ± 2% -14.9% 4218155 perf-stat.ps.cache-misses
> 49395581 -34.5% 32356624 perf-stat.ps.cache-references
> 5429 ± 3% -8.1% 4992 perf-stat.ps.context-switches
> 3.359e+09 +149.7% 8.387e+09 ± 6% perf-stat.ps.cpu-cycles
> 192.84 +12.1% 216.24 ± 2% perf-stat.ps.cpu-migrations
> 0.09 ± 14% +156.9% 0.23 ± 61% perf-sched.sch_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.folio_alloc_noprof.__filemap_get_folio
> 0.11 ± 30% +1780.8% 2.08 ±170% perf-sched.sch_delay.avg.ms.__cond_resched.__filemap_get_folio.prepare_one_folio.constprop.0
> 0.09 ± 28% +4092.1% 3.64 ±235% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__set_extent_bit.set_extent_bit
> 0.11 ± 37% -71.8% 0.03 ±153% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.start_transaction.btrfs_dirty_inode.touch_atime
> 0.23 ± 68% -60.2% 0.09 ± 3% perf-sched.sch_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 0.22 ± 85% -100.0% 0.00 perf-sched.sch_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 0.04 ± 17% +268.7% 0.15 ±175% perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 0.16 ± 24% -79.4% 0.03 ±154% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.start_transaction.btrfs_dirty_inode.touch_atime
> 0.11 ± 38% +44277.4% 48.59 ±284% perf-sched.sch_delay.max.ms.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 80.59 ±124% -100.0% 0.00 perf-sched.sch_delay.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 154.78 ± 4% +15.1% 178.11 ± 4% perf-sched.total_wait_and_delay.average.ms
> 154.67 ± 4% +15.1% 177.96 ± 4% perf-sched.total_wait_time.average.ms
> 41.27 ±207% +533.8% 261.54 ± 17% perf-sched.wait_and_delay.avg.ms.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 200.09 ± 3% -68.0% 64.08 ± 14% perf-sched.wait_and_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 198.42 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 37.96 ± 14% +150.7% 95.16 ± 22% perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 18.02 ± 24% +270.4% 66.75 ± 16% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 8.29 ± 68% +438.3% 44.64 ± 25% perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 7.15 ± 11% +199.9% 21.45 ± 2% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 170.75 ± 10% +46.2% 249.63 ± 10% perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 0.50 ±204% +26480.0% 132.90 ± 28% perf-sched.wait_and_delay.count.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 1243 ± 5% -35.0% 808.80 ± 7% perf-sched.wait_and_delay.count.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 1210 ± 3% -100.0% 0.00 perf-sched.wait_and_delay.count.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 3079 -30.7% 2134 perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 379.00 ± 22% +107.1% 784.90 ± 11% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 83.70 ± 60% +349.1% 375.90 ± 10% perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 673.70 ± 11% -67.7% 217.30 ± 3% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 74.87 ±216% +687.0% 589.20 ± 10% perf-sched.wait_and_delay.max.ms.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 422.52 ± 20% -45.8% 229.04 ± 12% perf-sched.wait_and_delay.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 501.20 ± 18% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 28.77 ± 66% +1173.7% 366.39 ± 21% perf-sched.wait_time.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.folio_alloc_noprof.__filemap_get_folio
> 43.62 ± 74% +685.2% 342.46 ± 10% perf-sched.wait_time.avg.ms.__cond_resched.__filemap_get_folio.prepare_one_folio.constprop.0
> 38.14 ± 71% +829.9% 354.72 ± 18% perf-sched.wait_time.avg.ms.__cond_resched.btrfs_buffered_write.btrfs_do_write_iter.vfs_write.ksys_write
> 36.28 ± 66% +515.5% 223.32 ± 17% perf-sched.wait_time.avg.ms.__cond_resched.extent_write_cache_pages.btrfs_writepages.do_writepages.filemap_fdatawrite_wbc
> 38.49 ± 69% +694.7% 305.87 ± 17% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__set_extent_bit.set_extent_bit
> 53.67 ±110% -96.4% 1.94 ±297% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.start_transaction.btrfs_dirty_inode.touch_atime
> 52.47 ±158% +397.0% 260.79 ± 16% perf-sched.wait_time.avg.ms.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 199.86 ± 3% -68.0% 63.99 ± 14% perf-sched.wait_time.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 198.21 ± 3% -100.0% 0.00 perf-sched.wait_time.avg.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
> 37.86 ± 14% +150.9% 94.99 ± 22% perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
> 17.94 ± 24% +271.5% 66.66 ± 16% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.btrfs_tree_read_lock_nested
> 10.87 ± 37% +309.3% 44.50 ± 24% perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.btrfs_tree_lock_nested
> 7.06 ± 11% +202.8% 21.37 ± 2% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 170.68 ± 10% +46.2% 249.55 ± 10% perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
> 225.30 ± 43% +146.1% 554.37 ± 15% perf-sched.wait_time.max.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.folio_alloc_noprof.__filemap_get_folio
> 179.96 ± 58% +225.3% 585.39 ± 12% perf-sched.wait_time.max.ms.__cond_resched.extent_write_cache_pages.btrfs_writepages.do_writepages.filemap_fdatawrite_wbc
> 243.25 ± 38% +140.0% 583.77 ± 10% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_extent_state.__set_extent_bit.set_extent_bit
> 108.35 ±100% -98.2% 1.98 ±297% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.start_transaction.btrfs_dirty_inode.touch_atime
> 119.47 ±140% +393.1% 589.11 ± 10% perf-sched.wait_time.max.ms.__cond_resched.lock_delalloc_folios.find_lock_delalloc_range.writepage_delalloc.extent_writepage
> 422.43 ± 20% -45.8% 228.95 ± 12% perf-sched.wait_time.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_remap_file_range.vfs_copy_file_range
> 501.11 ± 18% -100.0% 0.00 perf-sched.wait_time.max.ms.btrfs_start_ordered_extent_nowriteback.btrfs_wait_ordered_range.btrfs_sync_file.do_fsync
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
>