[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aAdYdBjy4ERcfv3P@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Tue, 22 Apr 2025 14:21:00 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org,
"Theodore Ts'o" <tytso@....edu>, Jan Kara <jack@...e.cz>,
Ritesh Harjani <ritesh.list@...il.com>,
Baokun Li <libaokun1@...wei.com>, linux-ext4@...r.kernel.org
Subject: Re: [linus:master] [ext4] 896b02d0b9: stress-ng.fiemap.ops_per_sec
15.8% regression
On Mon, Apr 21, 2025 at 01:56:07PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a 15.8% regression of stress-ng.fiemap.ops_per_sec on:
>
>
> commit: 896b02d0b9e7deb4a4eb365e13dd912b49916519 ("ext4: Make sb update interval tunable")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master 1a1d569a75f3ab2923cb62daf356d102e4df2b86]
> [test failed on linux-next/master f660850bc246fef15ba78c81f686860324396628]
>
> testcase: stress-ng
> config: x86_64-rhel-9.4
> compiler: gcc-12
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
> nr_threads: 100%
> disk: 1HDD
> testtime: 60s
> fs: ext4
> test: fiemap
> cpufreq_governor: performance
>
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202504211320.5025beac-lkp@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250421/202504211320.5025beac-lkp@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-12/performance/1HDD/ext4/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fiemap/stress-ng/60s
>
> commit:
> ce2f26e737 ("ext4: avoid journaling sb update on error if journal is destroying")
> 896b02d0b9 ("ext4: Make sb update interval tunable")
Hmm this is strange since this patch doesn't intend to change any
functionality. Let me try out the reproducer and get back to you if
I see the regression as well.
Thanks,
ojaswin
>
> ce2f26e73783b4a7 896b02d0b9e7deb4a4eb365e13d
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 2.29 ± 6% -13.2% 1.99 ± 7% iostat.cpu.user
> 1688381 ± 2% -16.4% 1411358 ± 4% vmstat.system.cs
> 0.01 ± 12% -0.0 0.01 ± 10% mpstat.cpu.all.soft%
> 2.33 ± 6% -0.3 2.01 ± 7% mpstat.cpu.all.usr%
> 28467 ± 22% +39.4% 39676 ± 13% numa-meminfo.node0.Inactive
> 28467 ± 22% +39.4% 39676 ± 13% numa-meminfo.node0.Inactive(file)
> 7099 ± 22% +40.0% 9937 ± 13% numa-vmstat.node0.nr_inactive_file
> 7099 ± 22% +40.0% 9937 ± 13% numa-vmstat.node0.nr_zone_inactive_file
> 3186734 ± 2% -13.0% 2770907 meminfo.Active
> 3186734 ± 2% -13.0% 2770907 meminfo.Active(anon)
> 2332511 ± 4% -17.1% 1933636 ± 2% meminfo.Shmem
> 829642 ± 3% -16.0% 696952 ± 4% sched_debug.cpu.nr_switches.avg
> 992823 ± 4% -14.5% 849332 ± 6% sched_debug.cpu.nr_switches.max
> 153915 ± 5% -15.2% 130547 ± 11% sched_debug.cpu.nr_switches.stddev
> 64971130 ± 3% -15.8% 54708915 ± 4% stress-ng.fiemap.ops
> 1067456 ± 3% -15.8% 898490 ± 4% stress-ng.fiemap.ops_per_sec
> 1.079e+08 ± 2% -16.0% 90643140 ± 4% stress-ng.time.involuntary_context_switches
> 67.64 ± 9% -16.5% 56.46 ± 9% stress-ng.time.user_time
> 797292 ± 2% -13.2% 692413 proc-vmstat.nr_active_anon
> 1499550 -6.7% 1398519 proc-vmstat.nr_file_pages
> 583908 ± 4% -17.3% 483006 ± 2% proc-vmstat.nr_shmem
> 797292 ± 2% -13.2% 692413 proc-vmstat.nr_zone_active_anon
> 1256873 ± 2% -12.3% 1102563 proc-vmstat.numa_hit
> 1190636 ± 2% -13.0% 1036297 proc-vmstat.numa_local
> 1367314 ± 2% -11.3% 1212232 ± 2% proc-vmstat.pgalloc_normal
> 369155 ± 5% -6.9% 343753 ± 2% proc-vmstat.pgfault
> 2.492e+10 ± 2% -18.5% 2.031e+10 perf-stat.i.branch-instructions
> 1.232e+08 ± 7% -15.2% 1.045e+08 ± 6% perf-stat.i.branch-misses
> 3.895e+08 ± 2% -8.8% 3.553e+08 perf-stat.i.cache-references
> 1759358 ± 2% -16.0% 1478535 ± 3% perf-stat.i.context-switches
> 1.80 ± 2% +22.5% 2.20 perf-stat.i.cpi
> 1.262e+11 ± 2% -18.5% 1.028e+11 perf-stat.i.instructions
> 0.56 ± 2% -18.5% 0.46 perf-stat.i.ipc
> 27.44 ± 2% -15.8% 23.10 ± 3% perf-stat.i.metric.K/sec
> 4132 ± 5% -8.9% 3766 ± 2% perf-stat.i.minor-faults
> 4132 ± 5% -8.9% 3766 ± 2% perf-stat.i.page-faults
> 1.79 ± 2% +22.8% 2.20 perf-stat.overall.cpi
> 0.56 ± 2% -18.6% 0.45 perf-stat.overall.ipc
> 2.45e+10 ± 2% -18.5% 1.998e+10 perf-stat.ps.branch-instructions
> 1.21e+08 ± 7% -15.0% 1.028e+08 ± 6% perf-stat.ps.branch-misses
> 3.828e+08 ± 2% -8.7% 3.496e+08 perf-stat.ps.cache-references
> 1729946 ± 2% -15.9% 1454325 ± 3% perf-stat.ps.context-switches
> 1.241e+11 ± 2% -18.5% 1.011e+11 perf-stat.ps.instructions
> 4060 ± 5% -8.8% 3703 ± 2% perf-stat.ps.minor-faults
> 4060 ± 5% -8.8% 3703 ± 2% perf-stat.ps.page-faults
> 7.699e+12 ± 2% -18.7% 6.262e+12 ± 2% perf-stat.total.instructions
> 0.02 ±194% +3583.1% 0.80 ±127% perf-sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> 0.14 ± 5% +21.4% 0.17 ± 2% perf-sched.sch_delay.avg.ms.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 0.02 ± 12% +35.4% 0.03 ± 11% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 0.01 ± 3% +14.8% 0.01 ± 6% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 2.90 ±107% +259.8% 10.43 ± 35% perf-sched.sch_delay.max.ms.__cond_resched.down_write.shmem_file_write_iter.vfs_write.ksys_write
> 8.62 ± 28% +69.3% 14.58 ± 8% perf-sched.sch_delay.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
> 0.02 ±194% +5175.4% 1.14 ± 88% perf-sched.sch_delay.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> 395.95 ±185% -99.6% 1.67 ± 45% perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 10.22 ± 15% +49.3% 15.26 ± 4% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
> 0.05 ± 10% +29.0% 0.06 ± 13% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
> 0.14 ± 5% +20.6% 0.17 ± 2% perf-sched.total_sch_delay.average.ms
> 0.44 ± 3% +19.7% 0.52 ± 3% perf-sched.total_wait_and_delay.average.ms
> 6639596 ± 3% -17.7% 5461347 ± 3% perf-sched.total_wait_and_delay.count.ms
> 0.30 ± 2% +19.3% 0.36 ± 3% perf-sched.total_wait_time.average.ms
> 0.28 ± 3% +20.8% 0.34 ± 2% perf-sched.wait_and_delay.avg.ms.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 6614032 ± 3% -17.8% 5436664 ± 3% perf-sched.wait_and_delay.count.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 5.80 ±107% +259.8% 20.86 ± 35% perf-sched.wait_and_delay.max.ms.__cond_resched.down_write.shmem_file_write_iter.vfs_write.ksys_write
> 17.23 ± 28% +69.3% 29.17 ± 8% perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
> 20.45 ± 15% +63.4% 33.41 ± 21% perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
> 0.02 ±223% +4040.9% 0.79 ±127% perf-sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> 0.15 ± 2% +20.6% 0.18 ± 2% perf-sched.wait_time.avg.ms.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
> 2.90 ±107% +259.8% 10.43 ± 35% perf-sched.wait_time.max.ms.__cond_resched.down_write.shmem_file_write_iter.vfs_write.ksys_write
> 8.62 ± 28% +69.3% 14.58 ± 8% perf-sched.wait_time.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
> 0.02 ±223% +5863.5% 1.14 ± 88% perf-sched.wait_time.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
> 10.22 ± 15% +103.2% 20.77 ± 61% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
Powered by blists - more mailing lists