Message-ID: <202501261527.c3bf4764-lkp@intel.com>
Date: Sun, 26 Jan 2025 16:25:58 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Jeff Layton <jlayton@...nel.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Christian Brauner <brauner@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
John Stultz <jstultz@...gle.com>, <oliver.sang@...el.com>
Subject: [linus:master] [timekeeping] ee3283c608: will-it-scale.per_process_ops 4.8% regression
Hi Jeff Layton,
We are sending the report below just as an FYI, since the result is stable in our tests.
We do not know enough to tell whether this regression is due to the alignment of the following variable:
+static __cacheline_aligned_in_smp atomic64_t mg_floor;
If this report is of low value, please just ignore it. Thanks a lot.
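For readers unfamiliar with the attribute on that line, the following is a minimal userspace sketch of what cache-line alignment does to data layout. It is not kernel code: `alignas` stands in for `__cacheline_aligned_in_smp`, the 64-byte `CACHE_LINE` value is an assumption about the test machine, and every struct and function name here is hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache-line size, not taken from the report */

/* Userspace stand-in for the kernel declaration quoted above. */
alignas(CACHE_LINE) _Atomic int64_t floor_aligned;

/* Without alignment, the atomic and its neighbor sit a few bytes apart. */
struct packed_neighbors {
    _Atomic int64_t floor;
    int64_t hot;
};

/* With alignment, each member starts on its own cache line. */
struct padded_neighbors {
    alignas(CACHE_LINE) _Atomic int64_t floor;
    alignas(CACHE_LINE) int64_t hot;
};

/* Returns 1 if the aligned variable starts on a cache-line boundary. */
int floor_is_line_aligned(void)
{
    return ((uintptr_t)&floor_aligned % CACHE_LINE) == 0;
}

/* Cache-line index of a member, counted from the start of its struct. */
size_t line_index(size_t member_offset)
{
    return member_offset / CACHE_LINE;
}
```

Alignment like this changes which variables share a cache line, which is one way a layout change can move costs around in a profile. Whether that is what happened here is not established by this report.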
Hello,
kernel test robot noticed a 4.8% regression of will-it-scale.per_process_ops on:
commit: ee3283c608dfa21251b0821d7bb198c7ae3189f6 ("timekeeping: Add interfaces for handling timestamps with a floor value")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master bc8198dc7ebc492ec3e9fa1617dcdfbe98e73b17]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:
nr_task: 100%
mode: process
test: pwrite1
cpufreq_governor: performance
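As a rough picture of what this workload exercises, here is a minimal sketch of a pwrite-style loop. It assumes the test repeatedly rewrites the same region of one file, which is consistent with the `__libc_pwrite` and `shmem_file_write_iter` frames in the profile below. The 4096-byte buffer, the offset of 0, the temporary-file path, and the `pwrite_loop` helper are assumptions made for illustration and are not taken from the will-it-scale source.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFLEN 4096 /* assumed write size; the report does not state it */

/*
 * Hypothetical helper: issue `iterations` writes at offset 0 of a single
 * unlinked temporary file and return how many completed in full.
 */
unsigned long pwrite_loop(unsigned long iterations)
{
    char buf[BUFLEN];
    char path[] = "/tmp/pwrite-sketch.XXXXXX";
    unsigned long done = 0;
    int fd = mkstemp(path);

    if (fd < 0)
        return 0;
    unlink(path); /* the file disappears once the descriptor is closed */
    memset(buf, 0, sizeof(buf));

    for (unsigned long i = 0; i < iterations; i++) {
        if (pwrite(fd, buf, BUFLEN, 0) != BUFLEN)
            break;
        done++;
    }

    close(fd);
    return done;
}
```

With `mode: process` and `nr_task: 100%`, one such loop would run in a separate process per hardware thread (104 on this machine), and the reported metric counts completed iterations.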
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202501261527.c3bf4764-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261527.c3bf4764-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/pwrite1/will-it-scale
commit:
v6.12-rc2
ee3283c608 ("timekeeping: Add interfaces for handling timestamps with a floor value")
v6.12-rc2 ee3283c608dfa21251b0821d7bb
---------------- ---------------------------
%stddev %change %stddev
\ | \
57550068 -4.8% 54794800 will-it-scale.104.processes
553365 -4.8% 526872 will-it-scale.per_process_ops
57550068 -4.8% 54794800 will-it-scale.workload
43.00 ± 27% -60.0% 17.20 ± 27% perf-c2c.DRAM.local
251.20 ± 23% -57.5% 106.80 ± 16% perf-c2c.DRAM.remote
520.00 ± 33% -70.3% 154.20 ± 13% perf-c2c.HITM.local
218.50 ± 25% -55.2% 97.80 ± 18% perf-c2c.HITM.remote
0.03 ± 14% +48.4% 0.04 ± 9% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
4.18 ± 4% +21.5% 5.08 perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
653.70 ± 5% +50.5% 983.70 ± 7% perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
913.40 ± 6% -24.8% 686.80 ± 7% perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
1.29 ± 81% +42618.3% 552.09 ± 74% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2.58 ± 81% +65403.1% 1692 ± 72% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
1.721e+10 -4.8% 1.639e+10 perf-stat.i.branch-instructions
1.66 +0.1 1.72 perf-stat.i.branch-miss-rate%
2.852e+08 -1.2% 2.818e+08 perf-stat.i.branch-misses
3.29 +4.9% 3.45 perf-stat.i.cpi
8.743e+10 -4.8% 8.327e+10 perf-stat.i.instructions
0.30 -4.7% 0.29 perf-stat.i.ipc
1.66 +0.1 1.72 perf-stat.overall.branch-miss-rate%
3.29 +4.9% 3.45 perf-stat.overall.cpi
0.30 -4.7% 0.29 perf-stat.overall.ipc
1.715e+10 -4.8% 1.634e+10 perf-stat.ps.branch-instructions
2.842e+08 -1.2% 2.809e+08 perf-stat.ps.branch-misses
8.714e+10 -4.8% 8.3e+10 perf-stat.ps.instructions
2.632e+13 -4.7% 2.508e+13 perf-stat.total.instructions
10.62 -4.8 5.81 perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
8.89 ± 2% -4.6 4.25 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
5.98 ± 3% -4.2 1.79 ± 2% perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
13.24 -1.4 11.88 perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__libc_pwrite
16.62 -1.2 15.42 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
2.90 -1.2 1.74 perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
2.38 ± 2% -0.9 1.44 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
1.68 ± 2% -0.9 0.79 perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.42 ± 13% -0.8 0.64 ± 3% perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
5.69 -0.7 4.99 ± 2% perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
6.91 -0.4 6.53 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__libc_pwrite
1.23 ± 2% -0.2 1.01 perf-profile.calltrace.cycles-pp.fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
1.41 -0.2 1.26 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
0.87 -0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.79 ± 2% -0.1 0.74 perf-profile.calltrace.cycles-pp.noop_dirty_folio.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
1.15 ± 2% +0.1 1.26 ± 2% perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
0.54 +0.2 0.73 perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
0.82 ± 2% +0.4 1.26 ± 5% perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
0.00 +0.7 0.67 perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
2.10 +1.2 3.35 perf-profile.calltrace.cycles-pp.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write
2.36 +1.3 3.69 perf-profile.calltrace.cycles-pp.fault_in_iov_iter_readable.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
46.08 +2.8 48.91 perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
43.76 +3.3 47.02 perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
58.89 +3.4 62.32 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pwrite
38.55 +3.5 42.07 perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.37 +3.7 53.09 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
29.41 +5.6 34.99 perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
4.60 +7.7 12.30 perf-profile.calltrace.cycles-pp.rep_movs_alternative.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write
6.68 +10.3 16.96 perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.__x64_sys_pwrite64
10.69 -4.8 5.86 perf-profile.children.cycles-pp.shmem_write_begin
8.99 ± 2% -4.6 4.35 perf-profile.children.cycles-pp.shmem_get_folio_gfp
6.00 ± 3% -4.2 1.81 ± 2% perf-profile.children.cycles-pp.filemap_get_entry
14.20 -1.4 12.77 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
1.62 ± 9% -1.3 0.37 ± 5% perf-profile.children.cycles-pp.xas_load
16.76 -1.2 15.54 perf-profile.children.cycles-pp.syscall_return_via_sysret
2.96 -1.2 1.79 perf-profile.children.cycles-pp.file_update_time
2.47 ± 2% -1.0 1.51 perf-profile.children.cycles-pp.inode_needs_update_time
1.69 ± 2% -0.9 0.79 perf-profile.children.cycles-pp.folio_unlock
1.44 ± 13% -0.8 0.65 ± 3% perf-profile.children.cycles-pp.file_remove_privs_flags
5.94 -0.7 5.24 ± 2% perf-profile.children.cycles-pp.shmem_write_end
7.17 -0.5 6.67 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.77 -0.4 1.42 perf-profile.children.cycles-pp.__cond_resched
0.67 ± 3% -0.3 0.41 perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
1.68 ± 9% -0.2 1.42 ± 4% perf-profile.children.cycles-pp.generic_write_checks
1.25 -0.2 1.03 perf-profile.children.cycles-pp.fdget
1.44 -0.2 1.28 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.38 ± 3% -0.1 0.27 ± 2% perf-profile.children.cycles-pp.timestamp_truncate
0.37 ± 4% -0.1 0.26 perf-profile.children.cycles-pp.rw_verify_area
0.69 ± 3% -0.1 0.60 perf-profile.children.cycles-pp.rcu_all_qs
0.90 -0.1 0.82 ± 2% perf-profile.children.cycles-pp.up_write
0.23 ± 5% -0.1 0.16 ± 2% perf-profile.children.cycles-pp.xas_start
0.85 -0.1 0.80 perf-profile.children.cycles-pp.noop_dirty_folio
0.23 ± 4% -0.0 0.20 ± 3% perf-profile.children.cycles-pp.x64_sys_call
0.15 ± 5% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.security_file_permission
0.28 ± 2% -0.0 0.26 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
0.17 ± 5% +0.0 0.19 ± 3% perf-profile.children.cycles-pp.sched_tick
1.18 +0.1 1.28 ± 2% perf-profile.children.cycles-pp.down_write
0.35 ± 3% +0.1 0.48 ± 6% perf-profile.children.cycles-pp.folio_mapping
0.50 ± 2% +0.2 0.69 perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
0.55 ± 2% +0.2 0.75 perf-profile.children.cycles-pp.folio_mark_accessed
1.75 ± 2% +0.4 2.10 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.90 +0.5 1.36 ± 5% perf-profile.children.cycles-pp.folio_mark_dirty
2.17 +1.2 3.41 perf-profile.children.cycles-pp.fault_in_readable
2.40 +1.4 3.75 perf-profile.children.cycles-pp.fault_in_iov_iter_readable
46.10 +2.8 48.93 perf-profile.children.cycles-pp.__x64_sys_pwrite64
43.86 +3.2 47.10 perf-profile.children.cycles-pp.vfs_write
39.00 +3.4 42.41 perf-profile.children.cycles-pp.shmem_file_write_iter
59.15 +3.4 62.56 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
49.50 +3.7 53.21 perf-profile.children.cycles-pp.do_syscall_64
29.56 +5.6 35.14 perf-profile.children.cycles-pp.generic_perform_write
4.74 +8.3 13.02 perf-profile.children.cycles-pp.rep_movs_alternative
6.85 +9.6 16.44 perf-profile.children.cycles-pp.copy_page_from_iter_atomic
4.34 ± 2% -2.9 1.43 ± 2% perf-profile.self.cycles-pp.filemap_get_entry
14.06 -1.4 12.65 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
16.74 -1.2 15.53 perf-profile.self.cycles-pp.syscall_return_via_sysret
1.39 ± 10% -1.2 0.21 ± 8% perf-profile.self.cycles-pp.xas_load
1.49 ± 3% -0.9 0.58 perf-profile.self.cycles-pp.folio_unlock
2.72 ± 2% -0.9 1.83 perf-profile.self.cycles-pp.__libc_pwrite
1.42 ± 13% -0.8 0.61 ± 3% perf-profile.self.cycles-pp.file_remove_privs_flags
1.42 -0.6 0.83 perf-profile.self.cycles-pp.inode_needs_update_time
1.92 ± 5% -0.5 1.44 perf-profile.self.cycles-pp.shmem_get_folio_gfp
6.24 -0.4 5.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
9.82 -0.3 9.50 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.64 ± 3% -0.3 0.38 perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
1.06 ± 2% -0.3 0.79 perf-profile.self.cycles-pp.__cond_resched
1.74 ± 5% -0.2 1.52 ± 2% perf-profile.self.cycles-pp.shmem_write_begin
1.24 ± 2% -0.2 1.03 perf-profile.self.cycles-pp.fdget
0.45 ± 3% -0.2 0.25 perf-profile.self.cycles-pp.file_update_time
0.98 ± 2% -0.2 0.79 ± 2% perf-profile.self.cycles-pp.__x64_sys_pwrite64
2.73 ± 2% -0.2 2.54 ± 2% perf-profile.self.cycles-pp.shmem_write_end
0.72 ± 5% -0.1 0.58 ± 4% perf-profile.self.cycles-pp.generic_write_checks
1.14 -0.1 1.02 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.36 ± 3% -0.1 0.25 ± 2% perf-profile.self.cycles-pp.timestamp_truncate
0.23 ± 4% -0.1 0.15 ± 2% perf-profile.self.cycles-pp.rw_verify_area
0.60 ± 3% -0.1 0.53 perf-profile.self.cycles-pp.rcu_all_qs
0.81 -0.1 0.74 perf-profile.self.cycles-pp.noop_dirty_folio
0.20 ± 4% -0.1 0.14 ± 2% perf-profile.self.cycles-pp.xas_start
0.81 -0.1 0.75 ± 2% perf-profile.self.cycles-pp.up_write
0.21 ± 3% -0.0 0.18 ± 3% perf-profile.self.cycles-pp.x64_sys_call
0.26 ± 2% -0.0 0.23 ± 2% perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
0.12 ± 6% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.security_file_permission
0.21 ± 4% +0.0 0.24 perf-profile.self.cycles-pp.testcase
0.77 ± 2% +0.0 0.82 ± 3% perf-profile.self.cycles-pp.down_write
0.24 ± 3% +0.1 0.36 perf-profile.self.cycles-pp.fault_in_iov_iter_readable
0.30 ± 3% +0.1 0.43 ± 6% perf-profile.self.cycles-pp.folio_mapping
0.35 ± 2% +0.2 0.54 perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
2.74 +0.2 2.93 ± 2% perf-profile.self.cycles-pp.generic_perform_write
0.52 +0.2 0.72 perf-profile.self.cycles-pp.folio_mark_accessed
0.55 ± 2% +0.3 0.87 ± 5% perf-profile.self.cycles-pp.folio_mark_dirty
0.56 +0.5 1.10 ± 4% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
1.48 ± 2% +1.1 2.55 ± 4% perf-profile.self.cycles-pp.do_syscall_64
2.14 +1.2 3.35 perf-profile.self.cycles-pp.fault_in_readable
2.20 +1.3 3.51 ± 2% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
4.59 +8.2 12.80 perf-profile.self.cycles-pp.rep_movs_alternative
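The middle column of the table above can be checked from the two outer values. The sketch below assumes that entries written with a percent sign are relative changes, while entries without one (the perf-profile rows and the miss-rate rows) are absolute differences in percentage points. The function names are hypothetical.

```c
#include <assert.h>

/* Relative change in percent, e.g. for will-it-scale.per_process_ops. */
double relative_change_pct(double base, double patched)
{
    return (patched - base) / base * 100.0;
}

/* Absolute difference in percentage points, e.g. for perf-profile rows. */
double point_delta(double base, double patched)
{
    return patched - base;
}
```

For example, `relative_change_pct(553365, 526872)` is roughly -4.8, matching the headline regression, and `point_delta(4.59, 12.80)` is roughly +8.2, matching the `rep_movs_alternative` self-cycles row.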
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki