Message-ID: <202501241646.81b10e21-lkp@intel.com>
Date: Fri, 24 Jan 2025 16:41:28 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<linux-fsdevel@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [close_files()] 1fa4ffd8e6:
stress-ng.fd-fork.ops_per_sec 6.2% improvement
Hello,

kernel test robot noticed a 6.2% improvement of stress-ng.fd-fork.ops_per_sec on:

commit: 1fa4ffd8e6f6d001da27f00382af79bad0336091 ("close_files(): don't bother with xchg()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
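
For context, the change under test touches close_files() in fs/file.c: at exit
time the task holds the last reference to its files_struct, so the per-slot
atomic exchange the close loop used to perform is unnecessary and a plain load
suffices. A minimal sketch of the before/after, paraphrased from the commit
title rather than quoted verbatim from the kernel source:

	/* before: atomically claim each open slot with xchg(), even though
	 * nothing can race with the exiting task at this point */
	struct file *file = xchg(&fdt->fd[i], NULL);

	/* after: plain load; this is the last reference to the files_struct,
	 * so there are no concurrent accessors and the table is freed soon */
	struct file *file = fdt->fd[i];

	if (file) {
		filp_close(file, files);
		cond_resched();
	}

This is consistent with the profile data below, where self cycles in
put_files_struct() drop sharply (-5.2) while fput() gains (+2.6).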

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: fd-fork
	cpufreq_governor: performance
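
A roughly equivalent standalone invocation of the stressor (assuming a
stress-ng build that includes the fd-fork stressor; the exact options lkp
passes may differ):

	# 0 workers = one per online CPU, matching nr_threads: 100%
	stress-ng --fd-fork 0 --timeout 60s --metrics
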
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250124/202501241646.81b10e21-lkp@intel.com
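
The reproduce steps usually given with these reports are along these lines
(the job.yaml is attached to the original mail and not reproduced here):

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	sudo bin/lkp install job.yaml		# job file is attached in the original report
	bin/lkp split-job --compatible job.yaml	# generate the yaml file for lkp run
	sudo bin/lkp run generated-yaml-file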
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fd-fork/stress-ng/60s
commit:
be5498cac2 ("remove pointless includes of <linux/fdtable.h>")
1fa4ffd8e6 ("close_files(): don't bother with xchg()")
be5498cac2ddb112 1fa4ffd8e6f6d001da27f00382a
---------------- ---------------------------
old value ±%stddev      %change      new value ±%stddev      metric
38705 ± 5% +11.1% 42989 ± 2% sched_debug.cpu.curr->pid.avg
96837 +6.2% 102865 stress-ng.fd-fork.ops
1611 +6.2% 1711 stress-ng.fd-fork.ops_per_sec
10.10 -6.7% 9.42 stress-ng.fd-fork.seconds_to_open_all_file_descriptors
131663 +5.2% 138573 stress-ng.time.voluntary_context_switches
4224262 ± 3% +5.5% 4458103 proc-vmstat.numa_hit
4158770 ± 3% +5.6% 4391868 ± 2% proc-vmstat.numa_local
1.002e+08 +6.8% 1.07e+08 proc-vmstat.pgalloc_normal
1.001e+08 +6.8% 1.069e+08 proc-vmstat.pgfree
200571 +14.3% 229179 ± 16% proc-vmstat.pgreuse
1.24 ± 15% +39.0% 1.72 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
2.36 ± 21% -31.4% 1.62 ± 32% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
1.40 ± 17% -34.7% 0.92 ± 30% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
8.65 ± 31% +64.9% 14.26 ± 55% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
7.30 ± 70% -70.6% 2.14 ± 97% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
227.50 ± 2% +13.1% 257.36 ± 13% perf-sched.wait_and_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
7.35 ± 6% -13.4% 6.36 ± 8% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
22270 ± 4% +51.5% 33741 perf-sched.wait_and_delay.count.__cond_resched.__close_range.__x64_sys_close_range.do_syscall_64.entry_SYSCALL_64_after_hwframe
55845 ± 2% -22.5% 43303 perf-sched.wait_and_delay.count.__cond_resched.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
1051 ± 2% +8.6% 1141 ± 4% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
226.27 ± 2% +13.2% 256.09 ± 13% perf-sched.wait_time.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
4.72 ± 23% +69.3% 7.99 ± 27% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
2.431e+10 +7.6% 2.617e+10 perf-stat.i.branch-instructions
12.91 +0.7 13.61 perf-stat.i.cache-miss-rate%
1.033e+08 +9.5% 1.132e+08 perf-stat.i.cache-misses
8.11e+08 ± 2% +3.4% 8.387e+08 perf-stat.i.cache-references
1.98 -6.4% 1.85 perf-stat.i.cpi
2168 -8.0% 1993 perf-stat.i.cycles-between-cache-misses
1.128e+11 +7.3% 1.21e+11 perf-stat.i.instructions
0.51 +6.7% 0.54 perf-stat.i.ipc
65995 ± 2% +6.6% 70332 ± 3% perf-stat.i.minor-faults
65995 ± 2% +6.6% 70332 ± 3% perf-stat.i.page-faults
0.91 +2.1% 0.93 perf-stat.overall.MPKI
12.73 +0.7 13.48 perf-stat.overall.cache-miss-rate%
1.99 -6.5% 1.86 perf-stat.overall.cpi
2179 -8.4% 1996 perf-stat.overall.cycles-between-cache-misses
0.50 +7.0% 0.54 perf-stat.overall.ipc
2.391e+10 +7.6% 2.574e+10 perf-stat.ps.branch-instructions
1.015e+08 +9.5% 1.111e+08 perf-stat.ps.cache-misses
7.975e+08 +3.4% 8.247e+08 perf-stat.ps.cache-references
1.11e+11 +7.3% 1.191e+11 perf-stat.ps.instructions
64224 ± 2% +6.7% 68500 ± 3% perf-stat.ps.minor-faults
64224 ± 2% +6.7% 68501 ± 3% perf-stat.ps.page-faults
6.87e+12 +7.4% 7.379e+12 perf-stat.total.instructions
29.66 ± 2% -1.3 28.36 ± 2% perf-profile.calltrace.cycles-pp.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
30.58 ± 2% -1.2 29.42 perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.58 ± 2% -1.2 29.42 perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
30.58 ± 2% -1.2 29.42 perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.58 ± 2% -1.2 29.42 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.58 ± 2% -1.2 29.43 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.58 ± 2% -1.2 29.43 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
0.53 +0.1 0.60 ± 3% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
0.54 +0.1 0.60 ± 3% perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
0.54 +0.1 0.60 ± 3% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
0.78 +0.1 0.87 ± 2% perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
0.76 +0.1 0.84 ± 2% perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
0.34 ± 70% +0.3 0.65 perf-profile.calltrace.cycles-pp.rcu_all_qs.__cond_resched.put_files_struct.do_exit.do_group_exit
1.18 ± 3% +0.4 1.56 perf-profile.calltrace.cycles-pp.__cond_resched.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
21.65 +0.8 22.48 perf-profile.calltrace.cycles-pp.dup_fd.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
22.48 +0.9 23.40 perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.50 +0.9 23.43 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
22.50 +0.9 23.43 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
22.50 +0.9 23.42 perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
22.50 +0.9 23.42 perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
22.52 +0.9 23.45 perf-profile.calltrace.cycles-pp._Fork
1.47 ± 4% +1.0 2.49 perf-profile.calltrace.cycles-pp.dnotify_flush.filp_flush.filp_close.put_files_struct.do_exit
2.19 ± 4% +1.2 3.38 ± 2% perf-profile.calltrace.cycles-pp.locks_remove_posix.filp_flush.filp_close.put_files_struct.do_exit
9.47 ± 2% +2.7 12.14 ± 2% perf-profile.calltrace.cycles-pp.fput.filp_close.put_files_struct.do_exit.do_group_exit
22.10 ± 2% +3.5 25.60 ± 2% perf-profile.calltrace.cycles-pp.filp_close.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
30.02 ± 2% -1.2 28.79 perf-profile.children.cycles-pp.put_files_struct
30.60 ± 2% -1.2 29.44 perf-profile.children.cycles-pp.do_exit
30.60 ± 2% -1.2 29.44 perf-profile.children.cycles-pp.__x64_sys_exit_group
30.60 ± 2% -1.2 29.44 perf-profile.children.cycles-pp.do_group_exit
30.60 ± 2% -1.2 29.44 perf-profile.children.cycles-pp.x64_sys_call
0.09 +0.0 0.10 perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
0.11 ± 3% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.down_write
0.13 ± 3% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.handle_mm_fault
0.15 ± 3% +0.0 0.16 perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
0.14 ± 2% +0.0 0.16 ± 3% perf-profile.children.cycles-pp.kmem_cache_free
0.16 +0.0 0.18 ± 2% perf-profile.children.cycles-pp.do_user_addr_fault
0.16 +0.0 0.18 ± 2% perf-profile.children.cycles-pp.exc_page_fault
0.25 +0.0 0.27 ± 2% perf-profile.children.cycles-pp.anon_vma_clone
0.24 ± 2% +0.0 0.27 perf-profile.children.cycles-pp.free_pgtables
0.31 +0.0 0.34 ± 2% perf-profile.children.cycles-pp.anon_vma_fork
0.13 ± 5% +0.0 0.16 ± 8% perf-profile.children.cycles-pp.copy_p4d_range
0.54 +0.1 0.60 ± 3% perf-profile.children.cycles-pp.__mmput
0.54 +0.1 0.61 ± 3% perf-profile.children.cycles-pp.exit_mm
0.53 +0.1 0.60 ± 3% perf-profile.children.cycles-pp.exit_mmap
0.78 +0.1 0.87 ± 2% perf-profile.children.cycles-pp.dup_mm
0.76 +0.1 0.85 ± 2% perf-profile.children.cycles-pp.dup_mmap
1.53 +0.2 1.75 perf-profile.children.cycles-pp.rcu_all_qs
3.50 +0.5 4.03 perf-profile.children.cycles-pp.__cond_resched
21.65 +0.8 22.48 perf-profile.children.cycles-pp.dup_fd
22.48 +0.9 23.40 perf-profile.children.cycles-pp.copy_process
22.50 +0.9 23.42 perf-profile.children.cycles-pp.kernel_clone
22.50 +0.9 23.42 perf-profile.children.cycles-pp.__do_sys_clone
22.53 +0.9 23.46 perf-profile.children.cycles-pp._Fork
33.30 +0.9 34.24 perf-profile.children.cycles-pp.filp_flush
3.75 +1.0 4.78 perf-profile.children.cycles-pp.dnotify_flush
5.04 +1.2 6.22 perf-profile.children.cycles-pp.locks_remove_posix
21.42 +2.6 24.05 perf-profile.children.cycles-pp.fput
56.02 +3.7 59.71 perf-profile.children.cycles-pp.filp_close
6.16 ± 2% -5.2 0.91 perf-profile.self.cycles-pp.put_files_struct
24.86 -1.2 23.65 perf-profile.self.cycles-pp.filp_flush
1.14 +0.2 1.31 perf-profile.self.cycles-pp.rcu_all_qs
1.76 +0.2 1.94 perf-profile.self.cycles-pp.filp_close
1.91 +0.3 2.20 perf-profile.self.cycles-pp.__cond_resched
21.51 +0.8 22.34 perf-profile.self.cycles-pp.dup_fd
3.30 +1.0 4.26 perf-profile.self.cycles-pp.dnotify_flush
4.58 +1.1 5.72 perf-profile.self.cycles-pp.locks_remove_posix
20.87 +2.6 23.46 perf-profile.self.cycles-pp.fput

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki