Message-ID: <202408062152.7e5b5d6d-oliver.sang@intel.com>
Date: Tue, 6 Aug 2024 21:48:19 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Yu Ma <yu.ma@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Jan Kara <jack@...e.cz>, "Tim
Chen" <tim.c.chen@...ux.intel.com>, <linux-fsdevel@...r.kernel.org>,
<ying.huang@...el.com>, <feng.tang@...el.com>, <fengwei.yin@...el.com>,
<brauner@...nel.org>, <mjguzik@...il.com>, <edumazet@...gle.com>,
<yu.ma@...el.com>, <linux-kernel@...r.kernel.org>, <pan.deng@...el.com>,
<tianyou.li@...el.com>, <tim.c.chen@...el.com>, <viro@...iv.linux.org.uk>,
<oliver.sang@...el.com>
Subject: Re: [PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()
Hello,
kernel test robot noticed a 6.3% improvement of will-it-scale.per_thread_ops on:
commit: b8decf0015a8b1ff02cdac61c0aa54355d8e73d7 ("[PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Ma/fs-file-c-remove-sanity_check-and-add-likely-unlikely-in-alloc_fd/20240717-224830
base: https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.all
patch link: https://lore.kernel.org/all/20240717145018.3972922-4-yu.ma@intel.com/
patch subject: [PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()
testcase: will-it-scale
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

    nr_task: 100%
    mode: thread
    test: open3
    cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240806/202408062152.7e5b5d6d-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/open3/will-it-scale
commit:
  5bb3423bf9 ("fs/file.c: conditionally clear full_fds")
  b8decf0015 ("fs/file.c: add fast path in find_next_fd()")
5bb3423bf9f9d91e b8decf0015a8b1ff02cdac61c0a
---------------- ---------------------------
         %stddev %change           %stddev
             \      |                 \
    848151         +6.2%     901119 ± 2%   will-it-scale.224.threads
      3785         +6.3%       4022 ± 2%   will-it-scale.per_thread_ops
    848151         +6.2%     901119 ± 2%   will-it-scale.workload
      0.28 ± 4%   +13.3%       0.32 ± 3%   perf-stat.i.MPKI
     31.31 ± 3%     +2.0      33.28         perf-stat.i.cache-miss-rate%
  14955855 ± 4%   +13.6%   16995785 ± 4%   perf-stat.i.cache-misses
  49676581         +6.7%   53009444 ± 3%   perf-stat.i.cache-references
     43955 ± 4%   -12.3%      38549 ± 4%   perf-stat.i.cycles-between-cache-misses
      0.28 ± 4%   +13.4%       0.32 ± 4%   perf-stat.overall.MPKI
     29.84 ± 3%     +1.9      31.78 ± 2%   perf-stat.overall.cache-miss-rate%
     43445 ± 4%   -12.1%      38200 ± 4%   perf-stat.overall.cycles-between-cache-misses
  19005976         -5.4%   17972604 ± 2%   perf-stat.overall.path-length
  14869677 ± 4%   +13.6%   16898438 ± 4%   perf-stat.ps.cache-misses
  49821402         +6.7%   53168235 ± 3%   perf-stat.ps.cache-references
     49.42          -0.1      49.34         perf-profile.calltrace.cycles-pp.alloc_fd.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.40          -0.1      49.32         perf-profile.calltrace.cycles-pp.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
     49.25          -0.1      49.18         perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.file_close_fd.__x64_sys_close.do_syscall_64
     49.20          -0.1      49.13         perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_fd.do_sys_openat2.__x64_sys_openat
     49.33          -0.1      49.26         perf-profile.calltrace.cycles-pp._raw_spin_lock.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.28          -0.1      49.22         perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_fd.do_sys_openat2.__x64_sys_openat.do_syscall_64
     50.14          +0.0      50.18         perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
     50.17          +0.0      50.21         perf-profile.calltrace.cycles-pp.open64
      0.64 ± 5%     +0.1       0.75 ± 6%   perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62 ± 5%     +0.1       0.74 ± 6%   perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
     49.42          -0.1      49.34         perf-profile.children.cycles-pp.alloc_fd
     49.40          -0.1      49.32         perf-profile.children.cycles-pp.file_close_fd
      0.06          -0.0       0.05         perf-profile.children.cycles-pp.file_close_fd_locked
      0.15 ± 5%     +0.0       0.17 ± 4%   perf-profile.children.cycles-pp.init_file
      0.22 ± 3%     +0.0       0.25 ± 3%   perf-profile.children.cycles-pp.alloc_empty_file
      0.18 ± 6%     +0.0       0.22 ± 6%   perf-profile.children.cycles-pp.__fput
     50.14          +0.0      50.18         perf-profile.children.cycles-pp.__x64_sys_openat
     50.18          +0.0      50.22         perf-profile.children.cycles-pp.open64
      0.18 ± 14%    +0.0       0.23 ± 7%   perf-profile.children.cycles-pp.do_dentry_open
      0.30 ± 8%     +0.1       0.36 ± 8%   perf-profile.children.cycles-pp.do_open
      0.64 ± 5%     +0.1       0.75 ± 6%   perf-profile.children.cycles-pp.do_filp_open
      0.63 ± 5%     +0.1       0.75 ± 6%   perf-profile.children.cycles-pp.path_openat
      0.06          -0.0       0.05         perf-profile.self.cycles-pp.file_close_fd_locked
      0.16 ± 2%     +0.0       0.18 ± 2%   perf-profile.self.cycles-pp._raw_spin_lock
      0.08 ± 12%    +0.0       0.10 ± 4%   perf-profile.self.cycles-pp.__fput
      0.05 ± 7%     +0.1       0.10 ± 4%   perf-profile.self.cycles-pp.alloc_fd
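
For anyone checking the %change column by hand, the following is a minimal sketch and not part of the robot's tooling. It assumes that counter-style rows (ops, cache-misses, path-length) show the relative change between the two commits, and that rows already expressed in percent (cache-miss-rate%, perf-profile cycles-pp) show the plain difference in percentage points. The helper names pct_change and point_change are hypothetical. The report works from per-run averages, so recomputing from the rounded values printed above may differ in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed for counter rows)."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Difference in percentage points (assumed for rows already in percent)."""
    return new - base


# Values copied from the comparison table in this report.
counter_rows = {
    "will-it-scale.workload": (848151, 901119),
    "will-it-scale.per_thread_ops": (3785, 4022),
    "perf-stat.i.cache-misses": (14955855, 16995785),
    "perf-stat.overall.path-length": (19005976, 17972604),
}
percent_rows = {
    "perf-stat.i.cache-miss-rate%": (31.31, 33.28),
    "perf-profile.children.cycles-pp.alloc_fd": (49.42, 49.34),
}

for name, (base, new) in counter_rows.items():
    print(f"{pct_change(base, new):+.1f}%  {name}")
for name, (base, new) in percent_rows.items():
    print(f"{point_change(base, new):+.1f}   {name}")
```

Under these assumptions the recomputed figures match the table for the rows listed, including the negative path-length change that accompanies the higher throughput.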
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki