Message-ID: <202408062152.7e5b5d6d-oliver.sang@intel.com>
Date: Tue, 6 Aug 2024 21:48:19 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Yu Ma <yu.ma@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Jan Kara <jack@...e.cz>, "Tim
 Chen" <tim.c.chen@...ux.intel.com>, <linux-fsdevel@...r.kernel.org>,
	<ying.huang@...el.com>, <feng.tang@...el.com>, <fengwei.yin@...el.com>,
	<brauner@...nel.org>, <mjguzik@...il.com>, <edumazet@...gle.com>,
	<yu.ma@...el.com>, <linux-kernel@...r.kernel.org>, <pan.deng@...el.com>,
	<tianyou.li@...el.com>, <tim.c.chen@...el.com>, <viro@...iv.linux.org.uk>,
	<oliver.sang@...el.com>
Subject: Re: [PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()



Hello,

kernel test robot noticed a 6.3% improvement of will-it-scale.per_thread_ops on:


commit: b8decf0015a8b1ff02cdac61c0aa54355d8e73d7 ("[PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Ma/fs-file-c-remove-sanity_check-and-add-likely-unlikely-in-alloc_fd/20240717-224830
base: https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.all
patch link: https://lore.kernel.org/all/20240717145018.3972922-4-yu.ma@intel.com/
patch subject: [PATCH v5 3/3] fs/file.c: add fast path in find_next_fd()

testcase: will-it-scale
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	nr_task: 100%
	mode: thread
	test: open3
	cpufreq_governor: performance
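
For orientation, the shape of the workload can be sketched as below. This is a rough illustration inferred from the call traces later in this report (open64 -> alloc_fd and __close -> file_close_fd, both waiting on a spin lock), not the will-it-scale source, which is written in C. The function names, the temporary file, the thread count and the duration are all hypothetical, and Python's interpreter lock means the numbers it prints are not comparable to the results in this report.

```python
import os
import tempfile
import threading
import time


def worker(path, stop, counts, idx):
    """Open and close the same file in a loop, counting iterations."""
    n = 0
    while not stop.is_set():
        fd = os.open(path, os.O_RDONLY)  # allocates a descriptor in the shared fd table
        os.close(fd)                     # releases that descriptor again
        n += 1
    counts[idx] = n


def run(path, nr_threads, seconds):
    """Run nr_threads workers for roughly `seconds`; return per-thread op counts."""
    stop = threading.Event()
    counts = [0] * nr_threads
    threads = [
        threading.Thread(target=worker, args=(path, stop, counts, i))
        for i in range(nr_threads)
    ]
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    # Hypothetical parameters: 4 threads for half a second on a scratch file.
    with tempfile.NamedTemporaryFile() as tmp:
        counts = run(tmp.name, nr_threads=4, seconds=0.5)
        print("total ops:", sum(counts))
        print("mean ops per thread:", sum(counts) / len(counts))
```

Because threads of one process share a single descriptor table, every open and close in this loop contends on the same table, which is consistent with the lock-heavy profile shown in the details.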






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240806/202408062152.7e5b5d6d-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-13/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/open3/will-it-scale

commit: 
  5bb3423bf9 ("fs/file.c: conditionally clear full_fds")
  b8decf0015 ("fs/file.c: add fast path in find_next_fd()")

5bb3423bf9f9d91e b8decf0015a8b1ff02cdac61c0a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    848151            +6.2%     901119 ±  2%  will-it-scale.224.threads
      3785            +6.3%       4022 ±  2%  will-it-scale.per_thread_ops
    848151            +6.2%     901119 ±  2%  will-it-scale.workload
      0.28 ±  4%     +13.3%       0.32 ±  3%  perf-stat.i.MPKI
     31.31 ±  3%      +2.0       33.28        perf-stat.i.cache-miss-rate%
  14955855 ±  4%     +13.6%   16995785 ±  4%  perf-stat.i.cache-misses
  49676581            +6.7%   53009444 ±  3%  perf-stat.i.cache-references
     43955 ±  4%     -12.3%      38549 ±  4%  perf-stat.i.cycles-between-cache-misses
      0.28 ±  4%     +13.4%       0.32 ±  4%  perf-stat.overall.MPKI
     29.84 ±  3%      +1.9       31.78 ±  2%  perf-stat.overall.cache-miss-rate%
     43445 ±  4%     -12.1%      38200 ±  4%  perf-stat.overall.cycles-between-cache-misses
  19005976            -5.4%   17972604 ±  2%  perf-stat.overall.path-length
  14869677 ±  4%     +13.6%   16898438 ±  4%  perf-stat.ps.cache-misses
  49821402            +6.7%   53168235 ±  3%  perf-stat.ps.cache-references
     49.42            -0.1       49.34        perf-profile.calltrace.cycles-pp.alloc_fd.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.40            -0.1       49.32        perf-profile.calltrace.cycles-pp.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
     49.25            -0.1       49.18        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.file_close_fd.__x64_sys_close.do_syscall_64
     49.20            -0.1       49.13        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_fd.do_sys_openat2.__x64_sys_openat
     49.33            -0.1       49.26        perf-profile.calltrace.cycles-pp._raw_spin_lock.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.28            -0.1       49.22        perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_fd.do_sys_openat2.__x64_sys_openat.do_syscall_64
     50.14            +0.0       50.18        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
     50.17            +0.0       50.21        perf-profile.calltrace.cycles-pp.open64
      0.64 ±  5%      +0.1        0.75 ±  6%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62 ±  5%      +0.1        0.74 ±  6%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
     49.42            -0.1       49.34        perf-profile.children.cycles-pp.alloc_fd
     49.40            -0.1       49.32        perf-profile.children.cycles-pp.file_close_fd
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.file_close_fd_locked
      0.15 ±  5%      +0.0        0.17 ±  4%  perf-profile.children.cycles-pp.init_file
      0.22 ±  3%      +0.0        0.25 ±  3%  perf-profile.children.cycles-pp.alloc_empty_file
      0.18 ±  6%      +0.0        0.22 ±  6%  perf-profile.children.cycles-pp.__fput
     50.14            +0.0       50.18        perf-profile.children.cycles-pp.__x64_sys_openat
     50.18            +0.0       50.22        perf-profile.children.cycles-pp.open64
      0.18 ± 14%      +0.0        0.23 ±  7%  perf-profile.children.cycles-pp.do_dentry_open
      0.30 ±  8%      +0.1        0.36 ±  8%  perf-profile.children.cycles-pp.do_open
      0.64 ±  5%      +0.1        0.75 ±  6%  perf-profile.children.cycles-pp.do_filp_open
      0.63 ±  5%      +0.1        0.75 ±  6%  perf-profile.children.cycles-pp.path_openat
      0.06            -0.0        0.05        perf-profile.self.cycles-pp.file_close_fd_locked
      0.16 ±  2%      +0.0        0.18 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock
      0.08 ± 12%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.__fput
      0.05 ±  7%      +0.1        0.10 ±  4%  perf-profile.self.cycles-pp.alloc_fd
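
As a reading aid, the %change column can be re-derived from the two value columns. The sketch below assumes, based on the figures in the table above, that most rows report a relative change in percent, while rows that are already percentages (cache-miss-rate%, perf-profile cycles) report an absolute difference in percentage points. The helper names are hypothetical.

```python
def pct_change(base, patched):
    """Relative change in percent, rounded to one decimal place."""
    return round((patched - base) / base * 100.0, 1)


def point_change(base, patched):
    """Absolute difference, used for rows that are already percentages."""
    return round(patched - base, 1)


# (base, patched) pairs copied from the table above.
relative_rows = {
    "will-it-scale.workload": (848151, 901119),
    "will-it-scale.per_thread_ops": (3785, 4022),
    "perf-stat.i.cache-misses": (14955855, 16995785),
    "perf-stat.i.cache-references": (49676581, 53009444),
    "perf-stat.i.cycles-between-cache-misses": (43955, 38549),
    "perf-stat.overall.path-length": (19005976, 17972604),
}
point_rows = {
    "perf-stat.i.cache-miss-rate%": (31.31, 33.28),
    "perf-stat.overall.cache-miss-rate%": (29.84, 31.78),
}

for name, (base, patched) in relative_rows.items():
    print(f"{pct_change(base, patched):+6.1f}%  {name}")
for name, (base, patched) in point_rows.items():
    print(f"{point_change(base, patched):+6.1f}   {name}")
```

Rows whose displayed values are heavily rounded do not reproduce exactly this way. For example, MPKI shows 0.28 and 0.32, which gives about +14.3% when computed from the displayed values, while the report lists +13.3%, presumably because it was computed from unrounded data.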




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

