Message-ID: <202503311302.a2bb29e1-lkp@intel.com>
Date: Mon, 31 Mar 2025 14:00:31 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Dave Hansen <dave.hansen@...ux.intel.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, Ted Ts'o <tytso@....edu>, "Matthew
 Wilcox" <willy@...radead.org>, Mateusz Guzik <mjguzik@...il.com>, Dave
 Chinner <david@...morbit.com>, <linux-fsdevel@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linus:master] [filemap]  665575cff0:  will-it-scale.per_thread_ops
 3.6% improvement



Hello,

kernel test robot noticed a 3.6% improvement of will-it-scale.per_thread_ops on:


commit: 665575cff098b696995ddaddf4646a4099941f5e ("filemap: move prefaulting out of hot write path")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_task: 100%
	mode: thread
	test: writeseek1
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following test:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.throughput 4.6% improvement                                          |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | nr_task=100%                                                                              |
|                  | runtime=300s                                                                              |
|                  | test=fsbuffer-w                                                                           |
+------------------+-------------------------------------------------------------------------------------------+




Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250331/202503311302.a2bb29e1-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/writeseek1/will-it-scale

commit: 
  654b33ada4 ("proc: fix UAF in proc_get_inode()")
  665575cff0 ("filemap: move prefaulting out of hot write path")

654b33ada4ab5e92 665575cff098b696995ddaddf46 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.171e+08 ± 11%     +30.4%  1.526e+08 ± 18%  cpuidle..time
     96.67 ± 15%     -38.6%      59.33 ± 15%  perf-c2c.HITM.local
     91.33 ± 22%     -37.8%      56.83 ± 18%  perf-c2c.HITM.remote
  77338762            +3.6%   80146917        will-it-scale.64.threads
   1208417            +3.6%    1252295        will-it-scale.per_thread_ops
  77338762            +3.6%   80146917        will-it-scale.workload
      0.02 ±  3%      +0.0        0.03 ± 19%  perf-stat.i.branch-miss-rate%
   9721738 ±  4%      +8.9%   10586240 ±  5%  perf-stat.i.branch-misses
      0.02 ±  3%      +0.0        0.02 ±  5%  perf-stat.overall.branch-miss-rate%
    683007            -3.3%     660149        perf-stat.overall.path-length
   9685250 ±  4%      +8.9%   10545947 ±  5%  perf-stat.ps.branch-misses
     31.54            -2.4       29.18        perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
     40.31            -1.9       38.39        perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.03            -1.9       44.17        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     53.97            -1.7       52.30        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     58.43            -1.4       56.98        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     59.33            -1.4       57.92        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     74.17            -0.8       73.38        perf-profile.calltrace.cycles-pp.write
      0.55            +0.0        0.57        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.97            +0.0        1.01        perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write
      0.57 ±  3%      +0.0        0.60        perf-profile.calltrace.cycles-pp.file_remove_privs_flags.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
      1.01            +0.0        1.04        perf-profile.calltrace.cycles-pp.fput.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      1.08            +0.0        1.12        perf-profile.calltrace.cycles-pp.up_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
      0.54            +0.0        0.59        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.llseek
      0.99            +0.0        1.04        perf-profile.calltrace.cycles-pp.mutex_unlock.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      1.60            +0.1        1.67        perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
      0.68 ±  2%      +0.1        0.75 ±  3%  perf-profile.calltrace.cycles-pp.ktime_get_coarse_real_ts64_mg.current_time.inode_needs_update_time.file_update_time.shmem_file_write_iter
      1.61            +0.1        1.69        perf-profile.calltrace.cycles-pp.mutex_lock.fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.94            +0.1        1.02        perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
      3.19            +0.1        3.27        perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.87            +0.1        0.96        perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.23            +0.1        2.32        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      2.21            +0.1        2.31        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      1.57            +0.2        1.72        perf-profile.calltrace.cycles-pp.current_time.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write
      4.90            +0.2        5.08        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
      0.60 ±  3%      +0.2        0.80 ±  5%  perf-profile.calltrace.cycles-pp.balance_dirty_pages_ratelimited_flags.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      6.44            +0.2        6.66        perf-profile.calltrace.cycles-pp.clear_bhb_loop.llseek
      2.30            +0.2        2.52        perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.ksys_write
      2.24            +0.2        2.48        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
      6.06            +0.2        6.30        perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
      2.82            +0.3        3.09        perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64
      4.78            +0.3        5.06        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek
      5.77            +0.5        6.26        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write
      0.00            +0.5        0.52        perf-profile.calltrace.cycles-pp.__cond_resched.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.5        0.52        perf-profile.calltrace.cycles-pp.shmem_file_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.00            +0.5        0.54 ±  2%  perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write
      6.65            +0.5        7.20        perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     14.05            +0.8       14.81        perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     28.53            +0.9       29.47        perf-profile.calltrace.cycles-pp.llseek
     32.10            -2.4       29.69        perf-profile.children.cycles-pp.generic_perform_write
     40.86            -1.9       38.96        perf-profile.children.cycles-pp.shmem_file_write_iter
     46.47            -1.8       44.64        perf-profile.children.cycles-pp.vfs_write
     54.38            -1.6       52.74        perf-profile.children.cycles-pp.ksys_write
     72.02            -1.1       70.92        perf-profile.children.cycles-pp.do_syscall_64
     73.76            -1.1       72.66        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     74.63            -0.7       73.89        perf-profile.children.cycles-pp.write
      0.59 ±  2%      -0.0        0.54        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.20            +0.0        0.21        perf-profile.children.cycles-pp.file_remove_privs
      0.33            +0.0        0.35        perf-profile.children.cycles-pp.__f_unlock_pos
      0.53            +0.0        0.54        perf-profile.children.cycles-pp.generic_file_llseek_size
      0.89            +0.0        0.92        perf-profile.children.cycles-pp.testcase
      2.29            +0.0        2.32        perf-profile.children.cycles-pp.fput
      0.68 ±  2%      +0.0        0.71        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.38            +0.0        0.42 ±  2%  perf-profile.children.cycles-pp.security_file_permission
      0.42            +0.0        0.46 ±  2%  perf-profile.children.cycles-pp.write@plt
      1.17            +0.0        1.21        perf-profile.children.cycles-pp.x64_sys_call
      1.04            +0.0        1.08        perf-profile.children.cycles-pp.folio_unlock
      1.14            +0.0        1.19        perf-profile.children.cycles-pp.up_write
      0.63 ±  2%      +0.0        0.67        perf-profile.children.cycles-pp.file_remove_privs_flags
      0.53 ±  2%      +0.0        0.58        perf-profile.children.cycles-pp.shmem_file_llseek
      1.59            +0.1        1.65        perf-profile.children.cycles-pp.rcu_all_qs
      2.19            +0.1        2.26        perf-profile.children.cycles-pp.mutex_unlock
      0.75            +0.1        0.82 ±  3%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64_mg
      0.22 ±  3%      +0.1        0.29 ±  2%  perf-profile.children.cycles-pp.inode_to_bdi
      0.44 ±  2%      +0.1        0.52 ±  2%  perf-profile.children.cycles-pp.xas_start
      1.73            +0.1        1.80        perf-profile.children.cycles-pp.down_write
      3.40            +0.1        3.48        perf-profile.children.cycles-pp.shmem_write_end
      1.00            +0.1        1.08        perf-profile.children.cycles-pp.folio_mark_accessed
      3.53            +0.1        3.62        perf-profile.children.cycles-pp.mutex_lock
      0.65            +0.1        0.75        perf-profile.children.cycles-pp.xas_load
      1.00            +0.1        1.10        perf-profile.children.cycles-pp.rw_verify_area
      1.48 ±  3%      +0.1        1.59 ±  2%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      3.58            +0.2        3.74        perf-profile.children.cycles-pp.__cond_resched
      1.71            +0.2        1.86        perf-profile.children.cycles-pp.current_time
      4.70            +0.2        4.90        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      4.36            +0.2        4.57        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.74 ±  2%      +0.2        0.95 ±  4%  perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited_flags
      2.43            +0.2        2.66        perf-profile.children.cycles-pp.inode_needs_update_time
      2.38            +0.2        2.62        perf-profile.children.cycles-pp.filemap_get_entry
      5.53            +0.3        5.78        perf-profile.children.cycles-pp.entry_SYSCALL_64
      2.96            +0.3        3.23        perf-profile.children.cycles-pp.file_update_time
     12.62            +0.5       13.10        perf-profile.children.cycles-pp.clear_bhb_loop
      6.04            +0.5        6.54        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      6.79            +0.6        7.35        perf-profile.children.cycles-pp.shmem_write_begin
     14.14            +0.8       14.90        perf-profile.children.cycles-pp.copy_page_from_iter_atomic
     28.74            +0.9       29.66        perf-profile.children.cycles-pp.llseek
      0.58 ±  3%      -0.0        0.54        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.13            +0.0        0.14        perf-profile.self.cycles-pp.__f_unlock_pos
      0.47            +0.0        0.48        perf-profile.self.cycles-pp.generic_file_llseek_size
      0.36            +0.0        0.38        perf-profile.self.cycles-pp.folio_mark_dirty
      0.27 ±  2%      +0.0        0.29        perf-profile.self.cycles-pp.xas_load
      1.56            +0.0        1.58        perf-profile.self.cycles-pp.shmem_write_end
      1.14            +0.0        1.17        perf-profile.self.cycles-pp.down_write
      0.31            +0.0        0.34 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.54 ±  3%      +0.0        0.58        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.02            +0.0        1.06        perf-profile.self.cycles-pp.x64_sys_call
      0.56 ±  2%      +0.0        0.60        perf-profile.self.cycles-pp.file_remove_privs_flags
      0.52            +0.0        0.56 ±  2%  perf-profile.self.cycles-pp.file_update_time
      0.41 ±  2%      +0.0        0.45        perf-profile.self.cycles-pp.shmem_file_llseek
      0.96            +0.0        1.00        perf-profile.self.cycles-pp.folio_unlock
      1.07            +0.0        1.12        perf-profile.self.cycles-pp.up_write
      1.36            +0.0        1.41        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.80            +0.0        0.85        perf-profile.self.cycles-pp.ksys_lseek
      0.86 ±  2%      +0.0        0.91        perf-profile.self.cycles-pp.generic_write_checks
      1.20            +0.0        1.24        perf-profile.self.cycles-pp.rcu_all_qs
      0.75            +0.1        0.80 ±  2%  perf-profile.self.cycles-pp.shmem_write_begin
      0.16 ±  4%      +0.1        0.21 ±  4%  perf-profile.self.cycles-pp.inode_to_bdi
      2.05            +0.1        2.11        perf-profile.self.cycles-pp.mutex_unlock
      0.62 ±  2%      +0.1        0.69        perf-profile.self.cycles-pp.rw_verify_area
      0.69 ±  2%      +0.1        0.75 ±  3%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64_mg
      0.72            +0.1        0.79        perf-profile.self.cycles-pp.inode_needs_update_time
      0.31 ±  2%      +0.1        0.38 ±  3%  perf-profile.self.cycles-pp.xas_start
      0.93            +0.1        1.01        perf-profile.self.cycles-pp.folio_mark_accessed
      1.98            +0.1        2.07        perf-profile.self.cycles-pp.__cond_resched
      2.34            +0.1        2.43        perf-profile.self.cycles-pp.llseek
      0.94            +0.1        1.04        perf-profile.self.cycles-pp.current_time
      1.48 ±  3%      +0.1        1.59 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      2.50            +0.1        2.62        perf-profile.self.cycles-pp.do_syscall_64
      0.52 ±  4%      +0.1        0.66 ±  7%  perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags
      1.72            +0.1        1.87        perf-profile.self.cycles-pp.filemap_get_entry
      4.03            +0.2        4.19        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      2.97            +0.2        3.15        perf-profile.self.cycles-pp.write
      2.14            +0.2        2.33        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      4.22            +0.2        4.42        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     12.49            +0.5       12.96        perf-profile.self.cycles-pp.clear_bhb_loop
     13.95            +0.7       14.70        perf-profile.self.cycles-pp.copy_page_from_iter_atomic
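
The %change column above mixes two kinds of values: rows shown with a "%" sign are relative changes against the base commit, while rows without one (the perf-profile and miss-rate rows) read as absolute differences in percentage points. The sketch below is only an illustration of that reading, not part of the LKP tooling; the helper names pct_change and point_delta are made up for this example, and the numbers are copied from the table above.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of `patched` against `base`, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base_pct: float, patched_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return patched_pct - base_pct


# Values copied from the will-it-scale comparison table
# (base = 654b33ada4, patched = 665575cff0).
per_thread_ops = pct_change(1208417, 1252295)   # reported as +3.6%
path_length = pct_change(683007, 660149)        # reported as -3.3%
perform_write = point_delta(31.54, 29.18)       # reported as -2.4 (children.generic_perform_write is 32.10 -> 29.69)

print(f"will-it-scale.per_thread_ops: {per_thread_ops:+.1f}%")
print(f"perf-stat.overall.path-length: {path_length:+.1f}%")
print(f"calltrace generic_perform_write: {perform_write:+.1f} points")
```

Read this way, the drop in generic_perform_write's share of cycles is what the other rows are measured against: the small increases elsewhere in the profile are shares of a slightly smaller total, not necessarily slower code.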


***************************************************************************************************
lkp-icl-2sp9: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp9/fsbuffer-w/unixbench

commit: 
  654b33ada4 ("proc: fix UAF in proc_get_inode()")
  665575cff0 ("filemap: move prefaulting out of hot write path")

654b33ada4ab5e92 665575cff098b696995ddaddf46 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  32471117            +4.6%   33974569        unixbench.throughput
      1819            +4.0%       1892        unixbench.time.user_time
 1.201e+10            +4.8%  1.259e+10        unixbench.workload
      0.33 ±  2%      +3.1%       0.34        perf-stat.i.MPKI
 4.577e+10            +1.4%   4.64e+10        perf-stat.i.branch-instructions
      0.02            -0.0        0.02        perf-stat.overall.branch-miss-rate%
      6053            -4.0%       5808        perf-stat.overall.path-length
 4.566e+10            +1.4%  4.629e+10        perf-stat.ps.branch-instructions





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

