Message-ID: <202501241646.81b10e21-lkp@intel.com>
Date: Fri, 24 Jan 2025 16:41:28 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<linux-fsdevel@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [close_files()]  1fa4ffd8e6:
 stress-ng.fd-fork.ops_per_sec 6.2% improvement



Hello,

the kernel test robot noticed a 6.2% improvement in stress-ng.fd-fork.ops_per_sec on:


commit: 1fa4ffd8e6f6d001da27f00382af79bad0336091 ("close_files(): don't bother with xchg()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: fd-fork
	cpufreq_governor: performance
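The parameters above map roughly onto a plain stress-ng invocation. The 0-day harness drives the run differently (see the reproduce materials linked below), but assuming a stress-ng build that includes the fd-fork stressor, an approximate equivalent would be:

```shell
# Approximate stand-alone equivalent of the job parameters above;
# not the exact 0-day/lkp harness invocation.
#   nr_threads: 100%  -> 0 workers = one per online CPU
#   testtime: 60s     -> --timeout 60s
stress-ng --fd-fork 0 --timeout 60s --metrics-brief
```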






Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250124/202501241646.81b10e21-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fd-fork/stress-ng/60s

commit: 
  be5498cac2 ("remove pointless includes of <linux/fdtable.h>")
  1fa4ffd8e6 ("close_files(): don't bother with xchg()")

be5498cac2ddb112 1fa4ffd8e6f6d001da27f00382a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     38705 ±  5%     +11.1%      42989 ±  2%  sched_debug.cpu.curr->pid.avg
     96837            +6.2%     102865        stress-ng.fd-fork.ops
      1611            +6.2%       1711        stress-ng.fd-fork.ops_per_sec
     10.10            -6.7%       9.42        stress-ng.fd-fork.seconds_to_open_all_file_descriptors
    131663            +5.2%     138573        stress-ng.time.voluntary_context_switches
   4224262 ±  3%      +5.5%    4458103        proc-vmstat.numa_hit
   4158770 ±  3%      +5.6%    4391868 ±  2%  proc-vmstat.numa_local
 1.002e+08            +6.8%   1.07e+08        proc-vmstat.pgalloc_normal
 1.001e+08            +6.8%  1.069e+08        proc-vmstat.pgfree
    200571           +14.3%     229179 ± 16%  proc-vmstat.pgreuse
      1.24 ± 15%     +39.0%       1.72 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
      2.36 ± 21%     -31.4%       1.62 ± 32%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      1.40 ± 17%     -34.7%       0.92 ± 30%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      8.65 ± 31%     +64.9%      14.26 ± 55%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      7.30 ± 70%     -70.6%       2.14 ± 97%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
    227.50 ±  2%     +13.1%     257.36 ± 13%  perf-sched.wait_and_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      7.35 ±  6%     -13.4%       6.36 ±  8%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     22270 ±  4%     +51.5%      33741        perf-sched.wait_and_delay.count.__cond_resched.__close_range.__x64_sys_close_range.do_syscall_64.entry_SYSCALL_64_after_hwframe
     55845 ±  2%     -22.5%      43303        perf-sched.wait_and_delay.count.__cond_resched.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
      1051 ±  2%      +8.6%       1141 ±  4%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    226.27 ±  2%     +13.2%     256.09 ± 13%  perf-sched.wait_time.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      4.72 ± 23%     +69.3%       7.99 ± 27%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
 2.431e+10            +7.6%  2.617e+10        perf-stat.i.branch-instructions
     12.91            +0.7       13.61        perf-stat.i.cache-miss-rate%
 1.033e+08            +9.5%  1.132e+08        perf-stat.i.cache-misses
  8.11e+08 ±  2%      +3.4%  8.387e+08        perf-stat.i.cache-references
      1.98            -6.4%       1.85        perf-stat.i.cpi
      2168            -8.0%       1993        perf-stat.i.cycles-between-cache-misses
 1.128e+11            +7.3%   1.21e+11        perf-stat.i.instructions
      0.51            +6.7%       0.54        perf-stat.i.ipc
     65995 ±  2%      +6.6%      70332 ±  3%  perf-stat.i.minor-faults
     65995 ±  2%      +6.6%      70332 ±  3%  perf-stat.i.page-faults
      0.91            +2.1%       0.93        perf-stat.overall.MPKI
     12.73            +0.7       13.48        perf-stat.overall.cache-miss-rate%
      1.99            -6.5%       1.86        perf-stat.overall.cpi
      2179            -8.4%       1996        perf-stat.overall.cycles-between-cache-misses
      0.50            +7.0%       0.54        perf-stat.overall.ipc
 2.391e+10            +7.6%  2.574e+10        perf-stat.ps.branch-instructions
 1.015e+08            +9.5%  1.111e+08        perf-stat.ps.cache-misses
 7.975e+08            +3.4%  8.247e+08        perf-stat.ps.cache-references
  1.11e+11            +7.3%  1.191e+11        perf-stat.ps.instructions
     64224 ±  2%      +6.7%      68500 ±  3%  perf-stat.ps.minor-faults
     64224 ±  2%      +6.7%      68501 ±  3%  perf-stat.ps.page-faults
  6.87e+12            +7.4%  7.379e+12        perf-stat.total.instructions
     29.66 ±  2%      -1.3       28.36 ±  2%  perf-profile.calltrace.cycles-pp.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
     30.58 ±  2%      -1.2       29.42        perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.58 ±  2%      -1.2       29.42        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
     30.58 ±  2%      -1.2       29.42        perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.58 ±  2%      -1.2       29.42        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.58 ±  2%      -1.2       29.43        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     30.58 ±  2%      -1.2       29.43        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.53            +0.1        0.60 ±  3%  perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
      0.54            +0.1        0.60 ±  3%  perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
      0.54            +0.1        0.60 ±  3%  perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.78            +0.1        0.87 ±  2%  perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
      0.76            +0.1        0.84 ±  2%  perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
      0.34 ± 70%      +0.3        0.65        perf-profile.calltrace.cycles-pp.rcu_all_qs.__cond_resched.put_files_struct.do_exit.do_group_exit
      1.18 ±  3%      +0.4        1.56        perf-profile.calltrace.cycles-pp.__cond_resched.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
     21.65            +0.8       22.48        perf-profile.calltrace.cycles-pp.dup_fd.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
     22.48            +0.9       23.40        perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.50            +0.9       23.43        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     22.50            +0.9       23.43        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
     22.50            +0.9       23.42        perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     22.50            +0.9       23.42        perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     22.52            +0.9       23.45        perf-profile.calltrace.cycles-pp._Fork
      1.47 ±  4%      +1.0        2.49        perf-profile.calltrace.cycles-pp.dnotify_flush.filp_flush.filp_close.put_files_struct.do_exit
      2.19 ±  4%      +1.2        3.38 ±  2%  perf-profile.calltrace.cycles-pp.locks_remove_posix.filp_flush.filp_close.put_files_struct.do_exit
      9.47 ±  2%      +2.7       12.14 ±  2%  perf-profile.calltrace.cycles-pp.fput.filp_close.put_files_struct.do_exit.do_group_exit
     22.10 ±  2%      +3.5       25.60 ±  2%  perf-profile.calltrace.cycles-pp.filp_close.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
     30.02 ±  2%      -1.2       28.79        perf-profile.children.cycles-pp.put_files_struct
     30.60 ±  2%      -1.2       29.44        perf-profile.children.cycles-pp.do_exit
     30.60 ±  2%      -1.2       29.44        perf-profile.children.cycles-pp.__x64_sys_exit_group
     30.60 ±  2%      -1.2       29.44        perf-profile.children.cycles-pp.do_group_exit
     30.60 ±  2%      -1.2       29.44        perf-profile.children.cycles-pp.x64_sys_call
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      0.11 ±  3%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.down_write
      0.13 ±  3%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.handle_mm_fault
      0.15 ±  3%      +0.0        0.16        perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
      0.14 ±  2%      +0.0        0.16 ±  3%  perf-profile.children.cycles-pp.kmem_cache_free
      0.16            +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.do_user_addr_fault
      0.16            +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.exc_page_fault
      0.25            +0.0        0.27 ±  2%  perf-profile.children.cycles-pp.anon_vma_clone
      0.24 ±  2%      +0.0        0.27        perf-profile.children.cycles-pp.free_pgtables
      0.31            +0.0        0.34 ±  2%  perf-profile.children.cycles-pp.anon_vma_fork
      0.13 ±  5%      +0.0        0.16 ±  8%  perf-profile.children.cycles-pp.copy_p4d_range
      0.54            +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.__mmput
      0.54            +0.1        0.61 ±  3%  perf-profile.children.cycles-pp.exit_mm
      0.53            +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.exit_mmap
      0.78            +0.1        0.87 ±  2%  perf-profile.children.cycles-pp.dup_mm
      0.76            +0.1        0.85 ±  2%  perf-profile.children.cycles-pp.dup_mmap
      1.53            +0.2        1.75        perf-profile.children.cycles-pp.rcu_all_qs
      3.50            +0.5        4.03        perf-profile.children.cycles-pp.__cond_resched
     21.65            +0.8       22.48        perf-profile.children.cycles-pp.dup_fd
     22.48            +0.9       23.40        perf-profile.children.cycles-pp.copy_process
     22.50            +0.9       23.42        perf-profile.children.cycles-pp.kernel_clone
     22.50            +0.9       23.42        perf-profile.children.cycles-pp.__do_sys_clone
     22.53            +0.9       23.46        perf-profile.children.cycles-pp._Fork
     33.30            +0.9       34.24        perf-profile.children.cycles-pp.filp_flush
      3.75            +1.0        4.78        perf-profile.children.cycles-pp.dnotify_flush
      5.04            +1.2        6.22        perf-profile.children.cycles-pp.locks_remove_posix
     21.42            +2.6       24.05        perf-profile.children.cycles-pp.fput
     56.02            +3.7       59.71        perf-profile.children.cycles-pp.filp_close
      6.16 ±  2%      -5.2        0.91        perf-profile.self.cycles-pp.put_files_struct
     24.86            -1.2       23.65        perf-profile.self.cycles-pp.filp_flush
      1.14            +0.2        1.31        perf-profile.self.cycles-pp.rcu_all_qs
      1.76            +0.2        1.94        perf-profile.self.cycles-pp.filp_close
      1.91            +0.3        2.20        perf-profile.self.cycles-pp.__cond_resched
     21.51            +0.8       22.34        perf-profile.self.cycles-pp.dup_fd
      3.30            +1.0        4.26        perf-profile.self.cycles-pp.dnotify_flush
      4.58            +1.1        5.72        perf-profile.self.cycles-pp.locks_remove_posix
     20.87            +2.6       23.46        perf-profile.self.cycles-pp.fput




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
