Message-ID: <202601071316.992a1d32-lkp@intel.com>
Date: Wed, 7 Jan 2026 14:39:06 +0800
From: kernel test robot <oliver.sang@...el.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
	kernel test robot <oliver.sang@...el.com>,
	Andrii Nakryiko <andrii@...nel.org>,
	Alexei Starovoitov <ast@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	<rcu@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: [paulmckrcu:dev.2025.12.16a] [rcu] 1ac50ec628: stress-ng.memfd.ops_per_sec 3.4% improvement


Hi Paul E. McKenney,

Similar to the b41f5a411f report we just sent out, this report is based on
stable data. Please let us know if this kind of report is not useful. Thanks.


Hello,

kernel test robot noticed a 3.4% improvement of stress-ng.memfd.ops_per_sec on:


commit: 1ac50ec62874025381a864f784583dbdc30dcc7c ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
https://github.com/paulmckrcu/linux dev.2025.12.16a
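
For context on what this workload exercises: the memfd stressor repeatedly
creates, maps, touches, and closes memfd files, which is why memfd_create(),
shmem_fault(), and the inode eviction path dominate the perf profile below.
A minimal userspace sketch of that cycle (an illustration only, not the
stress-ng source; the loop count and 4 KiB size are arbitrary):

#define _GNU_SOURCE
#include <sys/mman.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	for (int i = 0; i < 100000; i++) {
		/* Creates a shmem-backed file: __shmem_file_setup() in the profile. */
		int fd = memfd_create("stress", MFD_CLOEXEC);

		if (fd < 0)
			return 1;
		if (ftruncate(fd, 4096) < 0)
			return 1;

		char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);

		if (p == MAP_FAILED)
			return 1;
		/* First touch faults the page in via the shmem_fault() path. */
		memset(p, 0, 4096);
		munmap(p, 4096);
		/* Last close evicts the inode: the __fput()/evict() path. */
		close(fd);
	}
	return 0;
}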

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads, 2 sockets, Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: memfd
	cpufreq_governor: performance
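
The lkp harness drives stress-ng with these parameters; assuming stress-ng's
standard option names, a roughly equivalent standalone invocation would be
(the performance cpufreq governor is set separately via sysfs):

	stress-ng --memfd $(nproc) --timeout 60s --metrics-brief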



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260107/202601071316.992a1d32-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp2/memfd/stress-ng/60s

commit: 
  43c23963b3 ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast")
  1ac50ec628 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")

43c23963b3c549da 1ac50ec62874025381a864f7845 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    203232 ± 22%     -21.8%     158902 ± 18%  numa-meminfo.node1.Mapped
     50977 ± 22%     -21.8%      39853 ± 18%  numa-vmstat.node1.nr_mapped
      7039            +1.6%       7152        vmstat.system.cs
    107613            -3.9%     103453        stress-ng.memfd.nanosecs_per_memfd_create_call
    193537            +3.4%     200175        stress-ng.memfd.ops
      3226            +3.4%       3337        stress-ng.memfd.ops_per_sec
    187908            +1.6%     190921        stress-ng.time.involuntary_context_switches
  99134672            +3.4%  1.025e+08        stress-ng.time.minor_page_faults
     61965 ±  3%      -6.2%      58116        proc-vmstat.nr_mapped
 1.526e+08            +3.4%  1.578e+08        proc-vmstat.numa_hit
 1.524e+08            +3.4%  1.576e+08        proc-vmstat.numa_local
 1.631e+08            +3.4%  1.687e+08        proc-vmstat.pgalloc_normal
  99574028            +3.4%   1.03e+08        proc-vmstat.pgfault
 1.624e+08            +3.4%  1.679e+08        proc-vmstat.pgfree
      2.26            +1.1%       2.28        perf-stat.i.MPKI
 1.646e+10            +1.8%  1.675e+10        perf-stat.i.branch-instructions
      0.24            +0.0        0.25        perf-stat.i.branch-miss-rate%
  38660390            +6.8%   41301381        perf-stat.i.branch-misses
 1.714e+08            +3.1%  1.768e+08        perf-stat.i.cache-misses
 2.884e+08            +3.3%  2.979e+08        perf-stat.i.cache-references
      6758            +1.3%       6846        perf-stat.i.context-switches
      7.88            -2.0%       7.73        perf-stat.i.cpi
      3504            -3.1%       3397        perf-stat.i.cycles-between-cache-misses
 7.628e+10            +2.0%  7.783e+10        perf-stat.i.instructions
      0.13            +2.0%       0.13        perf-stat.i.ipc
     17.01            +3.5%      17.59        perf-stat.i.metric.K/sec
   1632582            +3.5%    1689088        perf-stat.i.minor-faults
   1632582            +3.5%    1689088        perf-stat.i.page-faults
      2.25            +1.1%       2.27        perf-stat.overall.MPKI
      0.23            +0.0        0.25        perf-stat.overall.branch-miss-rate%
      7.91            -2.0%       7.76        perf-stat.overall.cpi
      3519            -3.1%       3411        perf-stat.overall.cycles-between-cache-misses
      0.13            +2.0%       0.13        perf-stat.overall.ipc
 1.619e+10            +1.8%  1.648e+10        perf-stat.ps.branch-instructions
  37927942            +6.8%   40525278        perf-stat.ps.branch-misses
 1.687e+08            +3.1%   1.74e+08        perf-stat.ps.cache-misses
 2.841e+08            +3.3%  2.933e+08        perf-stat.ps.cache-references
      6638            +1.4%       6729        perf-stat.ps.context-switches
 7.503e+10            +2.0%  7.654e+10        perf-stat.ps.instructions
   1606108            +3.5%    1661536        perf-stat.ps.minor-faults
   1606108            +3.5%    1661536        perf-stat.ps.page-faults
 4.564e+12            +1.8%  4.647e+12        perf-stat.total.instructions
     46.05            -0.3       45.79        perf-profile.calltrace.cycles-pp._raw_spin_lock.inode_sb_list_add.new_inode.__shmem_get_inode.__shmem_file_setup
     45.93            -0.3       45.68        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.inode_sb_list_add.new_inode.__shmem_get_inode
     46.36            -0.3       46.10        perf-profile.calltrace.cycles-pp.__shmem_get_inode.__shmem_file_setup.__x64_sys_memfd_create.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.12            -0.3       45.87        perf-profile.calltrace.cycles-pp.inode_sb_list_add.new_inode.__shmem_get_inode.__shmem_file_setup.__x64_sys_memfd_create
     46.26            -0.3       46.01        perf-profile.calltrace.cycles-pp.new_inode.__shmem_get_inode.__shmem_file_setup.__x64_sys_memfd_create.do_syscall_64
     46.57            -0.2       46.32        perf-profile.calltrace.cycles-pp.__shmem_file_setup.__x64_sys_memfd_create.do_syscall_64.entry_SYSCALL_64_after_hwframe.memfd_create
     46.62            -0.2       46.38        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.memfd_create
     46.61            -0.2       46.37        perf-profile.calltrace.cycles-pp.__x64_sys_memfd_create.do_syscall_64.entry_SYSCALL_64_after_hwframe.memfd_create
     46.64            -0.2       46.40        perf-profile.calltrace.cycles-pp.memfd_create
     46.62            -0.2       46.37        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.memfd_create
     45.57            -0.2       45.38        perf-profile.calltrace.cycles-pp._raw_spin_lock.evict.__dentry_kill.finish_dput.__fput
     45.40            -0.2       45.22        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.evict.__dentry_kill.finish_dput
     46.69            -0.2       46.53        perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.61            -0.2       46.45        perf-profile.calltrace.cycles-pp.finish_dput.__fput.task_work_run.exit_to_user_mode_loop.do_syscall_64
     46.73            -0.2       46.57        perf-profile.calltrace.cycles-pp.close_range
     46.71            -0.2       46.55        perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.close_range
     46.71            -0.2       46.55        perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe.close_range
     46.73            -0.2       46.57        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.close_range
     46.73            -0.2       46.57        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.close_range
     46.60            -0.2       46.45        perf-profile.calltrace.cycles-pp.__dentry_kill.finish_dput.__fput.task_work_run.exit_to_user_mode_loop
     46.40            -0.2       46.24        perf-profile.calltrace.cycles-pp.evict.__dentry_kill.finish_dput.__fput.task_work_run
      0.59            +0.0        0.60        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      0.56            +0.0        0.57        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      0.62            +0.0        0.64        perf-profile.calltrace.cycles-pp.__munmap
      0.59            +0.0        0.60        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      0.57            +0.0        0.59        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.59            +0.0        0.61        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
      0.59            +0.0        0.61        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      0.98            +0.0        1.01        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.94            +0.0        0.97        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      0.94            +0.0        0.97        perf-profile.calltrace.cycles-pp.do_shared_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      1.01            +0.0        1.04        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.stress_memfd_child
      0.83            +0.0        0.86        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_shared_fault.do_fault
      0.84            +0.0        0.87        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_shared_fault.do_fault.__handle_mm_fault
      0.84            +0.0        0.88        perf-profile.calltrace.cycles-pp.__do_fault.do_shared_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.08            +0.0        1.12        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.stress_memfd_child
      1.09            +0.0        1.14        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.stress_memfd_child
      1.25            +0.1        1.30        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.stress_memfd_child
      1.26            +0.1        1.32        perf-profile.calltrace.cycles-pp.stress_memfd_child
      0.81 ±  2%      +0.1        0.91 ±  2%  perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
      0.80 ±  3%      +0.1        0.90 ±  2%  perf-profile.calltrace.cycles-pp.__mmap_region.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
      0.99 ±  2%      +0.1        1.11        perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.00 ±  2%      +0.1        1.13        perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
      1.03 ±  2%      +0.1        1.16        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
      1.02 ±  2%      +0.1        1.15        perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
      1.03 ±  2%      +0.1        1.16        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
      1.21            +0.1        1.35        perf-profile.calltrace.cycles-pp.__mmap
     92.36            -0.4       91.92        perf-profile.children.cycles-pp._raw_spin_lock
     92.52            -0.3       92.17        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     46.12            -0.3       45.87        perf-profile.children.cycles-pp.inode_sb_list_add
     46.26            -0.3       46.01        perf-profile.children.cycles-pp.new_inode
     46.36            -0.3       46.10        perf-profile.children.cycles-pp.__shmem_get_inode
     46.57            -0.2       46.32        perf-profile.children.cycles-pp.__shmem_file_setup
     46.61            -0.2       46.37        perf-profile.children.cycles-pp.__x64_sys_memfd_create
     46.65            -0.2       46.41        perf-profile.children.cycles-pp.memfd_create
     47.18            -0.2       47.01        perf-profile.children.cycles-pp.__fput
     47.10            -0.2       46.94        perf-profile.children.cycles-pp.finish_dput
     46.73            -0.2       46.57        perf-profile.children.cycles-pp.close_range
     47.09            -0.2       46.93        perf-profile.children.cycles-pp.__dentry_kill
     46.88            -0.2       46.72        perf-profile.children.cycles-pp.evict
     46.71            -0.2       46.55        perf-profile.children.cycles-pp.exit_to_user_mode_loop
     46.71            -0.2       46.55        perf-profile.children.cycles-pp.task_work_run
     97.60            -0.1       97.51        perf-profile.children.cycles-pp.do_syscall_64
     97.61            -0.1       97.53        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.14            +0.0        0.15        perf-profile.children.cycles-pp.xas_create
      0.11            +0.0        0.12        perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
      0.11            +0.0        0.12        perf-profile.children.cycles-pp.folio_alloc_mpol_noprof
      0.07            +0.0        0.08        perf-profile.children.cycles-pp.mas_rev_awalk
      0.12            +0.0        0.13        perf-profile.children.cycles-pp.alloc_pages_mpol
      0.12            +0.0        0.13        perf-profile.children.cycles-pp.native_flush_tlb_one_user
      0.29            +0.0        0.30        perf-profile.children.cycles-pp.kmem_cache_free
      0.09 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.xas_expand
      0.17 ±  2%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.kmem_cache_alloc_lru_noprof
      0.22            +0.0        0.24 ±  2%  perf-profile.children.cycles-pp.xas_store
      0.13 ±  3%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.flush_tlb_func
      0.47            +0.0        0.49        perf-profile.children.cycles-pp.kthread
      0.47            +0.0        0.49        perf-profile.children.cycles-pp.ret_from_fork
      0.47            +0.0        0.49        perf-profile.children.cycles-pp.ret_from_fork_asm
      0.23            +0.0        0.25        perf-profile.children.cycles-pp.shmem_add_to_page_cache
      0.12 ±  4%      +0.0        0.14        perf-profile.children.cycles-pp.shmem_alloc_folio
      0.46            +0.0        0.48        perf-profile.children.cycles-pp.run_ksoftirqd
      0.14 ±  3%      +0.0        0.16        perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
      0.13 ±  3%      +0.0        0.15        perf-profile.children.cycles-pp.vm_unmapped_area
      0.15 ±  3%      +0.0        0.17        perf-profile.children.cycles-pp.flush_tlb_mm_range
      0.08 ±  5%      +0.0        0.10        perf-profile.children.cycles-pp.mas_empty_area_rev
      0.59            +0.0        0.60        perf-profile.children.cycles-pp.__vm_munmap
      0.53            +0.0        0.55        perf-profile.children.cycles-pp.handle_softirqs
      0.52            +0.0        0.54        perf-profile.children.cycles-pp.rcu_do_batch
      0.56            +0.0        0.57        perf-profile.children.cycles-pp.do_vmi_align_munmap
      0.59            +0.0        0.61        perf-profile.children.cycles-pp.__x64_sys_munmap
      0.52            +0.0        0.54        perf-profile.children.cycles-pp.rcu_core
      0.31            +0.0        0.33        perf-profile.children.cycles-pp.unmap_page_range
      0.13 ±  2%      +0.0        0.15        perf-profile.children.cycles-pp.unmapped_area_topdown
      0.15 ±  2%      +0.0        0.17        perf-profile.children.cycles-pp.__get_unmapped_area
      0.28            +0.0        0.30        perf-profile.children.cycles-pp.zap_pte_range
      0.63            +0.0        0.65        perf-profile.children.cycles-pp.__munmap
      0.57            +0.0        0.59        perf-profile.children.cycles-pp.do_vmi_munmap
      0.15 ±  2%      +0.0        0.17        perf-profile.children.cycles-pp.shmem_get_unmapped_area
      0.21            +0.0        0.23        perf-profile.children.cycles-pp.zap_page_range_single
      0.36            +0.0        0.38        perf-profile.children.cycles-pp.__mmap_new_vma
      0.29            +0.0        0.31        perf-profile.children.cycles-pp.zap_pmd_range
      0.19            +0.0        0.22 ±  2%  perf-profile.children.cycles-pp.zap_page_range_single_batched
      0.23            +0.0        0.26        perf-profile.children.cycles-pp.unmap_mapping_range
      0.05            +0.0        0.08        perf-profile.children.cycles-pp.perf_iterate_sb
      0.51            +0.0        0.54        perf-profile.children.cycles-pp.shmem_alloc_and_add_folio
      0.98            +0.0        1.02        perf-profile.children.cycles-pp.__handle_mm_fault
      0.94            +0.0        0.97        perf-profile.children.cycles-pp.do_fault
      0.94            +0.0        0.97        perf-profile.children.cycles-pp.do_shared_fault
      1.01            +0.0        1.04        perf-profile.children.cycles-pp.handle_mm_fault
      0.84            +0.0        0.87        perf-profile.children.cycles-pp.shmem_fault
      1.09            +0.0        1.12        perf-profile.children.cycles-pp.do_user_addr_fault
      0.84            +0.0        0.88        perf-profile.children.cycles-pp.__do_fault
      1.09            +0.0        1.14        perf-profile.children.cycles-pp.exc_page_fault
      0.14 ±  3%      +0.0        0.18        perf-profile.children.cycles-pp.perf_event_mmap
      0.13 ±  3%      +0.0        0.17        perf-profile.children.cycles-pp.perf_event_mmap_event
      1.02            +0.0        1.06        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.fault_dirty_shared_page
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.perf_event_mmap_output
      1.40            +0.1        1.46        perf-profile.children.cycles-pp.asm_exc_page_fault
      1.53            +0.1        1.60        perf-profile.children.cycles-pp.stress_memfd_child
      0.81 ±  2%      +0.1        0.91 ±  2%  perf-profile.children.cycles-pp.mmap_region
      0.80 ±  2%      +0.1        0.90 ±  2%  perf-profile.children.cycles-pp.__mmap_region
      0.99 ±  2%      +0.1        1.11        perf-profile.children.cycles-pp.do_mmap
      1.00 ±  2%      +0.1        1.13        perf-profile.children.cycles-pp.vm_mmap_pgoff
      1.02 ±  2%      +0.1        1.15        perf-profile.children.cycles-pp.ksys_mmap_pgoff
      1.22            +0.1        1.36        perf-profile.children.cycles-pp.__mmap
     92.16            -0.4       91.80        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.mas_rev_awalk
      0.12            +0.0        0.13        perf-profile.self.cycles-pp.native_flush_tlb_one_user




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

