Message-ID: <202505160923.2556b729-lkp@intel.com>
Date: Fri, 16 May 2025 10:13:11 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	<oliver.sang@...el.com>
Subject: [tip:locking/futex] [futex]  cec199c5e3:
 will-it-scale.per_thread_ops 3.2% regression



Hello,

kernel test robot noticed a 3.2% regression of will-it-scale.per_thread_ops on:


commit: cec199c5e39bde7191a08087cc3d002ccfab31ff ("futex: Implement FUTEX2_NUMA")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/futex

[test failed on linux-next/master bdd609656ff5573db9ba1d26496a528bdd297cf2]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
parameters:

	nr_task: 100%
	mode: thread
	test: futex1
	cpufreq_governor: performance


In addition, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops  4.6% regression                   |
| test machine     | 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory  |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=process                                                                    |
|                  | nr_task=100%                                                                    |
|                  | test=futex4                                                                     |
+------------------+---------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops  3.4% regression                    |
| test machine     | 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                    |
|                  | mode=thread                                                                     |
|                  | nr_task=100%                                                                    |
|                  | test=futex2                                                                     |
+------------------+---------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202505160923.2556b729-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250516/202505160923.2556b729-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex1/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.079e+09 ± 58%     -45.0%  5.934e+08 ±  3%  cpuidle..time
      1.38 ± 53%      -0.7        0.64 ±  6%  mpstat.cpu.all.idle%
   4521392 ± 27%     -41.0%    2666361 ± 55%  numa-meminfo.node0.MemUsed
   3133949 ± 39%     +59.9%    5010749 ± 29%  numa-meminfo.node1.MemUsed
 1.224e+09            -3.2%  1.185e+09        will-it-scale.256.threads
   4780060            -3.2%    4627197        will-it-scale.per_thread_ops
 1.224e+09            -3.2%  1.185e+09        will-it-scale.workload
      0.04 ± 24%     -29.3%       0.03 ± 37%  perf-stat.i.major-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.minor-faults
      3964 ±  2%      -3.6%       3821        perf-stat.i.page-faults
    322627            +3.7%     334580        perf-stat.overall.path-length
      0.04 ± 24%     -30.0%       0.03 ± 36%  perf-stat.ps.major-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.minor-faults
      3934 ±  2%      -3.8%       3785        perf-stat.ps.page-faults
      0.07 ± 26%     -39.6%       0.04 ± 36%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     66.89 ±  7%     -30.1%      46.79 ± 27%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.22 ± 16%     +38.5%       0.30 ± 25%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    414.50 ±  7%     +59.5%     661.00 ± 40%  perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     66.83 ±  7%     -30.0%      46.75 ± 27%  perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
     11.66 ± 90%    +124.7%      26.19 ± 27%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.11 ± 16%     +38.2%       0.15 ± 25%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     28.45            -3.5       24.98        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
     27.02            -3.3       23.74        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake.do_futex
     24.86            -2.4       22.46        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wake
     22.10            -1.9       20.22        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     30.86            -1.0       29.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     15.86            -0.3       15.57        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      7.01            -0.2        6.79        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.97            -0.1        3.83        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      4.28            -0.1        4.14        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.75            -0.1        0.62        perf-profile.calltrace.cycles-pp.is_valid_gup_args.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      1.18            -0.1        1.08        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      3.04            -0.1        2.95        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.53            -0.0        2.50        perf-profile.calltrace.cycles-pp.testcase
      0.60            -0.0        0.58        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      0.65            -0.0        0.63        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     55.34            +1.5       56.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     53.05            +1.6       54.61        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.57            +1.9       45.44        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     42.42            +1.9       44.33        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     41.62            +1.9       43.54        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.05            +2.1        4.17 ±  8%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +3.6        3.63 ±  9%  perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wake.do_futex.__x64_sys_futex
     28.77            -3.7       25.06        perf-profile.children.cycles-pp.get_user_pages_fast
     27.11            -3.3       23.82        perf-profile.children.cycles-pp.gup_fast_fallback
     24.95            -2.4       22.56        perf-profile.children.cycles-pp.gup_fast
     22.15            -1.9       20.25        perf-profile.children.cycles-pp.gup_fast_pgd_range
     20.83            -0.7       20.14        perf-profile.children.cycles-pp.entry_SYSCALL_64
     16.00            -0.6       15.44        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     15.99            -0.3       15.68        perf-profile.children.cycles-pp.gup_fast_pte_range
      7.06            -0.2        6.84        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.77            -0.2        0.62        perf-profile.children.cycles-pp.is_valid_gup_args
      4.33            -0.1        4.18        perf-profile.children.cycles-pp.try_get_folio
      1.20            -0.1        1.10        perf-profile.children.cycles-pp.___pte_offset_map
      2.70            -0.1        2.62        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.03            -0.0        2.00        perf-profile.children.cycles-pp.testcase
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.x64_sys_call
      0.65            -0.0        0.63        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.10            +0.0        0.11        perf-profile.children.cycles-pp.sysvec_thermal
      0.09            +0.0        0.10        perf-profile.children.cycles-pp.intel_thermal_interrupt
      0.10 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_thermal
     98.25            +0.0       98.30        perf-profile.children.cycles-pp.syscall
     55.48            +1.5       56.95        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     53.22            +1.5       54.76        perf-profile.children.cycles-pp.do_syscall_64
     43.57            +1.9       45.44        perf-profile.children.cycles-pp.__x64_sys_futex
     42.46            +1.9       44.36        perf-profile.children.cycles-pp.do_futex
     41.72            +1.9       43.64        perf-profile.children.cycles-pp.futex_wake
      2.09            +2.1        4.22 ±  8%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.7        3.67 ±  9%  perf-profile.children.cycles-pp.__futex_hash
      6.11            -1.6        4.52        perf-profile.self.cycles-pp.gup_fast_pgd_range
      2.04            -1.5        0.50 ±  5%  perf-profile.self.cycles-pp.futex_hash
      1.96            -0.7        1.29        perf-profile.self.cycles-pp.gup_fast_fallback
     15.96            -0.6       15.40        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.10            -0.5        0.62        perf-profile.self.cycles-pp.get_user_pages_fast
      2.71            -0.5        2.24        perf-profile.self.cycles-pp.gup_fast
     14.25            -0.5       13.78        perf-profile.self.cycles-pp.syscall
     10.14            -0.3        9.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.69            -0.2        6.48        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      4.85            -0.2        4.68        perf-profile.self.cycles-pp.futex_wake
      0.72            -0.1        0.58        perf-profile.self.cycles-pp.is_valid_gup_args
      4.31            -0.1        4.17        perf-profile.self.cycles-pp.try_get_folio
      1.98            -0.1        1.91        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.12            -0.1        1.05        perf-profile.self.cycles-pp.___pte_offset_map
      2.06            -0.1        1.99        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.75            -0.1        1.70        perf-profile.self.cycles-pp.do_syscall_64
      1.10            -0.0        1.07        perf-profile.self.cycles-pp.__x64_sys_futex
      0.78            -0.0        0.75        perf-profile.self.cycles-pp.do_futex
      0.65            -0.0        0.63        perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.60            -0.0        0.58        perf-profile.self.cycles-pp.x64_sys_call
      0.19            -0.0        0.18        perf-profile.self.cycles-pp.futex_hash_put
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.intel_thermal_interrupt
      0.00            +3.7        3.65 ±  9%  perf-profile.self.cycles-pp.__futex_hash
      5.84            +3.7        9.49        perf-profile.self.cycles-pp.get_futex_key
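
As a reading aid, the headline %change figures can be re-derived from the raw
per-task throughput values in the three comparison tables of this report. The
sketch below is illustrative only and is not part of the lkp tooling; the
helper names (pct_change, summarize, HEADLINE) are made up for this example,
and the numbers are copied from the will-it-scale rows of each table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# (label, value on parent 63e8595c06, value on cec199c5e3)
# Values copied from the will-it-scale.per_*_ops rows of this report.
HEADLINE = [
    ("futex1 per_thread_ops", 4780060, 4627197),
    ("futex4 per_process_ops", 6049881, 5771879),
    ("futex2 per_thread_ops", 3598280, 3477060),
]


def summarize(rows):
    """Map each label to its percent change, rounded to one decimal."""
    return {label: round(pct_change(base, new), 1) for label, base, new in rows}


if __name__ == "__main__":
    for label, change in summarize(HEADLINE).items():
        print(f"{label}: {change:+.1f}%")
```

Note that the perf-profile rows are read differently: their middle column is
an absolute difference in percentage points of cycles (for example 2.05 -> 4.17
is shown as +2.1), not a relative percentage.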


***************************************************************************************************
lkp-gnr-2ap2: 384 threads 2 sockets Intel(R) Xeon(R) 6972P (Granite Rapids) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2ap2/futex4/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.72 ±  4%      +0.1        0.82 ±  6%  mpstat.cpu.all.irq%
   2699383           -13.4%    2337183 ±  6%  numa-meminfo.node1.Shmem
    123578 ±102%    +116.6%     267679 ± 25%  numa-numastat.node1.other_node
    123578 ±102%    +116.6%     267679 ± 25%  numa-vmstat.node1.numa_other
 2.323e+09            -4.6%  2.216e+09        will-it-scale.384.processes
   6049881            -4.6%    5771879        will-it-scale.per_process_ops
 2.323e+09            -4.6%  2.216e+09        will-it-scale.workload
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -69.0%       0.66 ±149%  perf-sched.sch_delay.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    351.13 ±130%    +836.1%       3286 ± 49%  perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      2.14 ± 53%     -68.9%       0.67 ±148%  perf-sched.wait_time.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
    342.14 ±135%    +860.6%       3286 ± 49%  perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
 2.319e+11            +1.8%  2.361e+11        perf-stat.i.branch-instructions
      0.92 ±  4%      -4.3%       0.88        perf-stat.i.cpi
 1.473e+12            +2.2%  1.506e+12        perf-stat.i.instructions
      1.11            +2.2%       1.14        perf-stat.i.ipc
      0.89            -1.6%       0.88        perf-stat.overall.cpi
      1.12            +1.6%       1.14        perf-stat.overall.ipc
    193483            +6.3%     205704        perf-stat.overall.path-length
  2.31e+11            +1.8%  2.353e+11        perf-stat.ps.branch-instructions
 1.468e+12            +2.3%  1.501e+12        perf-stat.ps.instructions
 4.495e+14            +1.4%  4.559e+14        perf-stat.total.instructions
      3.43 ±  2%      -0.6        2.82 ±  2%  perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     28.24            -0.6       27.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      9.46            -0.5        8.95        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      3.36            -0.3        3.10        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      1.76            -0.1        1.68 ±  2%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      1.31            -0.1        1.23        perf-profile.calltrace.cycles-pp.testcase
      1.13            -0.1        1.06        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.09            -0.0        2.04        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.64            -0.0        1.61        perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.72            -0.0        0.70        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.16            +0.5        2.63        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      2.92            +0.8        3.74        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     68.21            +1.1       69.32        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     65.61            +1.1       66.75        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     49.34            +1.8       51.14        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     45.75            +2.0       47.78        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     43.36            +2.1       45.44        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.80 ±  2%      +2.1        5.94 ±  2%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
     38.92            +2.3       41.20        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.49            +2.6       32.08        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.00            +4.5        4.50        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
      3.59 ±  2%      -0.6        2.99        perf-profile.children.cycles-pp.futex_q_unlock
     10.00            -0.6        9.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     16.65            -0.5       16.17        perf-profile.children.cycles-pp.entry_SYSCALL_64
      6.34            -0.3        6.08        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.41            -0.2        2.23        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.44            -0.1        1.33        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      1.98            -0.1        1.89        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.51            -0.1        1.43        perf-profile.children.cycles-pp.testcase
      1.41            -0.1        1.36        perf-profile.children.cycles-pp.futex_hash_put
      2.34            -0.1        2.28        perf-profile.children.cycles-pp.x64_sys_call
      0.79            -0.0        0.74        perf-profile.children.cycles-pp.futex_setup_timer
      0.10 ±  3%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.16 ±  3%      +0.0        0.17 ±  2%  perf-profile.children.cycles-pp.sched_tick
      0.14 ±  9%      +0.1        0.20 ± 16%  perf-profile.children.cycles-pp.ktime_get
      0.14 ± 11%      +0.1        0.20 ± 14%  perf-profile.children.cycles-pp.clockevents_program_event
      0.34 ±  9%      +0.1        0.42 ± 12%  perf-profile.children.cycles-pp.update_process_times
      0.42 ±  8%      +0.1        0.52 ± 12%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.42 ±  9%      +0.1        0.51 ± 13%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.70 ±  7%      +0.2        0.86 ± 10%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.74 ±  6%      +0.2        0.90 ± 10%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.68 ±  7%      +0.2        0.84 ± 10%  perf-profile.children.cycles-pp.hrtimer_interrupt
      2.34            +0.4        2.78        perf-profile.children.cycles-pp.get_futex_key
      3.07            +0.8        3.91        perf-profile.children.cycles-pp.futex_q_lock
     66.52            +1.1       67.64        perf-profile.children.cycles-pp.do_syscall_64
     68.64            +1.1       69.76        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     49.87            +1.8       51.64        perf-profile.children.cycles-pp.__x64_sys_futex
     46.54            +2.0       48.55        perf-profile.children.cycles-pp.do_futex
     43.64            +2.1       45.70        perf-profile.children.cycles-pp.futex_wait
     39.38            +2.3       41.65        perf-profile.children.cycles-pp.__futex_wait
      3.95 ±  2%      +2.4        6.32 ±  2%  perf-profile.children.cycles-pp.futex_hash
     31.16            +2.6       33.80        perf-profile.children.cycles-pp.futex_wait_setup
      0.00            +4.7        4.74        perf-profile.children.cycles-pp.__futex_hash
      3.76 ±  2%      -2.2        1.56 ±  9%  perf-profile.self.cycles-pp.futex_hash
      3.36 ±  2%      -0.6        2.80 ±  2%  perf-profile.self.cycles-pp.futex_q_unlock
      8.55            -0.5        8.08        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      8.11            -0.4        7.75        perf-profile.self.cycles-pp.__futex_wait
     15.92            -0.4       15.56        perf-profile.self.cycles-pp.syscall
     14.12            -0.3       13.79        perf-profile.self.cycles-pp.futex_wait_setup
      4.08            -0.3        3.80        perf-profile.self.cycles-pp.entry_SYSCALL_64
      6.12            -0.2        5.88        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      3.32            -0.2        3.09        perf-profile.self.cycles-pp.__x64_sys_futex
      3.46            -0.2        3.29        perf-profile.self.cycles-pp.futex_wait
      4.49            -0.1        4.38        perf-profile.self.cycles-pp.do_syscall_64
      1.98            -0.1        1.89        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.42            -0.1        1.33        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.31            -0.1        1.23        perf-profile.self.cycles-pp.testcase
      1.14            -0.1        1.07        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      3.15            -0.1        3.08        perf-profile.self.cycles-pp.do_futex
      2.07            -0.1        2.02        perf-profile.self.cycles-pp.x64_sys_call
      0.81            -0.0        0.76        perf-profile.self.cycles-pp.futex_hash_put
      0.53            -0.0        0.50        perf-profile.self.cycles-pp.futex_setup_timer
      0.10            +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.14 ± 10%      +0.1        0.19 ± 16%  perf-profile.self.cycles-pp.ktime_get
      3.69            +0.1        3.74        perf-profile.self.cycles-pp._raw_spin_lock
      2.16            +0.4        2.60        perf-profile.self.cycles-pp.get_futex_key
      2.72            +0.7        3.42        perf-profile.self.cycles-pp.futex_q_lock
      0.00            +4.5        4.52        perf-profile.self.cycles-pp.__futex_hash



***************************************************************************************************
lkp-srf-2sp1: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/futex2/will-it-scale

commit: 
  63e8595c06 ("futex: Allow to make the private hash immutable")
  cec199c5e3 ("futex: Implement FUTEX2_NUMA")

63e8595c060a1fef cec199c5e39bde7191a08087cc3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     41.17 ±  8%     -21.1%      32.50 ± 15%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.numa_pages_migrated
     95467 ± 22%     +48.6%     141903 ± 19%  proc-vmstat.numa_pte_updates
     22531 ±  3%     +13.6%      25603 ± 18%  proc-vmstat.pgmigrate_success
 9.212e+08            -3.4%  8.901e+08        will-it-scale.256.threads
   3598280            -3.4%    3477060        will-it-scale.per_thread_ops
 9.212e+08            -3.4%  8.901e+08        will-it-scale.workload
   7870684 ± 34%    +203.6%   23897639 ± 30%  perf-stat.i.branch-misses
     41.81 ± 55%     -20.4       21.40 ± 90%  perf-stat.i.cache-miss-rate%
      0.00 ± 34%      +0.0        0.01 ± 30%  perf-stat.overall.branch-miss-rate%
     33.72 ± 44%     -15.4       18.36 ± 69%  perf-stat.overall.cache-miss-rate%
    364549            +3.4%     376857        perf-stat.overall.path-length
   7813564 ± 34%    +204.1%   23764749 ± 30%  perf-stat.ps.branch-misses
     21.56            -2.4       19.17        perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait.futex_wait
     20.55            -2.3       18.23        perf-profile.calltrace.cycles-pp.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup.__futex_wait
     18.89            -1.6       17.27        perf-profile.calltrace.cycles-pp.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key.futex_wait_setup
     16.79            -1.2       15.55        perf-profile.calltrace.cycles-pp.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast.get_futex_key
     23.20            -0.8       22.41        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     12.14            -0.6       11.58        perf-profile.calltrace.cycles-pp.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback.get_user_pages_fast
      2.72 ±  2%      -0.2        2.56        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      4.52            -0.1        4.37        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.91 ±  2%      -0.1        2.76 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.20            -0.1        3.10        perf-profile.calltrace.cycles-pp.try_get_folio.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      0.88            -0.1        0.78        perf-profile.calltrace.cycles-pp.___pte_offset_map.gup_fast_pte_range.gup_fast_pgd_range.gup_fast.gup_fast_fallback
      2.98            -0.1        2.88        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      2.58            -0.1        2.50        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      1.80            -0.1        1.74        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     26.00            +0.4       26.36        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
     65.92            +1.1       67.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     64.16            +1.1       65.30        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     57.75            +1.4       59.10        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.92            +1.4       58.29        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     56.20            +1.4       57.60        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.70            +1.5       56.16        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     46.65            +1.6       48.28        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      1.72 ±  7%      +1.8        3.55 ±  3%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +3.0        3.01        perf-profile.calltrace.cycles-pp.__futex_hash.futex_hash.futex_wait_setup.__futex_wait.futex_wait
     21.79            -2.6       19.24        perf-profile.children.cycles-pp.get_user_pages_fast
     20.62            -2.3       18.30        perf-profile.children.cycles-pp.gup_fast_fallback
     18.96            -1.6       17.34        perf-profile.children.cycles-pp.gup_fast
     16.82            -1.2       15.58        perf-profile.children.cycles-pp.gup_fast_pgd_range
     12.21            -0.5       11.66        perf-profile.children.cycles-pp.gup_fast_pte_range
     15.66            -0.5       15.13        perf-profile.children.cycles-pp.entry_SYSCALL_64
     12.01            -0.4       11.62        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.76 ±  2%      -0.2        2.60        perf-profile.children.cycles-pp.futex_q_unlock
      4.59            -0.2        4.43        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      2.95 ±  2%      -0.1        2.80 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      0.59            -0.1        0.47        perf-profile.children.cycles-pp.is_valid_gup_args
      0.91            -0.1        0.80        perf-profile.children.cycles-pp.___pte_offset_map
      3.24            -0.1        3.13        perf-profile.children.cycles-pp.try_get_folio
      2.62            -0.1        2.53        perf-profile.children.cycles-pp.futex_q_lock
      2.03            -0.1        1.96        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.41            -0.0        0.39        perf-profile.children.cycles-pp.try_grab_folio_fast
      0.50            -0.0        0.49        perf-profile.children.cycles-pp.testcase
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.x64_sys_call
      0.49            -0.0        0.47        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.15            -0.0        0.14        perf-profile.children.cycles-pp.futex_setup_timer
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.__sysvec_thermal
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.intel_thermal_interrupt
     26.06            +0.4       26.43        perf-profile.children.cycles-pp.get_futex_key
     66.09            +1.1       67.16        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     64.36            +1.1       65.48        perf-profile.children.cycles-pp.do_syscall_64
     57.75            +1.4       59.10        perf-profile.children.cycles-pp.__x64_sys_futex
     56.92            +1.4       58.29        perf-profile.children.cycles-pp.do_futex
     56.22            +1.4       57.62        perf-profile.children.cycles-pp.futex_wait
     55.22            +1.4       56.67        perf-profile.children.cycles-pp.__futex_wait
     46.26            +1.7       48.00        perf-profile.children.cycles-pp.futex_wait_setup
      1.75 ±  7%      +1.8        3.57 ±  3%  perf-profile.children.cycles-pp.futex_hash
      0.00            +3.0        3.04        perf-profile.children.cycles-pp.__futex_hash
      1.71 ±  7%      -1.2        0.50 ± 23%  perf-profile.self.cycles-pp.futex_hash
      4.58            -0.7        3.88 ±  2%  perf-profile.self.cycles-pp.gup_fast_pgd_range
      1.49            -0.5        0.98 ±  2%  perf-profile.self.cycles-pp.gup_fast_fallback
     11.97            -0.4       11.59        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.07            -0.4        1.69        perf-profile.self.cycles-pp.gup_fast
     11.78            -0.3       11.46        perf-profile.self.cycles-pp.syscall
      9.43            -0.3        9.12        perf-profile.self.cycles-pp.__futex_wait
      0.76            -0.3        0.47        perf-profile.self.cycles-pp.get_user_pages_fast
      7.34            -0.3        7.08        perf-profile.self.cycles-pp.gup_fast_pte_range
      7.64            -0.3        7.38        perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.72 ±  2%      -0.2        2.56        perf-profile.self.cycles-pp.futex_q_unlock
      0.58            -0.1        0.44        perf-profile.self.cycles-pp.is_valid_gup_args
      4.20            -0.1        4.06        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.88            -0.1        0.76        perf-profile.self.cycles-pp.___pte_offset_map
      3.23            -0.1        3.11        perf-profile.self.cycles-pp.try_get_folio
      2.61            -0.1        2.52        perf-profile.self.cycles-pp.futex_q_lock
      1.53            -0.1        1.48        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.54            -0.1        1.49        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      1.39            -0.0        1.35        perf-profile.self.cycles-pp.do_syscall_64
      0.41            -0.0        0.37        perf-profile.self.cycles-pp.try_grab_folio_fast
      0.86            -0.0        0.83        perf-profile.self.cycles-pp.futex_wait
      0.83            -0.0        0.80        perf-profile.self.cycles-pp.__x64_sys_futex
      0.70            -0.0        0.67        perf-profile.self.cycles-pp.do_futex
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.x64_sys_call
      0.47            -0.0        0.45        perf-profile.self.cycles-pp.testcase
      0.45            -0.0        0.44        perf-profile.self.cycles-pp.syscall_return_via_sysret
      4.24            +2.9        7.17        perf-profile.self.cycles-pp.get_futex_key
      0.00            +3.0        3.03        perf-profile.self.cycles-pp.__futex_hash





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

