Message-ID: <202505151558.ce6cad02-lkp@intel.com>
Date: Thu, 15 May 2025 15:34:47 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	<linux-mm@...ck.org>, <oliver.sang@...el.com>
Subject: [tip:locking/futex] [futex]  bd54df5ea7:
 will-it-scale.per_thread_ops 85.7% regression



Hello,

We previously reported
"[tip:locking/futex] [futex]  bd54df5ea7: will-it-scale.per_thread_ops 33.9% improvement"
in
https://lore.kernel.org/all/202505131609.20984254-lkp@intel.com/
(which is also listed in this report).

Now we have noticed a regression on a different will-it-scale sub-test on a
different platform.


The full report is below, FYI.


kernel test robot noticed an 85.7% regression of will-it-scale.per_thread_ops on:


commit: bd54df5ea7cadac520e346d5f0fe5d58e635b6ba ("futex: Allow to resize the private local hash")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/futex

[test failed on linux-next/master bdd609656ff5573db9ba1d26496a528bdd297cf2]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: thread
	test: futex3
	cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+-----------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.memcached.5:1.ops_sec -3.5% regression               |
| test machine     | 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                  |
|                  | option_a=5:1                                                                                  |
|                  | test=memcached-1.2.0                                                                          |
+------------------+-----------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 33.9% improvement                                 |
| test machine     | 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory  |
| test parameters  | cpufreq_governor=performance                                                                  |
|                  | mode=thread                                                                                   |
|                  | nr_task=100%                                                                                  |
|                  | test=pthread_mutex5                                                                           |
+------------------+-----------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202505151558.ce6cad02-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250515/202505151558.ce6cad02-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/futex3/will-it-scale

commit: 
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")
  bd54df5ea7 ("futex: Allow to resize the private local hash")

7c4f75a21f636486 bd54df5ea7cadac520e346d5f0f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    338452           +13.8%     385002        meminfo.Shmem
    332062           +13.7%     377427        numa-meminfo.node1.Shmem
     82957           +13.7%      94282        numa-vmstat.node1.nr_shmem
     42.06 ±  2%     +40.4       82.50        mpstat.cpu.all.sys%
     56.95 ±  2%     -40.3       16.68        mpstat.cpu.all.usr%
     42.16 ±  3%     +95.3%      82.35        vmstat.cpu.sy
     56.55 ±  2%     -70.7%      16.57        vmstat.cpu.us
  79060974           -85.7%   11319830 ±  2%  will-it-scale.104.threads
    760201           -85.7%     108843 ±  2%  will-it-scale.per_thread_ops
  79060974           -85.7%   11319830 ±  2%  will-it-scale.workload
    193.33 ±  8%   +1739.5%       3556 ±  4%  perf-c2c.DRAM.remote
    576.67 ± 13%   +1875.7%      11393 ±  2%  perf-c2c.HITM.local
    177.33 ± 10%   +1821.1%       3406 ±  4%  perf-c2c.HITM.remote
    754.00 ± 11%   +1862.9%      14800 ±  2%  perf-c2c.HITM.total
    256705            +4.4%     268126        proc-vmstat.nr_active_anon
    964485            +1.2%     976128        proc-vmstat.nr_file_pages
     84606           +13.8%      96248        proc-vmstat.nr_shmem
    256705            +4.4%     268126        proc-vmstat.nr_zone_active_anon
    935363            -2.3%     913404        proc-vmstat.pgfault
   1013658            -2.0%     993799        proc-vmstat.pgfree
      0.67           +25.0%       0.83        sched_debug.cfs_rq:/.h_nr_queued.min
      0.67           +25.0%       0.83        sched_debug.cfs_rq:/.h_nr_runnable.min
      0.19 ±  5%     -11.1%       0.17 ±  7%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
      6440           +26.2%       8125        sched_debug.cfs_rq:/.load.min
      6.00           +24.5%       7.47        sched_debug.cfs_rq:/.load_avg.min
      0.67           +25.0%       0.83        sched_debug.cfs_rq:/.nr_queued.min
    660.69 ±  5%     +25.4%     828.58        sched_debug.cfs_rq:/.runnable_avg.min
    124.93 ±  5%      -9.6%     112.94 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
    567.81 ±  8%     +25.4%     711.86 ±  6%  sched_debug.cfs_rq:/.util_avg.min
      1358 ±  3%     -12.9%       1183 ± 10%  sched_debug.cfs_rq:/.util_est.max
   1042028 ±  6%     +43.9%    1499082 ± 36%  sched_debug.cpu.avg_idle.max
     13.32 ±  7%     +71.5%      22.85 ± 10%  sched_debug.cpu.clock.stddev
      2202 ± 74%    +348.7%       9884 ± 79%  sched_debug.cpu.max_idle_balance_cost.stddev
    979.97 ±  6%     +17.8%       1154 ±  6%  sched_debug.cpu.nr_switches.min
      0.00          -100.0%       0.00        sched_debug.rt_rq:.rt_nr_running.avg
      0.17          -100.0%       0.00        sched_debug.rt_rq:.rt_nr_running.max
      0.02          -100.0%       0.00        sched_debug.rt_rq:.rt_nr_running.stddev
      0.02 ±  4%   +6643.2%       1.42        perf-stat.i.MPKI
 6.361e+09           -82.1%   1.14e+09        perf-stat.i.branch-instructions
      0.31 ±  4%      +0.4        0.70 ±  2%  perf-stat.i.branch-miss-rate%
  19810336 ±  4%     -47.2%   10450408        perf-stat.i.branch-misses
     16.06 ±  4%      +8.3       24.34        perf-stat.i.cache-miss-rate%
    789989 ±  4%   +1103.3%    9505850 ±  2%  perf-stat.i.cache-misses
   4418710 ±  2%    +782.4%   38989894        perf-stat.i.cache-references
      1803            +1.4%       1828        perf-stat.i.context-switches
      7.35          +497.9%      43.93 ±  2%  perf-stat.i.cpi
    172.02           -12.6%     150.30        perf-stat.i.cpu-migrations
    757174 ±  3%     -95.9%      30876 ±  2%  perf-stat.i.cycles-between-cache-misses
 3.938e+10           -82.8%  6.777e+09 ±  2%  perf-stat.i.instructions
      0.14           -82.1%       0.02        perf-stat.i.ipc
      2745            -1.9%       2694        perf-stat.i.minor-faults
      2745            -1.9%       2694        perf-stat.i.page-faults
      0.02 ±  4%   +6887.3%       1.40        perf-stat.overall.MPKI
      0.31 ±  5%      +0.6        0.92 ±  2%  perf-stat.overall.branch-miss-rate%
     17.85 ±  4%      +6.5       24.38        perf-stat.overall.cache-miss-rate%
      7.35          +481.6%      42.77        perf-stat.overall.cpi
    366856 ±  4%     -91.7%      30485 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.14           -82.8%       0.02        perf-stat.overall.ipc
    150266           +20.1%     180400        perf-stat.overall.path-length
  6.34e+09           -82.1%  1.135e+09        perf-stat.ps.branch-instructions
  19745655 ±  4%     -47.4%   10395167        perf-stat.ps.branch-misses
    788134 ±  4%   +1102.0%    9473157 ±  2%  perf-stat.ps.cache-misses
   4417634 ±  2%    +779.5%   38852012        perf-stat.ps.cache-references
      1797            +1.3%       1821        perf-stat.ps.context-switches
    171.49           -12.7%     149.66        perf-stat.ps.cpu-migrations
 3.925e+10           -82.8%   6.75e+09        perf-stat.ps.instructions
      2735            -2.2%       2674        perf-stat.ps.minor-faults
      2735            -2.2%       2674        perf-stat.ps.page-faults
 1.188e+13           -82.8%  2.042e+12 ±  2%  perf-stat.total.instructions
     29.53           -22.5        7.01        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
     14.37 ± 11%     -11.9        2.44 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
     18.82 ±  2%     -10.9        7.92        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      9.67            -8.6        1.08 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      0.00            +0.8        0.81 ±  7%  perf-profile.calltrace.cycles-pp.write.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist
      0.00            +0.8        0.81 ±  6%  perf-profile.calltrace.cycles-pp.writen.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record
      0.00            +0.8        0.81 ±  6%  perf-profile.calltrace.cycles-pp.record__pushfn.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record
      0.00            +0.9        0.86 ±  6%  perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
      0.00            +0.9        0.87 ±  6%  perf-profile.calltrace.cycles-pp.__cmd_record.cmd_record.run_builtin.handle_internal_command.main
      0.00            +0.9        0.87 ±  6%  perf-profile.calltrace.cycles-pp.cmd_record.run_builtin.handle_internal_command.main
      0.00            +0.9        0.87 ±  6%  perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.handle_internal_command
      0.00            +0.9        0.88 ±  6%  perf-profile.calltrace.cycles-pp.handle_internal_command.main
      0.00            +0.9        0.88 ±  6%  perf-profile.calltrace.cycles-pp.main
      0.00            +0.9        0.88 ±  6%  perf-profile.calltrace.cycles-pp.run_builtin.handle_internal_command.main
      0.00           +24.6       24.64        perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      1.26 ±  2%     +45.3       46.57        perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     30.29           +48.4       78.64        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     14.49 ±  3%     +60.5       74.96        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      6.40 ±  2%     +68.0       74.41        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      5.32 ±  4%     +69.0       74.28        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      4.52 ±  5%     +69.7       74.18        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     29.72           -22.5        7.22        perf-profile.children.cycles-pp.syscall_return_via_sysret
     15.01 ±  6%     -12.7        2.28        perf-profile.children.cycles-pp.entry_SYSCALL_64
     19.96 ±  2%     -11.6        8.41        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      7.67 ± 10%      -6.4        1.28 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      2.57 ±  7%      -2.3        0.27 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      1.99 ± 11%      -1.8        0.22 ±  2%  perf-profile.children.cycles-pp.get_futex_key
      0.43 ±  2%      -0.4        0.07 ±  5%  perf-profile.children.cycles-pp.x64_sys_call
      0.15 ±  5%      +0.0        0.17 ±  4%  perf-profile.children.cycles-pp.update_process_times
      0.17 ±  5%      +0.0        0.20 ±  4%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.09 ±  8%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.sched_tick
      0.02 ±141%      +0.1        0.07 ±  5%  perf-profile.children.cycles-pp.task_tick_fair
      0.21 ±  6%      +0.1        0.27 ±  3%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.24 ±  6%      +0.1        0.32        perf-profile.children.cycles-pp.hrtimer_interrupt
      0.24 ±  7%      +0.1        0.32        perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.26 ±  6%      +0.1        0.35        perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.32 ±  6%      +0.1        0.41        perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00            +0.2        0.18 ± 13%  perf-profile.children.cycles-pp.generic_perform_write
      0.02 ± 99%      +0.2        0.26 ± 12%  perf-profile.children.cycles-pp.vfs_write
      0.00            +0.2        0.25 ± 13%  perf-profile.children.cycles-pp.shmem_file_write_iter
      0.02 ± 99%      +0.3        0.30 ± 10%  perf-profile.children.cycles-pp.ksys_write
      0.04 ± 71%      +0.8        0.83 ±  7%  perf-profile.children.cycles-pp.write
      0.00            +0.8        0.82 ±  6%  perf-profile.children.cycles-pp.record__pushfn
      0.00            +0.8        0.82 ±  6%  perf-profile.children.cycles-pp.writen
      0.00            +0.9        0.86 ±  6%  perf-profile.children.cycles-pp.perf_mmap__push
      0.00            +0.9        0.87 ±  6%  perf-profile.children.cycles-pp.__cmd_record
      0.00            +0.9        0.87 ±  6%  perf-profile.children.cycles-pp.cmd_record
      0.00            +0.9        0.87 ±  6%  perf-profile.children.cycles-pp.record__mmap_read_evlist
      0.00            +0.9        0.88 ±  6%  perf-profile.children.cycles-pp.handle_internal_command
      0.00            +0.9        0.88 ±  6%  perf-profile.children.cycles-pp.main
      0.00            +0.9        0.88 ±  6%  perf-profile.children.cycles-pp.run_builtin
      0.00           +24.7       24.66        perf-profile.children.cycles-pp.futex_hash_put
      1.26 ±  2%     +45.3       46.60        perf-profile.children.cycles-pp.futex_hash
     30.69           +48.5       79.21        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     14.68 ±  3%     +60.7       75.36        perf-profile.children.cycles-pp.do_syscall_64
      6.49 ±  2%     +67.9       74.41        perf-profile.children.cycles-pp.__x64_sys_futex
      5.40 ±  4%     +68.9       74.30        perf-profile.children.cycles-pp.do_futex
      4.68 ±  4%     +69.5       74.20        perf-profile.children.cycles-pp.futex_wake
     29.66           -22.4        7.22        perf-profile.self.cycles-pp.syscall_return_via_sysret
     16.27 ±  2%     -12.4        3.88        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
     13.65 ±  6%     -11.5        2.10        perf-profile.self.cycles-pp.entry_SYSCALL_64
     19.78 ±  2%     -11.4        8.38        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.86            -4.7        0.20 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      2.30 ±  8%      -2.1        0.24 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.94 ± 11%      -1.7        0.22 ±  2%  perf-profile.self.cycles-pp.get_futex_key
      2.60            -1.5        1.15 ±  3%  perf-profile.self.cycles-pp.syscall
      1.07 ±  2%      -1.0        0.12 ±  4%  perf-profile.self.cycles-pp.__x64_sys_futex
      0.97 ±  6%      -0.9        0.09 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.74 ±  2%      -0.6        0.10        perf-profile.self.cycles-pp.do_futex
      0.40            -0.3        0.06 ±  8%  perf-profile.self.cycles-pp.x64_sys_call
      1.48            +1.2        2.70 ±  6%  perf-profile.self.cycles-pp.futex_wake
      0.00           +24.6       24.57        perf-profile.self.cycles-pp.futex_hash_put
      1.15 ±  2%     +45.3       46.42        perf-profile.self.cycles-pp.futex_hash
      0.00 ±223%  +28429.2%       1.14 ± 57%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
      0.08 ± 12%    +159.1%       0.21 ± 11%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.01 ±100%   +7759.4%       0.90 ± 45%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
      0.66 ± 91%    +263.5%       2.38 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
      0.02 ± 31%   +2073.5%       0.49 ±163%  perf-sched.sch_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
      0.01 ±  3%     +97.3%       0.02 ± 15%  perf-sched.sch_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.68 ± 57%    +142.9%       1.66 ± 13%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      2.89          +107.9%       6.01 ± 45%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.08 ± 35%    +109.0%       0.16 ± 26%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.02 ±  4%     +80.4%       0.03 ±  4%  perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.09 ± 25%    +284.0%       0.33 ± 53%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.01 ± 11%     +36.0%       0.02 ±  3%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ±  8%     +69.7%       0.03 ±  8%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.55 ± 11%     +64.0%       0.91 ±  5%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.58 ± 10%     +33.5%       0.77 ±  7%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.03 ± 13%   +2621.6%       0.86 ± 46%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±223%  +35991.7%       1.44 ± 62%  perf-sched.sch_delay.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
      2.88 ± 29%    +236.6%       9.70 ± 65%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.62 ±160%    +283.8%       2.39 ± 28%  perf-sched.sch_delay.max.ms.__cond_resched.down_read.walk_component.link_path_walk.part
      0.01 ±100%  +17714.5%       2.05 ± 47%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
      2.10 ± 48%   +8993.6%     190.59 ±198%  perf-sched.sch_delay.max.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
      0.02 ±  8%    +176.3%       0.04 ± 17%  perf-sched.sch_delay.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.55 ±187%    +444.1%       3.00        perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      3.79 ± 16%    +470.6%      21.63 ± 27%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      8.61 ± 71%    +162.4%      22.59 ±  6%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      2.83 ± 13%    +135.1%       6.66 ± 85%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.02 ±  2%    +142.6%       0.05 ±  7%  perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.34 ±130%   +1108.0%       4.16 ± 52%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     12.56 ± 49%    +140.3%      30.18 ± 19%  perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      4.20 ±  6%     +57.8%       6.63 ± 25%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.55 ±  3%     +30.8%       0.72 ±  8%  perf-sched.total_sch_delay.average.ms
     17.06 ±  5%   +1093.7%     203.68 ±182%  perf-sched.total_sch_delay.max.ms
     81.49 ±  4%     -13.7%      70.33 ±  5%  perf-sched.total_wait_and_delay.average.ms
     10823 ±  4%     +17.9%      12764 ±  4%  perf-sched.total_wait_and_delay.count.ms
     80.94 ±  4%     -14.0%      69.61 ±  5%  perf-sched.total_wait_time.average.ms
      5.78          +108.0%      12.02 ± 45%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.16 ± 34%    +109.0%       0.33 ± 26%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    394.00 ±  3%     -87.1%      50.77 ± 58%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      4.09 ± 18%    +203.9%      12.42 ± 13%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     11.44 ±  5%     -48.7%       5.87 ±  6%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    544.37 ±  3%     -15.3%     461.13 ±  6%  perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
     99.17 ±  4%      -9.9%      89.33 ±  4%  perf-sched.wait_and_delay.count.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
    112.33 ±  3%     -12.5%      98.33 ±  6%  perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
    577.00 ± 18%     -52.7%     273.00 ± 50%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1463 ±  4%     -58.2%     611.83 ±  4%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    624.17 ± 31%     -59.2%     254.83 ± 30%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
     13.00 ±  7%   +1356.4%     189.33 ± 47%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
    412.17 ±  5%    +105.6%     847.33 ±  7%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      3134 ± 12%     +74.3%       5464 ±  6%  perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    373.67 ±  9%     +36.1%     508.67 ± 13%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
    537.63 ±  3%    +124.4%       1206 ± 35%  perf-sched.wait_and_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      7.58 ± 16%    +470.6%      43.26 ± 27%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     17.22 ± 71%    +162.4%      45.18 ±  6%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      5.66 ± 13%    +135.1%      13.32 ± 85%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    306.84 ± 13%     -23.0%     236.34 ± 15%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.00 ±223%  +28429.2%       1.14 ± 57%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
      0.01 ±100%   +6763.8%       0.79 ± 58%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
      0.58 ± 94%    +303.8%       2.34 ± 17%  perf-sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
      0.68 ± 56%    +135.8%       1.59 ± 17%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      2.89          +108.0%       6.01 ± 45%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.08 ± 35%    +109.9%       0.16 ± 26%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    393.72 ±  3%     -87.1%      50.73 ± 58%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      4.00 ± 18%    +202.2%      12.10 ± 12%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     11.43 ±  5%     -48.8%       5.85 ±  7%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    544.34 ±  3%     -15.4%     460.27 ±  6%  perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±223%  +35991.7%       1.44 ± 62%  perf-sched.wait_time.max.ms.__cond_resched.__kmalloc_cache_noprof.perf_event_mmap_event.perf_event_mmap.__mmap_region
      0.62 ±160%    +283.8%       2.39 ± 28%  perf-sched.wait_time.max.ms.__cond_resched.down_read.walk_component.link_path_walk.part
      0.01 ±100%  +16615.9%       1.92 ± 60%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
    537.63 ±  3%    +124.4%       1206 ± 35%  perf-sched.wait_time.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.55 ±187%    +445.6%       3.00        perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      3.79 ± 16%    +470.6%      21.63 ± 27%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      8.61 ± 71%    +162.4%      22.59 ±  6%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      2.83 ± 13%    +135.1%       6.66 ± 85%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    306.82 ± 13%     -23.0%     236.31 ± 15%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread


***************************************************************************************************
lkp-csl-2sp7: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/5:1/debian-12-x86_64-phoronix/lkp-csl-2sp7/memcached-1.2.0/phoronix-test-suite

commit: 
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")
  bd54df5ea7 ("futex: Allow to resize the private local hash")

7c4f75a21f636486 bd54df5ea7cadac520e346d5f0f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  71701486 ± 12%     -29.3%   50671505 ±  3%  cpuidle..usage
      5.83 ±  4%      -0.5        5.31        mpstat.cpu.all.usr%
   1025120 ±  9%     -27.8%     740162 ±  2%  vmstat.system.cs
    259599 ±  5%     -21.3%     204388        vmstat.system.in
    669110            -3.5%     646020        phoronix-test-suite.memcached.5:1.ops_sec
     90456 ± 15%     -44.7%      50004 ±  4%  phoronix-test-suite.time.involuntary_context_switches
    225.50 ±  3%      -6.1%     211.67        phoronix-test-suite.time.percent_of_cpu_this_job_got
    274.36 ±  3%      -6.7%     256.01        phoronix-test-suite.time.user_time
  26283864            -5.1%   24931229        phoronix-test-suite.time.voluntary_context_switches
      0.12 ±152%     -97.9%       0.00 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
      0.40 ± 40%     +58.5%       0.63 ± 30%  perf-sched.sch_delay.max.ms.__cond_resched.__lock_sock_fast.tcp_ioctl.sk_ioctl.sock_do_ioctl
      0.12 ±152%     -97.9%       0.00 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
      0.93 ± 50%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
     26.12 ±  7%     +18.3%      30.90 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      1.00          -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
    182.67 ±  4%     -13.4%     158.17 ±  4%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.93 ± 50%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
      0.81 ± 36%     -92.3%       0.06 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
     26.09 ±  7%     +18.3%      30.87 ±  4%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2.05 ±  9%     +33.5%       2.74 ± 14%  perf-sched.wait_time.max.ms.__cond_resched.__lock_sock_fast.tcp_ioctl.sk_ioctl.sock_do_ioctl
      0.81 ± 36%     -92.3%       0.06 ±223%  perf-sched.wait_time.max.ms.__cond_resched.__release_sock.release_sock.tcp_recvmsg.inet6_recvmsg
    183.75 ±  6%     -16.6%     153.29 ±  8%  sched_debug.cfs_rq:/.runnable_avg.stddev
    176.68 ±  7%     -17.1%     146.40 ±  9%  sched_debug.cfs_rq:/.util_avg.stddev
    185.68 ±  6%     -17.3%     153.53 ±  8%  sched_debug.cfs_rq:/system.slice.runnable_avg.stddev
    185.56 ±  6%     -17.4%     153.35 ±  8%  sched_debug.cfs_rq:/system.slice.se->avg.runnable_avg.stddev
    178.74 ±  7%     -17.7%     147.03 ±  9%  sched_debug.cfs_rq:/system.slice.se->avg.util_avg.stddev
    178.74 ±  7%     -17.7%     147.04 ±  9%  sched_debug.cfs_rq:/system.slice.util_avg.stddev
     10.19 ± 28%     +75.4%      17.88 ± 25%  sched_debug.cfs_rq:/system.slice/containerd.service.tg_load_avg.avg
     12.79 ± 35%     +78.2%      22.79 ± 25%  sched_debug.cfs_rq:/system.slice/containerd.service.tg_load_avg.max
      8.62 ± 28%     +76.3%      15.21 ± 31%  sched_debug.cfs_rq:/system.slice/containerd.service.tg_load_avg.min
      1.17 ± 43%     +88.1%       2.21 ± 29%  sched_debug.cfs_rq:/system.slice/containerd.service.tg_load_avg.stddev
    840226 ± 12%     -25.1%     628975 ±  2%  sched_debug.cpu.nr_switches.avg
    865809 ± 12%     -24.3%     655733 ±  2%  sched_debug.cpu.nr_switches.max
    729559 ± 11%     -23.2%     560221 ±  3%  sched_debug.cpu.nr_switches.min
      0.00            +0.7        0.69 ±  5%  perf-profile.calltrace.cycles-pp.futex_hash_put.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +1.5        1.45 ±  4%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +1.5        1.49 ±  4%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     83.64 ±  2%      -2.3       81.29        perf-profile.children.cycles-pp._raw_spin_lock
     83.12 ±  2%      -2.3       80.84        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.41 ± 11%      -0.1        0.34 ±  7%  perf-profile.children.cycles-pp.futex_q_lock
      0.52 ±  4%      -0.0        0.48 ±  4%  perf-profile.children.cycles-pp.schedule_hrtimeout_range_clock
      0.00            +0.1        0.08 ±  8%  perf-profile.children.cycles-pp.futex_unqueue
      0.18 ± 14%      +0.4        0.53 ±  7%  perf-profile.children.cycles-pp.futex_q_unlock
      0.00            +1.2        1.18 ±  5%  perf-profile.children.cycles-pp.futex_hash_put
      0.20 ± 78%      +2.7        2.94 ±  4%  perf-profile.children.cycles-pp.futex_hash
     82.56 ±  2%      -2.3       80.29        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.40 ± 12%      -0.1        0.34 ±  6%  perf-profile.self.cycles-pp.futex_q_lock
      0.18 ± 17%      +0.4        0.53 ±  6%  perf-profile.self.cycles-pp.futex_q_unlock
      0.00            +1.2        1.17 ±  5%  perf-profile.self.cycles-pp.futex_hash_put
      0.19 ± 81%      +2.7        2.92 ±  4%  perf-profile.self.cycles-pp.futex_hash
      2.84 ±  2%      +5.9%       3.01        perf-stat.i.MPKI
 7.081e+09 ±  2%      -7.5%  6.548e+09        perf-stat.i.branch-instructions
  74795847 ±  5%     -12.2%   65682517        perf-stat.i.branch-misses
     26.36            +1.2       27.52        perf-stat.i.cache-miss-rate%
   1041600 ±  9%     -27.9%     751158 ±  2%  perf-stat.i.context-switches
      6.57 ±  2%      +7.7%       7.07        perf-stat.i.cpi
     40782 ± 12%     -26.8%      29848 ±  4%  perf-stat.i.cpu-migrations
   7021013 ±  3%     -13.0%    6107192 ±  2%  perf-stat.i.dTLB-load-misses
 8.047e+09 ±  2%      -8.4%  7.373e+09        perf-stat.i.dTLB-loads
 2.876e+09 ±  5%     -12.8%  2.508e+09        perf-stat.i.dTLB-stores
     37.20 ±  2%      -2.5       34.66        perf-stat.i.iTLB-load-miss-rate%
  14970466 ±  7%     -18.2%   12253183        perf-stat.i.iTLB-load-misses
  31644049 ±  3%      -4.6%   30187584        perf-stat.i.iTLB-loads
 3.132e+10 ±  2%      -8.1%  2.879e+10        perf-stat.i.instructions
      2252 ±  4%      +7.8%       2427        perf-stat.i.instructions-per-iTLB-miss
      0.21            -5.7%       0.20        perf-stat.i.ipc
    190.02 ±  2%      -8.7%     173.48        perf-stat.i.metric.M/sec
      2.46 ±  2%      +7.5%       2.64 ±  2%  perf-stat.overall.MPKI
     29.56            +1.3       30.82        perf-stat.overall.cache-miss-rate%
      6.73 ±  2%      +8.7%       7.31        perf-stat.overall.cpi
      0.09 ±  2%      -0.0        0.08 ±  2%  perf-stat.overall.dTLB-load-miss-rate%
     32.07 ±  3%      -3.2       28.87        perf-stat.overall.iTLB-load-miss-rate%
      2100 ±  5%     +11.9%       2350        perf-stat.overall.instructions-per-iTLB-miss
      0.15 ±  2%      -8.0%       0.14        perf-stat.overall.ipc
 7.056e+09 ±  2%      -7.5%  6.527e+09        perf-stat.ps.branch-instructions
  74481811 ±  5%     -12.2%   65413115        perf-stat.ps.branch-misses
   1037883 ±  9%     -27.9%     748724 ±  2%  perf-stat.ps.context-switches
     40641 ± 12%     -26.8%      29752 ±  4%  perf-stat.ps.cpu-migrations
   6996630 ±  3%     -13.0%    6086667 ±  2%  perf-stat.ps.dTLB-load-misses
 8.019e+09 ±  2%      -8.4%  7.349e+09        perf-stat.ps.dTLB-loads
 2.866e+09 ±  5%     -12.8%    2.5e+09        perf-stat.ps.dTLB-stores
  14916761 ±  7%     -18.1%   12212715        perf-stat.ps.iTLB-load-misses
  31534987 ±  3%      -4.6%   30091319        perf-stat.ps.iTLB-loads
 3.121e+10 ±  2%      -8.0%   2.87e+10        perf-stat.ps.instructions
  7.11e+12 ±  2%      -8.0%  6.543e+12        perf-stat.total.instructions



***************************************************************************************************
lkp-gnr-2sp3: 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-gnr-2sp3/pthread_mutex5/will-it-scale

commit: 
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")
  bd54df5ea7 ("futex: Allow to resize the private local hash")

7c4f75a21f636486 bd54df5ea7cadac520e346d5f0f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  23570282           -32.6%   15883630 ±  2%  cpuidle..usage
   1862635            -9.3%    1689404        meminfo.Shmem
      2110           +19.0%       2512 ±  3%  perf-c2c.DRAM.local
      0.16 ±  4%      -0.1        0.08 ±  4%  mpstat.cpu.all.soft%
      0.63            -0.2        0.46 ±  3%  mpstat.cpu.all.usr%
   1264859 ±  2%     -47.5%     664434 ± 62%  numa-vmstat.node1.nr_file_pages
     38897 ± 10%     -47.8%      20323 ± 48%  numa-vmstat.node1.nr_mapped
    206687           -33.5%     137401 ±  2%  vmstat.system.cs
    427708            -8.0%     393532        vmstat.system.in
   5060133 ±  2%     -47.5%    2658326 ± 62%  numa-meminfo.node1.FilePages
    158778 ± 10%     -48.5%      81837 ± 46%  numa-meminfo.node1.Mapped
   6620342 ±  2%     -38.3%    4086741 ± 37%  numa-meminfo.node1.MemUsed
   9566224           +33.9%   12810946        will-it-scale.256.threads
      0.18           -11.1%       0.16        will-it-scale.256.threads_idle
     37367           +33.9%      50042        will-it-scale.per_thread_ops
   9566224           +33.9%   12810946        will-it-scale.workload
      0.00 ± 15%     +29.7%       0.00 ± 15%  sched_debug.cpu.next_balance.stddev
    124704           -33.5%      82964 ±  2%  sched_debug.cpu.nr_switches.avg
    230832 ± 52%     -38.2%     142628 ±  5%  sched_debug.cpu.nr_switches.max
     98911 ±  4%     -33.7%      65543 ±  3%  sched_debug.cpu.nr_switches.min
     17307 ± 60%     -47.4%       9105 ± 20%  sched_debug.cpu.nr_switches.stddev
    672002            -6.5%     628169        proc-vmstat.nr_active_anon
   1345624            -3.2%    1302363        proc-vmstat.nr_file_pages
     41725 ±  7%     -16.3%      34939 ± 12%  proc-vmstat.nr_mapped
    465688            -9.3%     422425        proc-vmstat.nr_shmem
    672002            -6.5%     628169        proc-vmstat.nr_zone_active_anon
   1956811            -2.5%    1908264        proc-vmstat.numa_hit
   1692181            -2.8%    1644262        proc-vmstat.numa_local
      0.20            +4.3%       0.21        perf-stat.i.MPKI
      0.05            -0.0        0.05        perf-stat.i.branch-miss-rate%
   9101814           -10.3%    8161953        perf-stat.i.branch-misses
  14404131            +3.7%   14939924        perf-stat.i.cache-misses
    207911           -33.5%     138184 ±  2%  perf-stat.i.context-switches
     65204            -4.0%      62625        perf-stat.i.cycles-between-cache-misses
      0.01           -95.2%       0.00 ±223%  perf-stat.i.metric.K/sec
      0.20            +4.2%       0.21        perf-stat.overall.MPKI
      0.05            -0.0        0.05        perf-stat.overall.branch-miss-rate%
     63438            -3.5%      61223        perf-stat.overall.cycles-between-cache-misses
   2250086           -25.7%    1671327        perf-stat.overall.path-length
   9086343           -10.4%    8139691        perf-stat.ps.branch-misses
  14400345            +3.6%   14922252        perf-stat.ps.cache-misses
    207422           -33.5%     137839 ±  2%  perf-stat.ps.context-switches
      0.16           +99.2%       0.32 ± 95%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1.66 ± 12%     +17.5%       1.95 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.08 ±  8%     +37.8%       0.12 ± 20%  perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ± 12%     +47.5%       0.01 ±  5%  perf-sched.sch_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      0.09 ±166%   +1763.7%       1.74 ± 65%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.09           +16.3%       0.11 ±  3%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      2.98 ± 14%     +28.2%       3.83 ±  4%  perf-sched.sch_delay.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
      0.18 ±  5%    +248.1%       0.61 ± 63%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.15 ±186%   +1714.0%       2.76 ± 49%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.01 ± 12%     +45.3%       0.02 ±  5%  perf-sched.total_sch_delay.average.ms
      2.91 ±  2%     +61.4%       4.69 ±  4%  perf-sched.total_wait_and_delay.average.ms
    556081 ±  2%     -37.0%     350186 ±  2%  perf-sched.total_wait_and_delay.count.ms
      2.89 ±  2%     +61.5%       4.67 ±  4%  perf-sched.total_wait_time.average.ms
      0.01 ±  6%     +35.6%       0.02 ±  3%  perf-sched.wait_and_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
     18.90 ±  3%     -15.5%      15.98        perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    541651 ±  2%     -37.0%     341352 ±  2%  perf-sched.wait_and_delay.count.futex_do_wait.__futex_wait.futex_wait.do_futex
     11.50 ± 18%     -84.1%       1.83 ±223%  perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    253.67 ±  3%     +17.1%     297.00        perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.09 ±166%   +1763.7%       1.74 ± 65%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     18.79 ±  3%     -15.6%      15.85        perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.15 ±186%   +1714.0%       2.76 ± 49%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     43.55            -1.5       42.06        perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     43.54            -1.5       42.04        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait
     43.83            -1.3       42.54        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     43.83            -1.3       42.54        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     43.76            -1.3       42.48        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     99.06            +0.2       99.25        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     99.05            +0.2       99.24        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     99.03            +0.2       99.22        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     99.02            +0.2       99.22        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.99            +1.1       56.14        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex
     55.02            +1.2       56.21        perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     55.19            +1.5       56.68        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     43.83            -1.3       42.54        perf-profile.children.cycles-pp.__futex_wait
     43.83            -1.3       42.54        perf-profile.children.cycles-pp.futex_wait
     43.76            -1.3       42.48        perf-profile.children.cycles-pp.futex_wait_setup
     98.55            -0.3       98.21        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     98.59            -0.3       98.28        perf-profile.children.cycles-pp._raw_spin_lock
      0.37            -0.1        0.26        perf-profile.children.cycles-pp.pthread_mutex_lock
      0.60 ±  3%      -0.1        0.49 ±  3%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.58 ±  3%      -0.1        0.47 ±  3%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.20 ±  5%      -0.1        0.10 ±  9%  perf-profile.children.cycles-pp.handle_softirqs
      0.18 ±  5%      -0.1        0.09 ±  6%  perf-profile.children.cycles-pp.sched_balance_domains
      0.21 ±  4%      -0.1        0.12 ±  4%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.17 ±  2%      -0.1        0.11 ±  3%  perf-profile.children.cycles-pp.common_startup_64
      0.17 ±  2%      -0.1        0.11 ±  3%  perf-profile.children.cycles-pp.cpu_startup_entry
      0.17 ±  2%      -0.1        0.11 ±  3%  perf-profile.children.cycles-pp.do_idle
      0.17 ±  2%      -0.1        0.11 ±  3%  perf-profile.children.cycles-pp.start_secondary
      0.11 ±  4%      -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.acpi_idle_do_entry
      0.11 ±  4%      -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.acpi_idle_enter
      0.11 ±  4%      -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.acpi_safe_halt
      0.11 ±  4%      -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.pv_native_safe_halt
      0.11 ±  4%      -0.0        0.08        perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.10            -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.__schedule
      0.11            -0.0        0.08 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter
      0.06 ±  7%      -0.0        0.03 ± 70%  perf-profile.children.cycles-pp.futex_do_wait
      0.11 ±  3%      -0.0        0.08 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
      0.11            -0.0        0.08        perf-profile.children.cycles-pp.cpuidle_idle_call
      0.08            -0.0        0.05 ±  7%  perf-profile.children.cycles-pp.sysvec_call_function_single
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.futex_q_unlock
      0.07            +0.1        0.12 ±  3%  perf-profile.children.cycles-pp.futex_q_lock
      0.00            +0.2        0.17        perf-profile.children.cycles-pp.futex_hash_put
     99.22            +0.2       99.40        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     99.22            +0.2       99.40        perf-profile.children.cycles-pp.do_syscall_64
     99.03            +0.2       99.22        perf-profile.children.cycles-pp.__x64_sys_futex
     99.02            +0.2       99.22        perf-profile.children.cycles-pp.do_futex
      0.00            +0.3        0.33        perf-profile.children.cycles-pp.futex_hash
     55.19            +1.5       56.68        perf-profile.children.cycles-pp.futex_wake
     97.95            -0.2       97.71        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.37            -0.1        0.26        perf-profile.self.cycles-pp.pthread_mutex_lock
      0.18 ±  4%      -0.1        0.09 ±  6%  perf-profile.self.cycles-pp.sched_balance_domains
      0.08            -0.0        0.06        perf-profile.self.cycles-pp.futex_wait_setup
      0.07            +0.0        0.12        perf-profile.self.cycles-pp.futex_q_lock
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.futex_q_unlock
      0.00            +0.1        0.08        perf-profile.self.cycles-pp._raw_spin_lock
      0.00            +0.2        0.17        perf-profile.self.cycles-pp.futex_hash_put
      0.00            +0.3        0.33        perf-profile.self.cycles-pp.futex_hash





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

