lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <202505291011.9fe37568-lkp@intel.com>
Date: Thu, 29 May 2025 12:50:44 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>, <oliver.sang@...el.com>
Subject: [linus:master] [futex]  7c4f75a21f:  will-it-scale.per_thread_ops
 98.3% regression



Hello,

kernel test robot noticed a 98.3% regression of will-it-scale.per_thread_ops on:


commit: 7c4f75a21f636486d2969d9b6680403ea8483539 ("futex: Allow automatic allocation of process wide futex hash")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      feacb1774bd5eac6382990d0f6d1378dc01dd78f]
[still regression on linux-next/master 64d12554715ce825d553caea123b7cb89e56237a]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 100%
	mode: thread
	test: futex4
	cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | perf-bench-futex: perf-bench-futex.ops/s  94.6% regression                                     |
| test machine     | 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory    |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | nr_task=100%                                                                                   |
|                  | runtime=300s                                                                                   |
|                  | test=hash                                                                                      |
+------------------+------------------------------------------------------------------------------------------------+
| testcase: change | phoronix-test-suite: phoronix-test-suite.speedb.SequentialFill.op_s  11.7% regression          |
| test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory |
| test parameters  | cpufreq_governor=performance                                                                   |
|                  | option_a=Sequential Fill                                                                       |
|                  | test=speedb-1.0.1                                                                              |
+------------------+------------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202505291011.9fe37568-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250529/202505291011.9fe37568-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-cpl-4sp2/futex4/will-it-scale

commit: 
  80367ad01d ("futex: Add basic infrastructure for local task local hash")
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")

80367ad01d93ac78 7c4f75a21f636486d2969d9b668 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    910593           +15.8%    1054733        meminfo.Shmem
     16.07           -98.9%       0.17 ± 11%  vmstat.cpu.us
      2757           -11.8%       2430        vmstat.system.cs
     23.55 ±  2%     -15.3%      19.94 ±  2%  sched_debug.cpu.clock.stddev
    836.03 ±  2%     -11.5%     740.19        sched_debug.cpu.nr_switches.min
      4174 ±  7%     -23.1%       3208 ±  9%  sched_debug.cpu.nr_switches.stddev
  8.23e+08           -98.3%   14313329        will-it-scale.224.threads
   3673940           -98.3%      63898        will-it-scale.per_thread_ops
  8.23e+08           -98.3%   14313329        will-it-scale.workload
      0.55 ±  3%      -0.1        0.45 ±  3%  mpstat.cpu.all.irq%
      0.00 ±  4%      -0.0        0.00 ±  3%  mpstat.cpu.all.soft%
     82.56           +16.0       98.60        mpstat.cpu.all.sys%
     16.28           -16.0        0.30        mpstat.cpu.all.usr%
      9.50 ± 41%   +9208.8%     884.33 ±  4%  perf-c2c.DRAM.local
    549.00 ± 48%  +22892.9%     126231        perf-c2c.DRAM.remote
    537.17 ± 16%  +10624.6%      57608        perf-c2c.HITM.local
    521.00 ± 51%  +14508.3%      76109        perf-c2c.HITM.remote
      1058 ± 22%  +12536.8%     133718        perf-c2c.HITM.total
    421424            +8.9%     458761        proc-vmstat.nr_active_anon
    194593            +0.7%     196003        proc-vmstat.nr_anon_pages
   1109547            +3.2%    1145260        proc-vmstat.nr_file_pages
     26083            +8.1%      28187        proc-vmstat.nr_mapped
      2411            +2.4%       2469        proc-vmstat.nr_page_table_pages
    227754           +15.8%     263666        proc-vmstat.nr_shmem
    421424            +8.9%     458761        proc-vmstat.nr_zone_active_anon
   1637120            +3.4%    1692600        proc-vmstat.numa_hit
   1289218            +4.3%    1344829        proc-vmstat.numa_local
    103162 ± 40%     -47.8%      53801 ± 45%  proc-vmstat.numa_pte_updates
   1765284            +4.4%    1842858        proc-vmstat.pgalloc_normal
      0.01 ± 80%  +83905.7%      11.27        perf-stat.i.MPKI
 1.382e+11           -95.2%  6.573e+09        perf-stat.i.branch-instructions
      0.01 ±  3%      +0.5        0.53        perf-stat.i.branch-miss-rate%
  14002328 ±  2%    +150.5%   35080788        perf-stat.i.branch-misses
   7228215 ± 89%   +4142.7%  3.067e+08 ±  2%  perf-stat.i.cache-misses
  19769349 ± 33%   +2364.5%  4.872e+08 ±  2%  perf-stat.i.cache-references
      2694           -11.9%       2373        perf-stat.i.context-switches
      1.26         +2375.2%      31.10        perf-stat.i.cpi
  7.62e+11           +11.1%  8.464e+11        perf-stat.i.cpu-cycles
    297.49            -5.9%     279.83        perf-stat.i.cpu-migrations
    457686 ± 86%     -99.4%       2758 ±  2%  perf-stat.i.cycles-between-cache-misses
 6.063e+11           -95.5%  2.726e+10        perf-stat.i.instructions
      0.80           -95.8%       0.03        perf-stat.i.ipc
      0.01 ± 31%     -71.3%       0.00 ±141%  perf-stat.i.major-faults
      0.01 ± 89%  +93952.9%      11.25        perf-stat.overall.MPKI
      0.01 ±  2%      +0.5        0.53        perf-stat.overall.branch-miss-rate%
      1.26         +2370.7%      31.05        perf-stat.overall.cpi
    298631 ± 78%     -99.1%       2761 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.80           -96.0%       0.03        perf-stat.overall.ipc
    222155          +158.4%     574069        perf-stat.overall.path-length
 1.377e+11           -95.2%  6.551e+09        perf-stat.ps.branch-instructions
  13951225 ±  2%    +150.5%   34950752        perf-stat.ps.branch-misses
   7204865 ± 89%   +4142.3%  3.057e+08 ±  2%  perf-stat.ps.cache-misses
  19741274 ± 33%   +2360.2%  4.857e+08 ±  2%  perf-stat.ps.cache-references
      2684           -11.9%       2364        perf-stat.ps.context-switches
 7.595e+11           +11.1%  8.436e+11        perf-stat.ps.cpu-cycles
    296.47            -6.0%     278.82        perf-stat.ps.cpu-migrations
 6.043e+11           -95.5%  2.717e+10        perf-stat.ps.instructions
      0.01 ± 31%     -71.4%       0.00 ±141%  perf-stat.ps.major-faults
 1.828e+14           -95.5%  8.216e+12        perf-stat.total.instructions
      0.04 ±  2%     +31.9%       0.05 ± 24%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.02 ± 15%    +115.4%       0.04 ± 20%  perf-sched.sch_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.02 ± 21%     -60.9%       0.01 ± 24%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.01 ± 10%     +90.2%       0.01 ± 19%  perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ± 12%   +1083.0%       0.10 ±138%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.01 ±  8%    +100.0%       0.01 ±  8%  perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      0.00          +175.0%       0.01 ± 14%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 14%    +178.0%       0.02 ± 36%  perf-sched.sch_delay.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.90 ±108%     -98.4%       0.01 ± 39%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.01 ±  7%    +125.5%       0.02 ± 32%  perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ± 33%  +19261.3%       3.84 ±140%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     98.63 ±  2%     +11.2%     109.65 ±  2%  perf-sched.total_wait_and_delay.average.ms
      3382 ± 15%     +22.9%       4155 ± 11%  perf-sched.total_wait_and_delay.max.ms
     98.21 ±  2%     +11.2%     109.24 ±  2%  perf-sched.total_wait_time.average.ms
      3382 ± 15%     +22.9%       4155 ± 11%  perf-sched.total_wait_time.max.ms
      4.83 ±  5%     -28.2%       3.47 ± 11%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     84.86 ±  3%     +29.2%     109.64 ± 17%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
     19.33 ±  6%    +103.7%      39.36 ± 15%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      2.58 ±  5%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.04 ± 22%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      1.88 ± 16%     -44.7%       1.04 ± 13%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      5.95 ±  2%     -21.0%       4.70        perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      1461 ±  6%     -49.8%     733.83 ± 17%  perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
    583.67 ±  3%    -100.0%       0.00        perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    347.33 ±  7%    -100.0%       0.00        perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
    808.83 ±  2%     +31.0%       1059        perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2390 ± 16%     -54.4%       1090 ± 18%  perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     21.69 ±101%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      2.44 ± 87%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
     34.72 ± 81%     -68.8%      10.84 ±107%  perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    215.67 ± 10%     -40.0%     129.34 ± 27%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      4.79 ±  6%     -28.6%       3.42 ± 11%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     84.85 ±  3%     +29.2%     109.62 ± 17%  perf-sched.wait_time.avg.ms.anon_pipe_read.fifo_pipe_read.vfs_read.ksys_read
     19.31 ±  6%    +103.7%      39.32 ± 15%  perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.02 ± 29%     -65.4%       0.01 ± 24%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      2.79 ± 10%     -21.2%       2.20 ±  9%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      1.83 ± 16%     -45.1%       1.01 ± 13%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      5.95 ±  2%     -21.1%       4.69        perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2389 ± 16%     -54.4%       1090 ± 18%  perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1.60 ±102%     -99.1%       0.01 ± 39%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      4.99           -36.8%       3.16 ± 38%  perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     34.41 ± 81%     -68.8%      10.74 ±106%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    215.66 ± 10%     -40.0%     129.32 ± 27%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     17.37           -17.4        0.00        perf-profile.calltrace.cycles-pp.clear_bhb_loop.syscall
     14.97           -15.0        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      5.13            -5.1        0.00        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      4.92            -3.3        1.61 ±  7%  perf-profile.calltrace.cycles-pp.futex_hash.futex_wait_setup.__futex_wait.futex_wait.do_futex
     99.06            +0.8       99.86        perf-profile.calltrace.cycles-pp.syscall
      3.14 ±  7%      +0.9        4.00        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      2.78 ±  2%      +2.0        4.73        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
     57.80           +41.6       99.43        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     55.37           +44.0       99.36        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     44.28           +54.9       99.20        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     40.78           +58.4       99.16        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     38.15           +61.0       99.14        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     35.04           +64.1       99.10        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     25.08           +73.9       99.00        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      2.96 ±  5%     +85.6       88.52        perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00           +87.8       87.81        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wait_setup.__futex_wait.futex_wait
     17.49           -17.3        0.20 ±  2%  perf-profile.children.cycles-pp.clear_bhb_loop
     10.98           -10.8        0.14        perf-profile.children.cycles-pp.entry_SYSCALL_64
      5.66            -5.6        0.08 ±  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.26            -5.2        0.09 ±  5%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      4.93            -3.3        1.61 ±  8%  perf-profile.children.cycles-pp.futex_hash
      2.88            -2.8        0.07        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.51 ±  2%      -0.1        0.40 ±  2%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.48 ±  2%      -0.1        0.38 ±  2%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.47 ±  2%      -0.1        0.37 ±  2%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.47 ±  2%      -0.1        0.37 ±  2%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.37 ±  3%      -0.1        0.29 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.30 ±  2%      -0.1        0.24 ±  3%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.25 ±  2%      -0.1        0.20 ±  2%  perf-profile.children.cycles-pp.update_process_times
      0.10            -0.0        0.08 ±  6%  perf-profile.children.cycles-pp.get_jiffies_update
      0.10            -0.0        0.08 ±  6%  perf-profile.children.cycles-pp.tmigr_requires_handle_remote
      0.08 ±  5%      -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.11            -0.0        0.10 ±  3%  perf-profile.children.cycles-pp.sched_tick
     99.26            +0.7       99.93        perf-profile.children.cycles-pp.syscall
      3.26 ±  6%      +0.7        4.00        perf-profile.children.cycles-pp.futex_q_lock
      2.78 ±  2%      +2.0        4.73        perf-profile.children.cycles-pp.futex_q_unlock
     58.20           +41.3       99.46        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     55.71           +43.7       99.39        perf-profile.children.cycles-pp.do_syscall_64
     44.58           +54.6       99.21        perf-profile.children.cycles-pp.__x64_sys_futex
     41.04           +58.1       99.17        perf-profile.children.cycles-pp.do_futex
     38.42           +60.7       99.14        perf-profile.children.cycles-pp.futex_wait
     35.33           +63.8       99.11        perf-profile.children.cycles-pp.__futex_wait
     25.71           +73.3       99.01        perf-profile.children.cycles-pp.futex_wait_setup
      3.08 ±  5%     +85.5       88.53        perf-profile.children.cycles-pp._raw_spin_lock
      0.02 ±142%     +87.8       87.82        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     17.38           -17.2        0.20 ±  2%  perf-profile.self.cycles-pp.clear_bhb_loop
      9.55            -9.5        0.10 ±  4%  perf-profile.self.cycles-pp.__futex_wait
      8.72            -8.6        0.12 ±  3%  perf-profile.self.cycles-pp.futex_wait_setup
      8.70            -8.6        0.12 ±  3%  perf-profile.self.cycles-pp.syscall
      5.61            -5.5        0.07        perf-profile.self.cycles-pp.entry_SYSCALL_64
      5.43            -5.4        0.07        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.54            -4.5        0.07        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      3.72            -3.7        0.04 ± 44%  perf-profile.self.cycles-pp.do_syscall_64
      4.76            -3.2        1.60 ±  8%  perf-profile.self.cycles-pp.futex_hash
      2.54            -2.5        0.03 ± 70%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.92 ±  4%      -2.2        0.71        perf-profile.self.cycles-pp._raw_spin_lock
      1.50            -1.4        0.05        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.10            -0.0        0.08 ±  6%  perf-profile.self.cycles-pp.get_jiffies_update
      0.08 ±  5%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      3.19 ±  6%      +0.8        3.98        perf-profile.self.cycles-pp.futex_q_lock
      2.60 ±  2%      +2.1        4.72        perf-profile.self.cycles-pp.futex_q_unlock
      0.02 ±142%     +87.5       87.48        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath


***************************************************************************************************
lkp-srf-2sp2: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/300s/lkp-srf-2sp2/hash/perf-bench-futex

commit: 
  80367ad01d ("futex: Add basic infrastructure for local task local hash")
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")

80367ad01d93ac78 7c4f75a21f636486d2969d9b668 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     79777 ±  9%     +29.6%     103404 ± 14%  sched_debug.cpu.avg_idle.stddev
     13.14           -92.4%       0.99        vmstat.cpu.us
     85.94           +12.6       98.54        mpstat.cpu.all.sys%
     13.40           -12.6        0.76        mpstat.cpu.all.usr%
    253330            +1.4%     256755        proc-vmstat.nr_active_anon
      2296            +2.2%       2346        proc-vmstat.nr_page_table_pages
     77274            +4.5%      80782        proc-vmstat.nr_shmem
    253330            +1.4%     256755        proc-vmstat.nr_zone_active_anon
   2667058           -94.6%     144593        perf-bench-futex.ops/s
      0.06 ± 13%      +0.2        0.21 ± 14%  perf-bench-futex.stddev%
    229015            -3.4%     221126        perf-bench-futex.time.involuntary_context_switches
     49696           +14.7%      57010        perf-bench-futex.time.system_time
      7728           -94.6%     416.35        perf-bench-futex.time.user_time
      0.74           +90.7%       1.40        perf-stat.i.MPKI
 5.333e+10           -82.2%   9.48e+09        perf-stat.i.branch-instructions
      0.02 ± 44%      +0.4        0.41        perf-stat.i.branch-miss-rate%
   9538223 ± 47%    +310.1%   39118125        perf-stat.i.branch-misses
     50.17           -14.2       35.98        perf-stat.i.cache-miss-rate%
 2.424e+08           -74.7%   61296533        perf-stat.i.cache-misses
 4.833e+08           -64.7%  1.706e+08        perf-stat.i.cache-references
      1.86          +653.3%      13.99        perf-stat.i.cpi
    249.82            -4.0%     239.71        perf-stat.i.cpu-migrations
      2522          +295.4%       9974        perf-stat.i.cycles-between-cache-misses
 3.295e+11           -86.7%  4.369e+10        perf-stat.i.instructions
      0.54           -86.7%       0.07        perf-stat.i.ipc
      0.74           +90.7%       1.40        perf-stat.overall.MPKI
      0.02 ± 47%      +0.4        0.41        perf-stat.overall.branch-miss-rate%
     50.15           -14.2       35.93        perf-stat.overall.cache-miss-rate%
      1.86          +654.2%      14.00        perf-stat.overall.cpi
      2522          +295.6%       9979        perf-stat.overall.cycles-between-cache-misses
      0.54           -86.7%       0.07        perf-stat.overall.ipc
 5.316e+10           -82.2%  9.448e+09        perf-stat.ps.branch-instructions
   9509524 ± 47%    +310.0%   38990460        perf-stat.ps.branch-misses
 2.416e+08           -74.7%   61091933        perf-stat.ps.cache-misses
 4.817e+08           -64.7%    1.7e+08        perf-stat.ps.cache-references
    249.00            -4.0%     238.92        perf-stat.ps.cpu-migrations
 3.284e+11           -86.7%  4.354e+10        perf-stat.ps.instructions
  9.88e+13           -86.7%   1.31e+13        perf-stat.total.instructions
      0.02           +52.8%       0.04 ±  9%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.01           +80.0%       0.01 ±  9%  perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ± 13%     +31.1%       0.03 ± 18%  perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      0.01 ±  7%    +122.5%       0.01 ± 47%  perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±142%    +566.7%       0.01 ± 39%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.__flush_work.__lru_add_drain_all
      0.00          +204.2%       0.01 ±  7%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 22%    +147.9%       0.02 ± 25%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ± 12%    +298.1%       0.04 ± 29%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.36 ± 50%     -70.5%       0.11 ± 64%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.01 ±  5%    +145.6%       0.02 ± 64%  perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±142%    +566.7%       0.01 ± 39%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.__flush_work.__lru_add_drain_all
      0.02 ± 58%    +318.7%       0.07 ± 27%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    125.67 ±  2%     -11.1%     111.69 ±  2%  perf-sched.total_wait_and_delay.average.ms
    125.61 ±  2%     -11.1%     111.63 ±  2%  perf-sched.total_wait_time.average.ms
     37.18 ± 15%     +36.8%      50.87 ± 10%  perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.24 ± 23%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.05 ± 26%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      7.31 ±  5%     -29.9%       5.12 ±  3%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    547.19           -12.1%     480.98        perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    771.00 ± 14%     -27.7%     557.50 ± 10%  perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
    247.83 ±  4%    -100.0%       0.00        perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    145.17 ± 31%    -100.0%       0.00        perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
    632.33 ±  5%     +53.9%     973.00 ±  3%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      3727           +11.5%       4155 ±  2%  perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      8.43 ± 63%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.72 ± 50%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
    530.84 ±  4%     -44.9%     292.50 ± 10%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     37.16 ± 14%     +36.8%      50.84 ± 10%  perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      3.12 ± 12%     -31.8%       2.13 ± 13%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      7.30 ±  5%     -30.0%       5.11 ±  3%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    547.19           -12.1%     480.97        perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.36 ± 50%     -70.5%       0.11 ± 64%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      4.83 ±  7%     -37.9%       3.00 ± 37%  perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    530.83 ±  4%     -44.9%     292.49 ± 10%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread



***************************************************************************************************
lkp-icl-2sp5: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/Sequential Fill/debian-12-x86_64-phoronix/lkp-icl-2sp5/speedb-1.0.1/phoronix-test-suite

commit: 
  80367ad01d ("futex: Add basic infrastructure for local task local hash")
  7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")

80367ad01d93ac78 7c4f75a21f636486d2969d9b668 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 6.085e+10           +14.7%  6.979e+10        cpuidle..time
    832.35           +11.0%     923.95        uptime.boot
     75739           +11.9%      84762        uptime.idle
    745.32           -11.5%     659.62        vmstat.io.bi
   1256218            -5.3%    1190033        vmstat.system.cs
   1512066            -4.4%    1445260        proc-vmstat.nr_active_anon
   1758143            -3.9%    1688770        proc-vmstat.nr_file_pages
     48547            -5.9%      45679        proc-vmstat.nr_mapped
   1147755            -6.1%    1078202        proc-vmstat.nr_shmem
   1512066            -4.4%    1445260        proc-vmstat.nr_zone_active_anon
   1252996 ±  7%     +49.6%    1875006 ± 14%  proc-vmstat.numa_pte_updates
      0.56           +10.4%       0.62        perf-sched.total_wait_and_delay.average.ms
   3397230            -9.6%    3070064        perf-sched.total_wait_and_delay.count.ms
      0.56           +10.5%       0.62        perf-sched.total_wait_time.average.ms
      0.18           +10.9%       0.20        perf-sched.wait_and_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
   3388022            -9.6%    3061182        perf-sched.wait_and_delay.count.futex_do_wait.__futex_wait.futex_wait.do_futex
      1238           +11.3%       1378        perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.18           +11.2%       0.20        perf-sched.wait_time.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
      1238           +11.3%       1378        perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    554832           -11.7%     490186        phoronix-test-suite.speedb.SequentialFill.op_s
    713.28           +12.8%     804.49        phoronix-test-suite.time.elapsed_time
    713.28           +12.8%     804.49        phoronix-test-suite.time.elapsed_time.max
    258734 ±  3%     -14.9%     220243 ±  7%  phoronix-test-suite.time.involuntary_context_switches
      4069            -3.4%       3931        phoronix-test-suite.time.percent_of_cpu_this_job_got
     18615           +12.0%      20857        phoronix-test-suite.time.system_time
     10416            +3.5%      10776        phoronix-test-suite.time.user_time
 4.488e+08            +6.8%  4.792e+08        phoronix-test-suite.time.voluntary_context_switches
      0.36            +8.0%       0.39        perf-stat.i.MPKI
  27161568            -2.5%   26475937        perf-stat.i.branch-misses
     28.74            +1.8       30.50        perf-stat.i.cache-miss-rate%
  53609412            +7.0%   57337744        perf-stat.i.cache-misses
   1262748            -5.3%    1195372        perf-stat.i.context-switches
      0.98            -3.1%       0.95        perf-stat.i.cpi
  1.46e+11            -3.5%  1.408e+11        perf-stat.i.cpu-cycles
      2826            -9.6%       2556        perf-stat.i.cycles-between-cache-misses
      0.03            -0.0        0.03 ±  4%  perf-stat.i.dTLB-load-miss-rate%
   4282480 ±  2%      -7.6%    3958135        perf-stat.i.dTLB-load-misses
      0.01 ±  2%      -0.0        0.01 ±  4%  perf-stat.i.dTLB-store-miss-rate%
    636770 ±  4%     -24.1%     483093 ±  2%  perf-stat.i.dTLB-store-misses
      1.06            +3.2%       1.10        perf-stat.i.ipc
      0.28           -12.2%       0.24 ±  7%  perf-stat.i.major-faults
      1.14            -3.5%       1.10        perf-stat.i.metric.GHz
    257.79            +7.6%     277.26        perf-stat.i.metric.K/sec
      7866 ±  2%      -7.8%       7256        perf-stat.i.minor-faults
  16753987            +4.7%   17536138        perf-stat.i.node-load-misses
   6529414 ±  2%     +16.2%    7589233        perf-stat.i.node-store-misses
   4718305 ±  2%     +20.1%    5666449        perf-stat.i.node-stores
      7866 ±  2%      -7.8%       7256        perf-stat.i.page-faults
      0.34            +7.3%       0.37        perf-stat.overall.MPKI
      0.07            -0.0        0.07        perf-stat.overall.branch-miss-rate%
     28.91            +1.8       30.67        perf-stat.overall.cache-miss-rate%
      0.93            -3.1%       0.90        perf-stat.overall.cpi
      2722            -9.8%       2457        perf-stat.overall.cycles-between-cache-misses
      0.01            -0.0        0.01        perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  4%      -0.0        0.00 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
      1.08            +3.3%       1.11        perf-stat.overall.ipc
  27129347            -2.5%   26448313        perf-stat.ps.branch-misses
  53525176            +7.0%   57254557        perf-stat.ps.cache-misses
   1260670            -5.3%    1193642        perf-stat.ps.context-switches
 1.457e+11            -3.5%  1.406e+11        perf-stat.ps.cpu-cycles
   4278010 ±  2%      -7.6%    3953474        perf-stat.ps.dTLB-load-misses
    635974 ±  4%     -24.1%     482580 ±  2%  perf-stat.ps.dTLB-store-misses
      0.28           -12.1%       0.25 ±  7%  perf-stat.ps.major-faults
      7861 ±  2%      -7.8%       7251        perf-stat.ps.minor-faults
  16727387            +4.7%   17510697        perf-stat.ps.node-load-misses
   6518975 ±  2%     +16.2%    7578033        perf-stat.ps.node-store-misses
   4711090 ±  2%     +20.1%    5658530        perf-stat.ps.node-stores
      7861 ±  2%      -7.8%       7251        perf-stat.ps.page-faults
 1.122e+14           +12.4%  1.261e+14        perf-stat.total.instructions
      2.18 ± 12%     -29.6%       1.54 ± 15%  sched_debug.cfs_rq:/.load_avg.min
     64.68 ± 20%     -41.6%      37.77 ± 16%  sched_debug.cfs_rq:/.runnable_avg.min
     64.69 ± 20%     -41.6%      37.76 ± 16%  sched_debug.cfs_rq:/.util_avg.min
   4792810            +9.9%    5265167        sched_debug.cfs_rq:/system.slice.avg_vruntime.min
      9.18 ±  9%     -11.1%       8.16 ±  3%  sched_debug.cfs_rq:/system.slice.load_avg.avg
      2.31 ± 16%     -25.1%       1.73 ± 19%  sched_debug.cfs_rq:/system.slice.load_avg.min
   4792810            +9.9%    5265167        sched_debug.cfs_rq:/system.slice.min_vruntime.min
     64.62 ± 20%     -41.6%      37.73 ± 15%  sched_debug.cfs_rq:/system.slice.runnable_avg.min
      1.43 ± 18%     -45.1%       0.79 ± 20%  sched_debug.cfs_rq:/system.slice.se->avg.load_avg.min
     64.58 ± 20%     -41.6%      37.70 ± 16%  sched_debug.cfs_rq:/system.slice.se->avg.runnable_avg.min
     64.61 ± 20%     -41.6%      37.70 ± 16%  sched_debug.cfs_rq:/system.slice.se->avg.util_avg.min
    445227           +13.5%     505275        sched_debug.cfs_rq:/system.slice.se->exec_start.avg
    445590           +13.5%     505640        sched_debug.cfs_rq:/system.slice.se->exec_start.max
    437889           +13.7%     497827        sched_debug.cfs_rq:/system.slice.se->exec_start.min
    103036           +14.2%     117717        sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.avg
    113992 ±  3%     +11.4%     127019        sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.max
    101120           +14.3%     115617        sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.min
      2.28 ± 12%     -23.2%       1.75 ± 16%  sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.min
     64.65 ± 20%     -41.6%      37.73 ± 16%  sched_debug.cfs_rq:/system.slice.util_avg.min
    445111           +13.5%     505165        sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.avg
    445426           +13.5%     505520        sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.max
    442897           +13.5%     502644        sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.min
   4860094            +9.9%    5339463        sched_debug.cfs_rq:/system.slice/containerd.service.se->vruntime.avg
   4805849           +10.0%    5287162        sched_debug.cfs_rq:/system.slice/containerd.service.se->vruntime.min
    102970           +14.2%     117638        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.avg
    113941 ±  3%     +11.3%     126851        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.max
    101055           +14.3%     115547        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.min
     69.43 ± 13%     -43.9%      38.94 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.load_avg.min
    102970           +14.2%     117638        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.avg
    113941 ±  3%     +11.3%     126851        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.max
    101055           +14.3%     115547        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.min
     69.17 ± 14%     -44.2%      38.61 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.runnable_avg.min
      1.22 ± 19%     -49.4%       0.62 ± 17%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.load_avg.min
     67.61 ± 14%     -44.7%      37.38 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.runnable_avg.min
     67.64 ± 14%     -44.7%      37.38 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.util_avg.min
    445226           +13.5%     505279        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.avg
    445589           +13.5%     505640        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.max
    437889           +13.7%     497980        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.min
    102976           +14.2%     117644        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.avg
    113947 ±  3%     +11.3%     126858        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.max
    101061           +14.3%     115553        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.min
   4792828            +9.9%    5266259        sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->vruntime.min
     71.67 ± 20%     -47.8%      37.38 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg_contrib.min
     69.17 ± 14%     -44.2%      38.60 ± 16%  sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.util_avg.min
      0.12 ± 33%     -42.9%       0.07 ± 57%  sched_debug.cfs_rq:/system.slice/redis-server.service.load_avg.max
    445113           +13.5%     505199        sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.avg
    445251           +13.5%     505317        sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.max
    444977           +13.5%     505056        sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.min
   4855865            +9.8%    5333093        sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.avg
   4871037            +9.8%    5349946        sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.max
   4840982            +9.8%    5317191        sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.min
      0.14 ± 28%     -48.6%       0.07 ± 57%  sched_debug.cfs_rq:/system.slice/redis-server.service.tg_load_avg.max
      0.12 ± 33%     -42.9%       0.07 ± 57%  sched_debug.cfs_rq:/system.slice/redis-server.service.tg_load_avg_contrib.max
    447837           +13.5%     508303        sched_debug.cpu.clock.avg
    447843           +13.5%     508310        sched_debug.cpu.clock.max
    447830           +13.5%     508297        sched_debug.cpu.clock.min
    445243           +13.5%     505290        sched_debug.cpu.clock_task.avg
    445599           +13.5%     505645        sched_debug.cpu.clock_task.max
    437708           +13.7%     497758        sched_debug.cpu.clock_task.min
      3266 ±  3%     +10.9%       3623 ±  5%  sched_debug.cpu.curr->pid.avg
     13788           +11.1%      15317        sched_debug.cpu.curr->pid.max
      4619           +14.6%       5292 ±  3%  sched_debug.cpu.curr->pid.stddev
   3215856           +12.5%    3616881        sched_debug.cpu.nr_switches.avg
   3332744           +11.7%    3724017        sched_debug.cpu.nr_switches.max
   3037135           +14.3%    3470789        sched_debug.cpu.nr_switches.min
      0.01 ± 10%     -18.9%       0.01 ± 19%  sched_debug.cpu.nr_uninterruptible.avg
    447830           +13.5%     508297        sched_debug.cpu_clk
    447123           +13.5%     507589        sched_debug.ktime
    448701           +13.5%     509200        sched_debug.sched_clk
     85.47            -4.0       81.46 ±  6%  perf-profile.calltrace.cycles-pp.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
     85.62            -4.0       81.62 ±  6%  perf-profile.calltrace.cycles-pp.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
     85.87            -4.0       81.88 ±  5%  perf-profile.calltrace.cycles-pp.rocksdb::Benchmark::ThreadBody
     85.85            -4.0       81.86 ±  5%  perf-profile.calltrace.cycles-pp.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
      2.27 ± 15%      -0.8        1.46 ±  8%  perf-profile.calltrace.cycles-pp.rocksdb::WriteThread::JoinBatchGroup.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
      2.27 ± 15%      -0.8        1.46 ±  8%  perf-profile.calltrace.cycles-pp.rocksdb::WriteThread::LinkOne.rocksdb::WriteThread::JoinBatchGroup.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite
      7.95            -0.7        7.23 ± 10%  perf-profile.calltrace.cycles-pp.clear_bhb_loop.__sched_yield.rocksdb::WriteThread::CompleteParallelMemTableWriter.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
      3.48            -0.3        3.14 ± 10%  perf-profile.calltrace.cycles-pp.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.63 ±  2%      -0.2        0.39 ± 70%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__sched_yield.rocksdb::WriteThread::CompleteParallelMemTableWriter.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
      1.24            +0.7        1.98 ± 32%  perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
      0.00            +0.7        0.74 ± 27%  perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      1.25            +0.7        2.00 ± 31%  perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
      1.29            +0.8        2.04 ± 31%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
      1.30            +0.8        2.05 ± 31%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
      0.00            +0.8        0.76 ± 23%  perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl
      1.47            +0.8        2.26 ± 31%  perf-profile.calltrace.cycles-pp.pthread_cond_signal
      0.00            +0.8        0.80 ± 22%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
      2.02            +0.8        2.83 ± 26%  perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +0.8        0.80 ± 23%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite
      2.05            +0.8        2.87 ± 26%  perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.06            +0.8        2.88 ± 26%  perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.07            +0.8        2.92 ± 26%  perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.0        0.95 ± 48%  perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState
      2.33            +1.0        3.38 ± 26%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.32            +1.0        3.37 ± 26%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.23            +1.7        2.92 ± 37%  perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     85.49            -4.0       81.48 ±  6%  perf-profile.children.cycles-pp.rocksdb::DBImpl::WriteImpl
     85.62            -4.0       81.63 ±  6%  perf-profile.children.cycles-pp.rocksdb::DBImpl::Write
     85.87            -4.0       81.88 ±  5%  perf-profile.children.cycles-pp.rocksdb::Benchmark::ThreadBody
     85.86            -4.0       81.87 ±  5%  perf-profile.children.cycles-pp.rocksdb::Benchmark::DoWrite
      2.34 ± 15%      -0.8        1.52 ±  8%  perf-profile.children.cycles-pp.rocksdb::WriteThread::LinkOne
      2.28 ± 15%      -0.8        1.46 ±  8%  perf-profile.children.cycles-pp.rocksdb::WriteThread::JoinBatchGroup
      8.24            -0.7        7.56 ±  9%  perf-profile.children.cycles-pp.clear_bhb_loop
      3.85            -0.4        3.47 ± 10%  perf-profile.children.cycles-pp.do_sched_yield
      0.80            -0.1        0.71 ±  9%  perf-profile.children.cycles-pp.raw_spin_rq_unlock
      0.26 ± 10%      -0.1        0.20 ± 15%  perf-profile.children.cycles-pp.sched_balance_newidle
      0.46 ±  2%      -0.1        0.40 ± 11%  perf-profile.children.cycles-pp.yield_task_fair
      0.31            -0.1        0.26 ±  3%  perf-profile.children.cycles-pp.pthread_cond_destroy
      0.16 ± 12%      -0.0        0.11 ± 20%  perf-profile.children.cycles-pp.pthread_rwlock_rdlock
      0.10 ±  6%      +0.0        0.14 ± 25%  perf-profile.children.cycles-pp.start_dl_timer
      0.06 ±  7%      +0.0        0.10 ± 25%  perf-profile.children.cycles-pp.rseq_ip_fixup
      0.10            +0.0        0.14 ± 27%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      0.03 ± 70%      +0.0        0.08 ± 28%  perf-profile.children.cycles-pp.switch_hrtimer_base
      0.20 ±  2%      +0.1        0.27 ± 28%  perf-profile.children.cycles-pp.enqueue_dl_entity
      0.00            +0.1        0.09 ± 29%  perf-profile.children.cycles-pp.switch_fpu_return
      0.00            +0.1        0.09 ± 34%  perf-profile.children.cycles-pp.wake_q_add_safe
      0.14 ±  7%      +0.1        0.24 ± 34%  perf-profile.children.cycles-pp.futex_q_lock
      0.00            +0.2        0.18 ± 28%  perf-profile.children.cycles-pp.plist_add
      0.00            +0.2        0.20 ± 28%  perf-profile.children.cycles-pp.__futex_queue
      0.00            +0.2        0.25 ± 31%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.16 ±  2%      +0.3        0.51 ± 33%  perf-profile.children.cycles-pp.futex_wake_mark
      0.00            +0.4        0.40 ± 35%  perf-profile.children.cycles-pp.__futex_unqueue
      0.29            +0.5        0.75 ± 27%  perf-profile.children.cycles-pp.futex_wait_setup
      3.70            +0.5        4.22        perf-profile.children.cycles-pp._raw_spin_lock
      1.51            +0.8        2.31 ± 31%  perf-profile.children.cycles-pp.pthread_cond_signal
      2.04            +0.8        2.83 ± 26%  perf-profile.children.cycles-pp.__futex_wait
      2.05            +0.8        2.87 ± 26%  perf-profile.children.cycles-pp.futex_wait
      1.36            +1.7        3.06 ± 32%  perf-profile.children.cycles-pp.futex_wake
      3.44            +2.5        5.96 ± 29%  perf-profile.children.cycles-pp.do_futex
      3.47            +2.5        6.01 ± 29%  perf-profile.children.cycles-pp.__x64_sys_futex
      2.33 ± 15%      -0.8        1.51 ±  8%  perf-profile.self.cycles-pp.rocksdb::WriteThread::LinkOne
      8.14            -0.7        7.46 ±  9%  perf-profile.self.cycles-pp.clear_bhb_loop
      5.63            -0.3        5.31 ±  6%  perf-profile.self.cycles-pp.__schedule
      1.24            -0.2        1.08 ± 10%  perf-profile.self.cycles-pp.do_sched_yield
      0.54            -0.1        0.48 ± 10%  perf-profile.self.cycles-pp.raw_spin_rq_unlock
      0.30 ±  2%      -0.1        0.25 ±  5%  perf-profile.self.cycles-pp.pthread_cond_destroy
      0.16 ± 12%      -0.0        0.11 ± 20%  perf-profile.self.cycles-pp.pthread_rwlock_rdlock
      0.06            -0.0        0.03 ± 70%  perf-profile.self.cycles-pp.rocksdb::WriteThread::SetState
      0.08 ±  4%      +0.0        0.12 ± 35%  perf-profile.self.cycles-pp.set_next_entity
      0.00            +0.1        0.08 ± 29%  perf-profile.self.cycles-pp.switch_fpu_return
      0.00            +0.1        0.09 ± 34%  perf-profile.self.cycles-pp.wake_q_add_safe
      0.14 ±  9%      +0.1        0.24 ± 34%  perf-profile.self.cycles-pp.futex_q_lock
      0.08 ±  8%      +0.1        0.20 ± 75%  perf-profile.self.cycles-pp.ktime_get
      0.00            +0.2        0.18 ± 29%  perf-profile.self.cycles-pp.plist_add
      0.00            +0.2        0.25 ± 31%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      3.46            +0.3        3.78 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock
      0.00            +0.4        0.37 ± 36%  perf-profile.self.cycles-pp.__futex_unqueue
      0.32 ±  3%      +0.8        1.10 ± 34%  perf-profile.self.cycles-pp.futex_wake





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ