Message-ID: <202505191432.b25b9c1f-lkp@intel.com>
Date: Mon, 19 May 2025 14:58:14 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
<oliver.sang@...el.com>
Subject: [tip:locking/futex] [futex] 7c4f75a21f:
phoronix-test-suite.speedb.SequentialFill.op_s 11.7% regression
Hello,
kernel test robot noticed an 11.7% regression of phoronix-test-suite.speedb.SequentialFill.op_s on:
commit: 7c4f75a21f636486d2969d9b6680403ea8483539 ("futex: Allow automatic allocation of process wide futex hash")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/futex
[test failed on linux-next/master 484803582c77061b470ac64a634f25f89715be3f]
testcase: phoronix-test-suite
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:
test: speedb-1.0.1
option_a: Sequential Fill
cpufreq_governor: performance
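For context, 7c4f75a21f has the kernel automatically allocate a
process-private futex hash once a process becomes multi-threaded, so
PRIVATE futex operations (the kind issued by pthread mutexes and
condition variables, which dominate the profile below) key into that
per-process table instead of the global one. The following is a minimal,
illustrative C sketch of that private wait/wake pattern; it is not part
of this report's test case, and assumes a glibc/x86_64 build with
gcc -pthread:

#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int futex_word;   /* 0 = not ready, 1 = ready */

/* No glibc wrapper exists for futex(2); the *_PRIVATE ops are the
 * ones that key into the per-process hash the commit auto-allocates. */
static long futex(atomic_int *uaddr, int op, int val)
{
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void *waiter(void *arg)
{
        /* Sleep only while the word is still 0; the in-kernel
         * recheck of the value prevents a lost wakeup. */
        while (atomic_load(&futex_word) == 0)
                futex(&futex_word, FUTEX_WAIT_PRIVATE, 0);
        return NULL;
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, waiter, NULL); /* now multi-threaded */
        sleep(1);
        atomic_store(&futex_word, 1);
        futex(&futex_word, FUTEX_WAKE_PRIVATE, 1); /* wake one waiter */
        pthread_join(t, NULL);
        puts("waiter woken");
        return 0;
}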
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | perf-bench-futex: perf-bench-futex.ops/s 94.6% regression |
| test machine | 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory |
| test parameters | cpufreq_governor=performance |
| | nr_task=100% |
| | runtime=300s |
| | test=hash |
+------------------+---------------------------------------------------------------------------------------------+
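The parameters above correspond to running perf's futex-hash
microbenchmark (tools/perf/bench/futex-hash.c) on all 192 CPUs for 300s.
As a rough, simplified sketch of what that workload stresses (the thread
and iteration counts here are arbitrary placeholders, not the lkp job's
values): each thread hammers FUTEX_WAIT with an expected value that
never matches, so every call returns EAGAIN immediately after one
futex-hash lookup and one hash-bucket lock acquisition, i.e. exactly the
path whose table the commit replaces. Build with gcc -O2 -pthread:

#include <linux/futex.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define NTHREADS 8          /* placeholder; nr_task=100% means 192 */
#define ITERS    1000000L   /* placeholder iteration count */

static void *hash_loop(void *arg)
{
        uint32_t *word = arg;
        long i;

        for (i = 0; i < ITERS; i++)
                /* Expected value 1234 never matches *word (0), so
                 * FUTEX_WAIT returns EAGAIN at once: each iteration
                 * costs one hash lookup plus one bucket lock. */
                syscall(SYS_futex, word, FUTEX_WAIT_PRIVATE, 1234,
                        NULL, NULL, 0);
        return NULL;
}

int main(void)
{
        static uint32_t words[NTHREADS]; /* one private futex each */
        pthread_t t[NTHREADS];
        struct timespec a, b;
        double secs;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (i = 0; i < NTHREADS; i++)
                pthread_create(&t[i], NULL, hash_loop, &words[i]);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(t[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);

        secs = (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
        printf("%.0f ops/s\n", (double)NTHREADS * ITERS / secs);
        return 0;
}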
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202505191432.b25b9c1f-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250519/202505191432.b25b9c1f-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/Sequential Fill/debian-12-x86_64-phoronix/lkp-icl-2sp5/speedb-1.0.1/phoronix-test-suite
commit:
80367ad01d ("futex: Add basic infrastructure for local task local hash")
7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")
80367ad01d93ac78 7c4f75a21f636486d2969d9b668
---------------- ---------------------------
%stddev %change %stddev
\ | \
6.085e+10 +14.7% 6.979e+10 cpuidle..time
832.35 +11.0% 923.95 uptime.boot
75739 +11.9% 84762 uptime.idle
745.32 -11.5% 659.62 vmstat.io.bi
1256218 -5.3% 1190033 vmstat.system.cs
1512066 -4.4% 1445260 proc-vmstat.nr_active_anon
1758143 -3.9% 1688770 proc-vmstat.nr_file_pages
48547 -5.9% 45679 proc-vmstat.nr_mapped
1147755 -6.1% 1078202 proc-vmstat.nr_shmem
1512066 -4.4% 1445260 proc-vmstat.nr_zone_active_anon
1252996 ± 7% +49.6% 1875006 ± 14% proc-vmstat.numa_pte_updates
0.56 +10.4% 0.62 perf-sched.total_wait_and_delay.average.ms
3397230 -9.6% 3070064 perf-sched.total_wait_and_delay.count.ms
0.56 +10.5% 0.62 perf-sched.total_wait_time.average.ms
0.18 +10.9% 0.20 perf-sched.wait_and_delay.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
3388022 -9.6% 3061182 perf-sched.wait_and_delay.count.futex_do_wait.__futex_wait.futex_wait.do_futex
1238 +11.3% 1378 perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
0.18 +11.2% 0.20 perf-sched.wait_time.avg.ms.futex_do_wait.__futex_wait.futex_wait.do_futex
1238 +11.3% 1378 perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
554832 -11.7% 490186 phoronix-test-suite.speedb.SequentialFill.op_s
713.28 +12.8% 804.49 phoronix-test-suite.time.elapsed_time
713.28 +12.8% 804.49 phoronix-test-suite.time.elapsed_time.max
258734 ± 3% -14.9% 220243 ± 7% phoronix-test-suite.time.involuntary_context_switches
4069 -3.4% 3931 phoronix-test-suite.time.percent_of_cpu_this_job_got
18615 +12.0% 20857 phoronix-test-suite.time.system_time
10416 +3.5% 10776 phoronix-test-suite.time.user_time
4.488e+08 +6.8% 4.792e+08 phoronix-test-suite.time.voluntary_context_switches
0.36 +8.0% 0.39 perf-stat.i.MPKI
27161568 -2.5% 26475937 perf-stat.i.branch-misses
28.74 +1.8 30.50 perf-stat.i.cache-miss-rate%
53609412 +7.0% 57337744 perf-stat.i.cache-misses
1262748 -5.3% 1195372 perf-stat.i.context-switches
0.98 -3.1% 0.95 perf-stat.i.cpi
1.46e+11 -3.5% 1.408e+11 perf-stat.i.cpu-cycles
2826 -9.6% 2556 perf-stat.i.cycles-between-cache-misses
0.03 -0.0 0.03 ± 4% perf-stat.i.dTLB-load-miss-rate%
4282480 ± 2% -7.6% 3958135 perf-stat.i.dTLB-load-misses
0.01 ± 2% -0.0 0.01 ± 4% perf-stat.i.dTLB-store-miss-rate%
636770 ± 4% -24.1% 483093 ± 2% perf-stat.i.dTLB-store-misses
1.06 +3.2% 1.10 perf-stat.i.ipc
0.28 -12.2% 0.24 ± 7% perf-stat.i.major-faults
1.14 -3.5% 1.10 perf-stat.i.metric.GHz
257.79 +7.6% 277.26 perf-stat.i.metric.K/sec
7866 ± 2% -7.8% 7256 perf-stat.i.minor-faults
16753987 +4.7% 17536138 perf-stat.i.node-load-misses
6529414 ± 2% +16.2% 7589233 perf-stat.i.node-store-misses
4718305 ± 2% +20.1% 5666449 perf-stat.i.node-stores
7866 ± 2% -7.8% 7256 perf-stat.i.page-faults
0.34 +7.3% 0.37 perf-stat.overall.MPKI
0.07 -0.0 0.07 perf-stat.overall.branch-miss-rate%
28.91 +1.8 30.67 perf-stat.overall.cache-miss-rate%
0.93 -3.1% 0.90 perf-stat.overall.cpi
2722 -9.8% 2457 perf-stat.overall.cycles-between-cache-misses
0.01 -0.0 0.01 perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 4% -0.0 0.00 ± 2% perf-stat.overall.dTLB-store-miss-rate%
1.08 +3.3% 1.11 perf-stat.overall.ipc
27129347 -2.5% 26448313 perf-stat.ps.branch-misses
53525176 +7.0% 57254557 perf-stat.ps.cache-misses
1260670 -5.3% 1193642 perf-stat.ps.context-switches
1.457e+11 -3.5% 1.406e+11 perf-stat.ps.cpu-cycles
4278010 ± 2% -7.6% 3953474 perf-stat.ps.dTLB-load-misses
635974 ± 4% -24.1% 482580 ± 2% perf-stat.ps.dTLB-store-misses
0.28 -12.1% 0.25 ± 7% perf-stat.ps.major-faults
7861 ± 2% -7.8% 7251 perf-stat.ps.minor-faults
16727387 +4.7% 17510697 perf-stat.ps.node-load-misses
6518975 ± 2% +16.2% 7578033 perf-stat.ps.node-store-misses
4711090 ± 2% +20.1% 5658530 perf-stat.ps.node-stores
7861 ± 2% -7.8% 7251 perf-stat.ps.page-faults
1.122e+14 +12.4% 1.261e+14 perf-stat.total.instructions
2.18 ± 12% -29.6% 1.54 ± 15% sched_debug.cfs_rq:/.load_avg.min
64.68 ± 20% -41.6% 37.77 ± 16% sched_debug.cfs_rq:/.runnable_avg.min
64.69 ± 20% -41.6% 37.76 ± 16% sched_debug.cfs_rq:/.util_avg.min
4792810 +9.9% 5265167 sched_debug.cfs_rq:/system.slice.avg_vruntime.min
9.18 ± 9% -11.1% 8.16 ± 3% sched_debug.cfs_rq:/system.slice.load_avg.avg
2.31 ± 16% -25.1% 1.73 ± 19% sched_debug.cfs_rq:/system.slice.load_avg.min
4792810 +9.9% 5265167 sched_debug.cfs_rq:/system.slice.min_vruntime.min
64.62 ± 20% -41.6% 37.73 ± 15% sched_debug.cfs_rq:/system.slice.runnable_avg.min
1.43 ± 18% -45.1% 0.79 ± 20% sched_debug.cfs_rq:/system.slice.se->avg.load_avg.min
64.58 ± 20% -41.6% 37.70 ± 16% sched_debug.cfs_rq:/system.slice.se->avg.runnable_avg.min
64.61 ± 20% -41.6% 37.70 ± 16% sched_debug.cfs_rq:/system.slice.se->avg.util_avg.min
445227 +13.5% 505275 sched_debug.cfs_rq:/system.slice.se->exec_start.avg
445590 +13.5% 505640 sched_debug.cfs_rq:/system.slice.se->exec_start.max
437889 +13.7% 497827 sched_debug.cfs_rq:/system.slice.se->exec_start.min
103036 +14.2% 117717 sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.avg
113992 ± 3% +11.4% 127019 sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.max
101120 +14.3% 115617 sched_debug.cfs_rq:/system.slice.se->sum_exec_runtime.min
2.28 ± 12% -23.2% 1.75 ± 16% sched_debug.cfs_rq:/system.slice.tg_load_avg_contrib.min
64.65 ± 20% -41.6% 37.73 ± 16% sched_debug.cfs_rq:/system.slice.util_avg.min
445111 +13.5% 505165 sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.avg
445426 +13.5% 505520 sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.max
442897 +13.5% 502644 sched_debug.cfs_rq:/system.slice/containerd.service.se->exec_start.min
4860094 +9.9% 5339463 sched_debug.cfs_rq:/system.slice/containerd.service.se->vruntime.avg
4805849 +10.0% 5287162 sched_debug.cfs_rq:/system.slice/containerd.service.se->vruntime.min
102970 +14.2% 117638 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.avg
113941 ± 3% +11.3% 126851 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.max
101055 +14.3% 115547 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.avg_vruntime.min
69.43 ± 13% -43.9% 38.94 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.load_avg.min
102970 +14.2% 117638 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.avg
113941 ± 3% +11.3% 126851 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.max
101055 +14.3% 115547 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.min_vruntime.min
69.17 ± 14% -44.2% 38.61 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.runnable_avg.min
1.22 ± 19% -49.4% 0.62 ± 17% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.load_avg.min
67.61 ± 14% -44.7% 37.38 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.runnable_avg.min
67.64 ± 14% -44.7% 37.38 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->avg.util_avg.min
445226 +13.5% 505279 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.avg
445589 +13.5% 505640 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.max
437889 +13.7% 497980 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->exec_start.min
102976 +14.2% 117644 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.avg
113947 ± 3% +11.3% 126858 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.max
101061 +14.3% 115553 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->sum_exec_runtime.min
4792828 +9.9% 5266259 sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.se->vruntime.min
71.67 ± 20% -47.8% 37.38 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.tg_load_avg_contrib.min
69.17 ± 14% -44.2% 38.60 ± 16% sched_debug.cfs_rq:/system.slice/lkp-bootstrap.service.util_avg.min
0.12 ± 33% -42.9% 0.07 ± 57% sched_debug.cfs_rq:/system.slice/redis-server.service.load_avg.max
445113 +13.5% 505199 sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.avg
445251 +13.5% 505317 sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.max
444977 +13.5% 505056 sched_debug.cfs_rq:/system.slice/redis-server.service.se->exec_start.min
4855865 +9.8% 5333093 sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.avg
4871037 +9.8% 5349946 sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.max
4840982 +9.8% 5317191 sched_debug.cfs_rq:/system.slice/redis-server.service.se->vruntime.min
0.14 ± 28% -48.6% 0.07 ± 57% sched_debug.cfs_rq:/system.slice/redis-server.service.tg_load_avg.max
0.12 ± 33% -42.9% 0.07 ± 57% sched_debug.cfs_rq:/system.slice/redis-server.service.tg_load_avg_contrib.max
447837 +13.5% 508303 sched_debug.cpu.clock.avg
447843 +13.5% 508310 sched_debug.cpu.clock.max
447830 +13.5% 508297 sched_debug.cpu.clock.min
445243 +13.5% 505290 sched_debug.cpu.clock_task.avg
445599 +13.5% 505645 sched_debug.cpu.clock_task.max
437708 +13.7% 497758 sched_debug.cpu.clock_task.min
3266 ± 3% +10.9% 3623 ± 5% sched_debug.cpu.curr->pid.avg
13788 +11.1% 15317 sched_debug.cpu.curr->pid.max
4619 +14.6% 5292 ± 3% sched_debug.cpu.curr->pid.stddev
3215856 +12.5% 3616881 sched_debug.cpu.nr_switches.avg
3332744 +11.7% 3724017 sched_debug.cpu.nr_switches.max
3037135 +14.3% 3470789 sched_debug.cpu.nr_switches.min
0.01 ± 10% -18.9% 0.01 ± 19% sched_debug.cpu.nr_uninterruptible.avg
447830 +13.5% 508297 sched_debug.cpu_clk
447123 +13.5% 507589 sched_debug.ktime
448701 +13.5% 509200 sched_debug.sched_clk
85.47 -4.0 81.46 ± 6% perf-profile.calltrace.cycles-pp.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
85.62 -4.0 81.62 ± 6% perf-profile.calltrace.cycles-pp.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
85.87 -4.0 81.88 ± 5% perf-profile.calltrace.cycles-pp.rocksdb::Benchmark::ThreadBody
85.85 -4.0 81.86 ± 5% perf-profile.calltrace.cycles-pp.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
2.27 ± 15% -0.8 1.46 ± 8% perf-profile.calltrace.cycles-pp.rocksdb::WriteThread::JoinBatchGroup.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite.rocksdb::Benchmark::ThreadBody
2.27 ± 15% -0.8 1.46 ± 8% perf-profile.calltrace.cycles-pp.rocksdb::WriteThread::LinkOne.rocksdb::WriteThread::JoinBatchGroup.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite
7.95 -0.7 7.23 ± 10% perf-profile.calltrace.cycles-pp.clear_bhb_loop.__sched_yield.rocksdb::WriteThread::CompleteParallelMemTableWriter.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
3.48 -0.3 3.14 ± 10% perf-profile.calltrace.cycles-pp.do_sched_yield.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
0.63 ± 2% -0.2 0.39 ± 70% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__sched_yield.rocksdb::WriteThread::CompleteParallelMemTableWriter.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
1.24 +0.7 1.98 ± 32% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
0.00 +0.7 0.74 ± 27% perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
1.25 +0.7 2.00 ± 31% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
1.29 +0.8 2.04 ± 31% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
1.30 +0.8 2.05 ± 31% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.pthread_cond_signal
0.00 +0.8 0.76 ± 23% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl
1.47 +0.8 2.26 ± 31% perf-profile.calltrace.cycles-pp.pthread_cond_signal
0.00 +0.8 0.80 ± 22% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write
2.02 +0.8 2.83 ± 26% perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
0.00 +0.8 0.80 ± 23% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState.rocksdb::DBImpl::WriteImpl.rocksdb::DBImpl::Write.rocksdb::Benchmark::DoWrite
2.05 +0.8 2.87 ± 26% perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.06 +0.8 2.88 ± 26% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.07 +0.8 2.92 ± 26% perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +1.0 0.95 ± 48% perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.rocksdb::WriteThread::AwaitState
2.33 +1.0 3.38 ± 26% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
2.32 +1.0 3.37 ± 26% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.23 +1.7 2.92 ± 37% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
85.49 -4.0 81.48 ± 6% perf-profile.children.cycles-pp.rocksdb::DBImpl::WriteImpl
85.62 -4.0 81.63 ± 6% perf-profile.children.cycles-pp.rocksdb::DBImpl::Write
85.87 -4.0 81.88 ± 5% perf-profile.children.cycles-pp.rocksdb::Benchmark::ThreadBody
85.86 -4.0 81.87 ± 5% perf-profile.children.cycles-pp.rocksdb::Benchmark::DoWrite
2.34 ± 15% -0.8 1.52 ± 8% perf-profile.children.cycles-pp.rocksdb::WriteThread::LinkOne
2.28 ± 15% -0.8 1.46 ± 8% perf-profile.children.cycles-pp.rocksdb::WriteThread::JoinBatchGroup
8.24 -0.7 7.56 ± 9% perf-profile.children.cycles-pp.clear_bhb_loop
3.85 -0.4 3.47 ± 10% perf-profile.children.cycles-pp.do_sched_yield
0.80 -0.1 0.71 ± 9% perf-profile.children.cycles-pp.raw_spin_rq_unlock
0.26 ± 10% -0.1 0.20 ± 15% perf-profile.children.cycles-pp.sched_balance_newidle
0.46 ± 2% -0.1 0.40 ± 11% perf-profile.children.cycles-pp.yield_task_fair
0.31 -0.1 0.26 ± 3% perf-profile.children.cycles-pp.pthread_cond_destroy
0.16 ± 12% -0.0 0.11 ± 20% perf-profile.children.cycles-pp.pthread_rwlock_rdlock
0.10 ± 6% +0.0 0.14 ± 25% perf-profile.children.cycles-pp.start_dl_timer
0.06 ± 7% +0.0 0.10 ± 25% perf-profile.children.cycles-pp.rseq_ip_fixup
0.10 +0.0 0.14 ± 27% perf-profile.children.cycles-pp.__rseq_handle_notify_resume
0.03 ± 70% +0.0 0.08 ± 28% perf-profile.children.cycles-pp.switch_hrtimer_base
0.20 ± 2% +0.1 0.27 ± 28% perf-profile.children.cycles-pp.enqueue_dl_entity
0.00 +0.1 0.09 ± 29% perf-profile.children.cycles-pp.switch_fpu_return
0.00 +0.1 0.09 ± 34% perf-profile.children.cycles-pp.wake_q_add_safe
0.14 ± 7% +0.1 0.24 ± 34% perf-profile.children.cycles-pp.futex_q_lock
0.00 +0.2 0.18 ± 28% perf-profile.children.cycles-pp.plist_add
0.00 +0.2 0.20 ± 28% perf-profile.children.cycles-pp.__futex_queue
0.00 +0.2 0.25 ± 31% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.16 ± 2% +0.3 0.51 ± 33% perf-profile.children.cycles-pp.futex_wake_mark
0.00 +0.4 0.40 ± 35% perf-profile.children.cycles-pp.__futex_unqueue
0.29 +0.5 0.75 ± 27% perf-profile.children.cycles-pp.futex_wait_setup
3.70 +0.5 4.22 perf-profile.children.cycles-pp._raw_spin_lock
1.51 +0.8 2.31 ± 31% perf-profile.children.cycles-pp.pthread_cond_signal
2.04 +0.8 2.83 ± 26% perf-profile.children.cycles-pp.__futex_wait
2.05 +0.8 2.87 ± 26% perf-profile.children.cycles-pp.futex_wait
1.36 +1.7 3.06 ± 32% perf-profile.children.cycles-pp.futex_wake
3.44 +2.5 5.96 ± 29% perf-profile.children.cycles-pp.do_futex
3.47 +2.5 6.01 ± 29% perf-profile.children.cycles-pp.__x64_sys_futex
2.33 ± 15% -0.8 1.51 ± 8% perf-profile.self.cycles-pp.rocksdb::WriteThread::LinkOne
8.14 -0.7 7.46 ± 9% perf-profile.self.cycles-pp.clear_bhb_loop
5.63 -0.3 5.31 ± 6% perf-profile.self.cycles-pp.__schedule
1.24 -0.2 1.08 ± 10% perf-profile.self.cycles-pp.do_sched_yield
0.54 -0.1 0.48 ± 10% perf-profile.self.cycles-pp.raw_spin_rq_unlock
0.30 ± 2% -0.1 0.25 ± 5% perf-profile.self.cycles-pp.pthread_cond_destroy
0.16 ± 12% -0.0 0.11 ± 20% perf-profile.self.cycles-pp.pthread_rwlock_rdlock
0.06 -0.0 0.03 ± 70% perf-profile.self.cycles-pp.rocksdb::WriteThread::SetState
0.08 ± 4% +0.0 0.12 ± 35% perf-profile.self.cycles-pp.set_next_entity
0.00 +0.1 0.08 ± 29% perf-profile.self.cycles-pp.switch_fpu_return
0.00 +0.1 0.09 ± 34% perf-profile.self.cycles-pp.wake_q_add_safe
0.14 ± 9% +0.1 0.24 ± 34% perf-profile.self.cycles-pp.futex_q_lock
0.08 ± 8% +0.1 0.20 ± 75% perf-profile.self.cycles-pp.ktime_get
0.00 +0.2 0.18 ± 29% perf-profile.self.cycles-pp.plist_add
0.00 +0.2 0.25 ± 31% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
3.46 +0.3 3.78 ± 2% perf-profile.self.cycles-pp._raw_spin_lock
0.00 +0.4 0.37 ± 36% perf-profile.self.cycles-pp.__futex_unqueue
0.32 ± 3% +0.8 1.10 ± 34% perf-profile.self.cycles-pp.futex_wake
***************************************************************************************************
lkp-srf-2sp2: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/300s/lkp-srf-2sp2/hash/perf-bench-futex
commit:
80367ad01d ("futex: Add basic infrastructure for local task local hash")
7c4f75a21f ("futex: Allow automatic allocation of process wide futex hash")
80367ad01d93ac78 7c4f75a21f636486d2969d9b668
---------------- ---------------------------
%stddev %change %stddev
\ | \
79777 ± 9% +29.6% 103404 ± 14% sched_debug.cpu.avg_idle.stddev
13.14 -92.4% 0.99 vmstat.cpu.us
85.94 +12.6 98.54 mpstat.cpu.all.sys%
13.40 -12.6 0.76 mpstat.cpu.all.usr%
253330 +1.4% 256755 proc-vmstat.nr_active_anon
2296 +2.2% 2346 proc-vmstat.nr_page_table_pages
77274 +4.5% 80782 proc-vmstat.nr_shmem
253330 +1.4% 256755 proc-vmstat.nr_zone_active_anon
2667058 -94.6% 144593 perf-bench-futex.ops/s
0.06 ± 13% +0.2 0.21 ± 14% perf-bench-futex.stddev%
229015 -3.4% 221126 perf-bench-futex.time.involuntary_context_switches
49696 +14.7% 57010 perf-bench-futex.time.system_time
7728 -94.6% 416.35 perf-bench-futex.time.user_time
0.74 +90.7% 1.40 perf-stat.i.MPKI
5.333e+10 -82.2% 9.48e+09 perf-stat.i.branch-instructions
0.02 ± 44% +0.4 0.41 perf-stat.i.branch-miss-rate%
9538223 ± 47% +310.1% 39118125 perf-stat.i.branch-misses
50.17 -14.2 35.98 perf-stat.i.cache-miss-rate%
2.424e+08 -74.7% 61296533 perf-stat.i.cache-misses
4.833e+08 -64.7% 1.706e+08 perf-stat.i.cache-references
1.86 +653.3% 13.99 perf-stat.i.cpi
249.82 -4.0% 239.71 perf-stat.i.cpu-migrations
2522 +295.4% 9974 perf-stat.i.cycles-between-cache-misses
3.295e+11 -86.7% 4.369e+10 perf-stat.i.instructions
0.54 -86.7% 0.07 perf-stat.i.ipc
0.74 +90.7% 1.40 perf-stat.overall.MPKI
0.02 ± 47% +0.4 0.41 perf-stat.overall.branch-miss-rate%
50.15 -14.2 35.93 perf-stat.overall.cache-miss-rate%
1.86 +654.2% 14.00 perf-stat.overall.cpi
2522 +295.6% 9979 perf-stat.overall.cycles-between-cache-misses
0.54 -86.7% 0.07 perf-stat.overall.ipc
5.316e+10 -82.2% 9.448e+09 perf-stat.ps.branch-instructions
9509524 ± 47% +310.0% 38990460 perf-stat.ps.branch-misses
2.416e+08 -74.7% 61091933 perf-stat.ps.cache-misses
4.817e+08 -64.7% 1.7e+08 perf-stat.ps.cache-references
249.00 -4.0% 238.92 perf-stat.ps.cpu-migrations
3.284e+11 -86.7% 4.354e+10 perf-stat.ps.instructions
9.88e+13 -86.7% 1.31e+13 perf-stat.total.instructions
0.02 +52.8% 0.04 ± 9% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.01 +80.0% 0.01 ± 9% perf-sched.sch_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.02 ± 13% +31.1% 0.03 ± 18% perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
0.01 ± 7% +122.5% 0.01 ± 47% perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.00 ±142% +566.7% 0.01 ± 39% perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.__flush_work.__lru_add_drain_all
0.00 +204.2% 0.01 ± 7% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
0.01 ± 22% +147.9% 0.02 ± 25% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
0.01 ± 12% +298.1% 0.04 ± 29% perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.36 ± 50% -70.5% 0.11 ± 64% perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
0.01 ± 5% +145.6% 0.02 ± 64% perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.00 ±142% +566.7% 0.01 ± 39% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.__flush_work.__lru_add_drain_all
0.02 ± 58% +318.7% 0.07 ± 27% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
125.67 ± 2% -11.1% 111.69 ± 2% perf-sched.total_wait_and_delay.average.ms
125.61 ± 2% -11.1% 111.63 ± 2% perf-sched.total_wait_time.average.ms
37.18 ± 15% +36.8% 50.87 ± 10% perf-sched.wait_and_delay.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
0.24 ± 23% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
0.05 ± 26% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
7.31 ± 5% -29.9% 5.12 ± 3% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
547.19 -12.1% 480.98 perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
771.00 ± 14% -27.7% 557.50 ± 10% perf-sched.wait_and_delay.count.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
247.83 ± 4% -100.0% 0.00 perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
145.17 ± 31% -100.0% 0.00 perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
632.33 ± 5% +53.9% 973.00 ± 3% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
3727 +11.5% 4155 ± 2% perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
8.43 ± 63% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
0.72 ± 50% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
530.84 ± 4% -44.9% 292.50 ± 10% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
37.16 ± 14% +36.8% 50.84 ± 10% perf-sched.wait_time.avg.ms.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
3.12 ± 12% -31.8% 2.13 ± 13% perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
7.30 ± 5% -30.0% 5.11 ± 3% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
547.19 -12.1% 480.97 perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.36 ± 50% -70.5% 0.11 ± 64% perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
4.83 ± 7% -37.9% 3.00 ± 37% perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
530.83 ± 4% -44.9% 292.49 ± 10% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki