Message-ID: <202502191317.d0050992-lkp@intel.com>
Date: Wed, 19 Feb 2025 13:46:21 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Mateusz Guzik <mjguzik@...il.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Christian Brauner
<brauner@...nel.org>, Oleg Nesterov <oleg@...hat.com>, "Liam R. Howlett"
<Liam.Howlett@...cle.com>, <linux-kernel@...r.kernel.org>,
<oliver.sang@...el.com>
Subject: [linux-next:master] [pid] 7903f907a2: stress-ng.pthread.ops_per_sec
23.4% improvement
Hello,
kernel test robot noticed a 23.4% improvement of stress-ng.pthread.ops_per_sec on:
commit: 7903f907a226058ed99f86e9924e082aea57fc45 ("pid: perform free_pid() calls outside of tasklist_lock")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
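For context, the win comes from shrinking the tasklist_lock write-side critical
section: free_pid() (which does RCU and allocator work) previously ran with the
lock held in the exit path. A minimal sketch of the pattern the commit title
describes (helper name is illustrative, not the exact upstream diff):

/*
 * Sketch only: stash the struct pid pointers while tasklist_lock is
 * held, then free them after the lock is dropped, so other readers
 * and writers are not serialized behind the free.
 */
void release_task_sketch(struct task_struct *p)
{
	struct pid *pids[PIDTYPE_MAX] = { NULL };	/* hypothetical scratch array */

	write_lock_irq(&tasklist_lock);
	__unhash_and_collect_pids(p, pids);	/* hypothetical: unhash, record pids */
	write_unlock_irq(&tasklist_lock);

	for (int i = 0; i < PIDTYPE_MAX; i++)	/* free outside the lock */
		if (pids[i])
			free_pid(pids[i]);
}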
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: pthread
cpufreq_governor: performance
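Roughly, these parameters correspond to pinning the performance governor and
running one pthread worker per CPU for 60 seconds, i.e. something like the
following on the 224-thread box (the exact lkp-tests wrapper may differ):

  stress-ng --pthread 224 --timeout 60s --metrics-brief

The vfork result below uses the same shape with test=vfork.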
In addition, the commit also has a significant impact on the following test:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.vfork.ops_per_sec 28.7% improvement |
| test machine | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters | cpufreq_governor=performance |
| | nr_threads=100% |
| | test=vfork |
| | testtime=60s |
+------------------+---------------------------------------------------------------------------------------------+
Details are as follows:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250219/202502191317.d0050992-lkp@intel.com
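If you want to rerun it, the usual lkp-tests flow (see the wiki linked at the
end of this mail) is roughly:

  git clone https://github.com/intel/lkp-tests.git
  cd lkp-tests
  sudo bin/lkp install job.yaml   # job.yaml comes from the archive above
  sudo bin/lkp run job.yaml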
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pthread/stress-ng/60s
commit:
74198dc206 ("pid: sprinkle tasklist_lock asserts")
7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")
74198dc2067b2aa1 7903f907a226058ed99f86e9924
---------------- ---------------------------
  value ±%stddev       %change       value ±%stddev   metric
5.953e+08 ± 9% +82.9% 1.089e+09 ± 3% cpuidle..time
3067781 ± 17% +281.8% 11714061 ± 4% cpuidle..usage
3156621 ± 7% -11.8% 2783051 ± 7% numa-meminfo.node0.AnonPages
315502 ± 4% -11.0% 280901 ± 4% numa-meminfo.node1.PageTables
2119 ± 4% -59.4% 861.38 ± 3% vmstat.procs.r
695158 +37.7% 957064 vmstat.system.cs
786439 +58.8% 1248633 vmstat.system.in
918265 -31.9% 625741 ± 31% meminfo.AnonHugePages
9498433 ± 3% +13.6% 10786868 ± 3% meminfo.Cached
1.188e+09 -11.7% 1.049e+09 meminfo.Committed_AS
5970512 ± 6% +21.6% 7258946 ± 4% meminfo.Shmem
4.38 ± 11% +3.8 8.20 ± 3% mpstat.cpu.all.idle%
0.47 +0.2 0.67 mpstat.cpu.all.irq%
0.37 ± 6% +0.4 0.76 ± 5% mpstat.cpu.all.soft%
1.47 +0.3 1.82 mpstat.cpu.all.usr%
39409396 +21.1% 47737561 ± 2% numa-numastat.node0.local_node
39517687 +21.1% 47862366 ± 2% numa-numastat.node0.numa_hit
39678016 +22.2% 48499008 ± 2% numa-numastat.node1.local_node
39806349 +22.1% 48619579 ± 2% numa-numastat.node1.numa_hit
11111 ± 20% +86.8% 20750 ± 10% perf-c2c.DRAM.local
8594 ± 16% +25.6% 10797 ± 7% perf-c2c.DRAM.remote
14151 ± 18% +100.2% 28336 ± 9% perf-c2c.HITM.local
3853 ± 16% +40.3% 5404 ± 7% perf-c2c.HITM.remote
18004 ± 18% +87.4% 33740 ± 9% perf-c2c.HITM.total
785387 ± 8% -10.5% 702556 ± 7% numa-vmstat.node0.nr_anon_pages
39519842 +20.9% 47789798 ± 2% numa-vmstat.node0.numa_hit
39411551 +20.9% 47665001 ± 2% numa-vmstat.node0.numa_local
78603 ± 3% -9.8% 70878 ± 5% numa-vmstat.node1.nr_page_table_pages
39804028 +22.0% 48541084 ± 2% numa-vmstat.node1.numa_hit
39675696 +22.0% 48420524 ± 2% numa-vmstat.node1.numa_local
304344 ± 7% -66.2% 102730 ± 5% stress-ng.pthread.nanosecs_to_start_a_pthread
10003318 +23.2% 12323193 stress-ng.pthread.ops
166143 +23.4% 204943 stress-ng.pthread.ops_per_sec
4793153 +19.3% 5716581 stress-ng.time.involuntary_context_switches
21587233 +23.1% 26564025 stress-ng.time.minor_page_faults
13184 +11.2% 14659 stress-ng.time.percent_of_cpu_this_job_got
7880 +10.4% 8702 stress-ng.time.system_time
105.74 +51.1% 159.78 stress-ng.time.user_time
23363531 +24.5% 29091883 stress-ng.time.voluntary_context_switches
3104817 ± 2% +7.0% 3322678 ± 2% proc-vmstat.nr_active_anon
1610889 -6.3% 1509476 ± 3% proc-vmstat.nr_anon_pages
447.53 -31.7% 305.57 ± 31% proc-vmstat.nr_anon_transparent_hugepages
2380189 ± 3% +13.4% 2699415 ± 3% proc-vmstat.nr_file_pages
1794253 -3.7% 1727492 proc-vmstat.nr_kernel_stack
154819 -9.1% 140710 proc-vmstat.nr_page_table_pages
1498207 ± 5% +21.3% 1817432 ± 4% proc-vmstat.nr_shmem
47516 +2.5% 48728 proc-vmstat.nr_slab_reclaimable
3104817 ± 2% +7.0% 3322678 ± 2% proc-vmstat.nr_zone_active_anon
550885 ± 15% +69.4% 932960 ± 11% proc-vmstat.numa_hint_faults
293967 ± 27% +95.8% 575443 ± 19% proc-vmstat.numa_hint_faults_local
79375488 +21.6% 96482937 proc-vmstat.numa_hit
79138861 +21.6% 96237560 proc-vmstat.numa_local
330580 ± 9% +27.1% 420192 ± 5% proc-vmstat.numa_pages_migrated
808808 ± 11% +43.0% 1156712 ± 9% proc-vmstat.numa_pte_updates
83384617 +26.0% 1.05e+08 proc-vmstat.pgalloc_normal
22326472 +22.9% 27448052 proc-vmstat.pgfault
80530234 +26.2% 1.017e+08 proc-vmstat.pgfree
330580 ± 9% +27.1% 420192 ± 5% proc-vmstat.pgmigrate_success
261994 ± 8% +39.8% 366207 ± 7% proc-vmstat.pgreuse
4612194 ± 2% +62.7% 7503881 sched_debug.cfs_rq:/.avg_vruntime.avg
5440180 ± 13% +85.6% 10099394 ± 2% sched_debug.cfs_rq:/.avg_vruntime.max
501155 ± 64% +329.5% 2152678 ± 6% sched_debug.cfs_rq:/.avg_vruntime.stddev
2.13 ± 9% -47.3% 1.12 ± 18% sched_debug.cfs_rq:/.h_nr_queued.avg
44.33 ± 10% -55.6% 19.67 ± 47% sched_debug.cfs_rq:/.h_nr_queued.max
5.09 ± 5% -53.8% 2.35 ± 26% sched_debug.cfs_rq:/.h_nr_queued.stddev
2.09 ± 9% -47.9% 1.09 ± 19% sched_debug.cfs_rq:/.h_nr_runnable.avg
44.25 ± 10% -55.7% 19.58 ± 47% sched_debug.cfs_rq:/.h_nr_runnable.max
5.05 ± 5% -54.2% 2.31 ± 27% sched_debug.cfs_rq:/.h_nr_runnable.stddev
5340703 ± 12% +85.8% 9925031 ± 2% sched_debug.cfs_rq:/.left_deadline.max
2202572 ± 2% +55.2% 3417743 ± 9% sched_debug.cfs_rq:/.left_deadline.stddev
5340659 ± 12% +85.8% 9924585 ± 2% sched_debug.cfs_rq:/.left_vruntime.max
2202531 ± 2% +55.2% 3417686 ± 9% sched_debug.cfs_rq:/.left_vruntime.stddev
313473 ± 6% -24.8% 235882 ± 22% sched_debug.cfs_rq:/.load.avg
4612199 ± 2% +62.7% 7503887 sched_debug.cfs_rq:/.min_vruntime.avg
5440184 ± 13% +85.6% 10099394 ± 2% sched_debug.cfs_rq:/.min_vruntime.max
501154 ± 64% +329.5% 2152680 ± 6% sched_debug.cfs_rq:/.min_vruntime.stddev
0.60 ± 6% -19.5% 0.49 ± 13% sched_debug.cfs_rq:/.nr_queued.avg
5340667 ± 12% +85.8% 9924585 ± 2% sched_debug.cfs_rq:/.right_vruntime.max
2202534 ± 2% +55.2% 3417691 ± 9% sched_debug.cfs_rq:/.right_vruntime.stddev
364.26 ± 3% +16.6% 424.72 ± 2% sched_debug.cfs_rq:/.util_avg.avg
1206 ± 23% +53.8% 1856 ± 26% sched_debug.cfs_rq:/.util_est.max
209.57 ± 9% +27.9% 268.09 ± 11% sched_debug.cfs_rq:/.util_est.stddev
360185 ± 5% +68.1% 605388 ± 15% sched_debug.cpu.curr->pid.avg
401600 ± 3% +120.0% 883327 ± 5% sched_debug.cpu.curr->pid.stddev
2.13 ± 10% -47.0% 1.13 ± 18% sched_debug.cpu.nr_running.avg
44.25 ± 10% -55.6% 19.67 ± 47% sched_debug.cpu.nr_running.max
5.08 ± 5% -53.8% 2.35 ± 25% sched_debug.cpu.nr_running.stddev
98005 +37.5% 134753 sched_debug.cpu.nr_switches.avg
178454 ± 8% +106.9% 369189 ± 4% sched_debug.cpu.nr_switches.max
16050 ± 34% +376.0% 76393 ± 3% sched_debug.cpu.nr_switches.stddev
3.76 +13.7% 4.27 perf-stat.i.MPKI
1.873e+10 +6.2% 1.989e+10 perf-stat.i.branch-instructions
0.61 +0.1 0.69 perf-stat.i.branch-miss-rate%
1.096e+08 +21.8% 1.335e+08 perf-stat.i.branch-misses
40.32 -2.7 37.62 perf-stat.i.cache-miss-rate%
3.087e+08 +22.7% 3.787e+08 perf-stat.i.cache-misses
7.635e+08 +31.5% 1.004e+09 perf-stat.i.cache-references
712864 +38.1% 984398 perf-stat.i.context-switches
7.63 -10.6% 6.82 perf-stat.i.cpi
6.279e+11 -3.7% 6.047e+11 perf-stat.i.cpu-cycles
2027 -21.4% 1593 perf-stat.i.cycles-between-cache-misses
8.232e+10 +7.9% 8.881e+10 perf-stat.i.instructions
0.14 +10.8% 0.15 perf-stat.i.ipc
8.13 +26.5% 10.29 perf-stat.i.metric.K/sec
369735 +22.0% 450981 perf-stat.i.minor-faults
532034 +22.5% 651748 perf-stat.i.page-faults
3.76 +13.3% 4.26 perf-stat.overall.MPKI
0.58 +0.1 0.67 perf-stat.overall.branch-miss-rate%
40.43 -2.7 37.76 perf-stat.overall.cache-miss-rate%
7.66 -11.4% 6.79 perf-stat.overall.cpi
2038 -21.8% 1594 perf-stat.overall.cycles-between-cache-misses
0.13 +12.8% 0.15 perf-stat.overall.ipc
1.821e+10 +7.3% 1.954e+10 perf-stat.ps.branch-instructions
1.057e+08 +23.2% 1.302e+08 perf-stat.ps.branch-misses
3.007e+08 +23.6% 3.717e+08 perf-stat.ps.cache-misses
7.438e+08 +32.4% 9.845e+08 perf-stat.ps.cache-references
696299 +38.7% 965478 perf-stat.ps.context-switches
6.131e+11 -3.4% 5.925e+11 perf-stat.ps.cpu-cycles
8e+10 +9.0% 8.724e+10 perf-stat.ps.instructions
356195 +23.6% 440270 perf-stat.ps.minor-faults
514755 +23.8% 637135 perf-stat.ps.page-faults
4.867e+12 +9.3% 5.319e+12 perf-stat.total.instructions
74.42 ± 44% -60.3 14.16 ±223% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
74.41 ± 44% -60.3 14.16 ±223% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.44 ± 44% -41.7 4.73 ±223% perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.44 ± 44% -41.7 4.73 ±223% perf-profile.calltrace.cycles-pp.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.43 ± 44% -41.7 4.72 ±223% perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
45.72 ± 44% -41.2 4.50 ±223% perf-profile.calltrace.cycles-pp.exit_notify.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
23.46 ± 44% -23.5 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
23.34 ± 44% -23.3 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit
23.33 ± 45% -23.3 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64
23.24 ± 45% -23.2 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3
21.68 ± 44% -21.7 0.00 perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.release_task.exit_notify.do_exit.__x64_sys_exit
21.54 ± 44% -21.5 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.release_task.exit_notify.do_exit
27.26 ± 45% -18.0 9.26 ±223% perf-profile.calltrace.cycles-pp.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
27.26 ± 45% -18.0 9.26 ±223% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
22.09 ± 44% -17.6 4.45 ±223% perf-profile.calltrace.cycles-pp.release_task.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
26.16 ± 45% -17.2 8.99 ±223% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
5.18 ± 47% -3.8 1.37 ±223% perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.18 ± 47% -3.8 1.36 ±223% perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
5.08 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
5.08 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
5.07 ± 47% -3.7 1.34 ±223% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
5.06 ± 47% -3.7 1.33 ±223% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
68.48 ± 44% -68.4 0.09 ±223% perf-profile.children.cycles-pp.queued_write_lock_slowpath
81.41 ± 44% -65.4 16.02 ±223% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
81.40 ± 44% -65.4 16.01 ±223% perf-profile.children.cycles-pp.do_syscall_64
70.40 ± 44% -57.1 13.32 ±223% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
46.45 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.x64_sys_call
46.44 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.do_exit
46.44 ± 44% -41.7 4.73 ±223% perf-profile.children.cycles-pp.__x64_sys_exit
45.74 ± 44% -41.2 4.50 ±223% perf-profile.children.cycles-pp.exit_notify
27.26 ± 45% -18.0 9.26 ±223% perf-profile.children.cycles-pp.__do_sys_clone3
27.26 ± 45% -18.0 9.26 ±223% perf-profile.children.cycles-pp.kernel_clone
22.11 ± 44% -17.7 4.45 ±223% perf-profile.children.cycles-pp.release_task
26.18 ± 45% -17.2 8.99 ±223% perf-profile.children.cycles-pp.copy_process
5.38 ± 47% -4.0 1.38 ±223% perf-profile.children.cycles-pp.tlb_finish_mmu
5.30 ± 47% -3.9 1.36 ±223% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
5.30 ± 47% -3.9 1.36 ±223% perf-profile.children.cycles-pp.smp_call_function_many_cond
5.30 ± 47% -3.9 1.37 ±223% perf-profile.children.cycles-pp.flush_tlb_mm_range
5.25 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.__madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.__x64_sys_madvise
5.24 ± 47% -3.9 1.38 ±223% perf-profile.children.cycles-pp.do_madvise
5.18 ± 47% -3.8 1.37 ±223% perf-profile.children.cycles-pp.madvise_vma_behavior
5.18 ± 47% -3.8 1.36 ±223% perf-profile.children.cycles-pp.zap_page_range_single
70.39 ± 44% -57.1 13.32 ±223% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
5.16 ± 47% -3.9 1.30 ±223% perf-profile.self.cycles-pp.smp_call_function_many_cond
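Reading the profile: before the patch ~68% of children cycles sat in
queued_write_lock_slowpath (reached from exit_notify, release_task and
copy_process, i.e. tasklist_lock writers); after the patch those paths drop to
noise. A comparable call-graph profile can be collected with something like:

  perf record -g -- stress-ng --pthread 224 --timeout 60s
  perf report --no-children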
***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/vfork/stress-ng/60s
commit:
74198dc206 ("pid: sprinkle tasklist_lock asserts")
7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")
74198dc2067b2aa1 7903f907a226058ed99f86e9924
---------------- ---------------------------
  value ±%stddev       %change       value ±%stddev   metric
6562366 ± 8% +37.0% 8993652 ± 10% cpuidle..usage
0.29 +0.1 0.39 mpstat.cpu.all.soft%
486692 +31.8% 641303 vmstat.system.cs
506323 +4.8% 530409 vmstat.system.in
4004574 ± 3% +8.7% 4353640 ± 3% meminfo.Active
4004574 ± 3% +8.7% 4353640 ± 3% meminfo.Active(anon)
2657761 ± 6% +15.5% 3069404 ± 5% meminfo.Shmem
3257759 ± 11% +14.3% 3724594 ± 7% numa-meminfo.node1.Active
3257759 ± 11% +14.3% 3724594 ± 7% numa-meminfo.node1.Active(anon)
2492828 ± 9% +21.0% 3017306 ± 6% numa-meminfo.node1.Shmem
9063611 ± 2% +36.5% 12368884 ± 9% numa-numastat.node0.local_node
9220375 ± 2% +35.7% 12513653 ± 9% numa-numastat.node0.numa_hit
10168176 +28.3% 13044773 numa-numastat.node1.local_node
10243149 +28.2% 13131946 numa-numastat.node1.numa_hit
5700 ± 8% +47.9% 8432 ± 11% perf-c2c.DRAM.remote
14297 ± 7% +42.5% 20373 ± 12% perf-c2c.HITM.local
3624 ± 8% +54.4% 5597 ± 11% perf-c2c.HITM.remote
17922 ± 7% +44.9% 25970 ± 12% perf-c2c.HITM.total
51838 ± 45% -56.5% 22543 ±105% numa-vmstat.node0.nr_mapped
9221619 ± 2% +35.2% 12469913 ± 9% numa-vmstat.node0.numa_hit
9064856 ± 2% +36.0% 12325144 ± 10% numa-vmstat.node0.numa_local
623443 ± 9% +20.6% 752138 ± 6% numa-vmstat.node1.nr_shmem
10243633 +27.8% 13088671 numa-vmstat.node1.numa_hit
10168660 +27.9% 13001498 numa-vmstat.node1.numa_local
1378378 +18.3% 1630343 stress-ng.time.involuntary_context_switches
10647 -3.1% 10321 stress-ng.time.system_time
1838 +13.8% 2092 stress-ng.time.user_time
16431508 +30.8% 21498222 stress-ng.time.voluntary_context_switches
8890752 +28.7% 11442483 stress-ng.vfork.ops
148177 +28.7% 190706 stress-ng.vfork.ops_per_sec
1000826 ± 3% +8.9% 1090125 ± 3% proc-vmstat.nr_active_anon
1545626 ± 2% +6.8% 1650840 ± 2% proc-vmstat.nr_file_pages
120475 +2.9% 124024 proc-vmstat.nr_mapped
663632 ± 6% +15.9% 768846 ± 5% proc-vmstat.nr_shmem
1000826 ± 3% +8.9% 1090125 ± 3% proc-vmstat.nr_zone_active_anon
19510114 +31.5% 25647538 ± 4% proc-vmstat.numa_hit
19278378 +31.8% 25415597 ± 4% proc-vmstat.numa_local
22280233 +32.9% 29608930 ± 4% proc-vmstat.pgalloc_normal
20644303 +35.1% 27885848 ± 4% proc-vmstat.pgfree
1.03 +18.9% 1.22 ± 2% perf-stat.i.MPKI
1.703e+10 +6.2% 1.809e+10 perf-stat.i.branch-instructions
0.53 ± 2% +0.1 0.59 ± 4% perf-stat.i.branch-miss-rate%
88001361 ± 3% +17.3% 1.032e+08 ± 5% perf-stat.i.branch-misses
74412375 +27.9% 95182974 perf-stat.i.cache-misses
7.674e+08 ± 3% +26.4% 9.698e+08 ± 4% perf-stat.i.cache-references
503132 +32.0% 664329 perf-stat.i.context-switches
8.49 -7.5% 7.85 perf-stat.i.cpi
112807 ± 2% +23.7% 139583 ± 5% perf-stat.i.cpu-migrations
8617 -23.1% 6627 perf-stat.i.cycles-between-cache-misses
7.368e+10 +7.4% 7.917e+10 perf-stat.i.instructions
0.12 +8.3% 0.13 perf-stat.i.ipc
2.25 +31.7% 2.97 perf-stat.i.metric.K/sec
1.02 +18.9% 1.21 perf-stat.overall.MPKI
0.50 ± 2% +0.1 0.56 ± 3% perf-stat.overall.branch-miss-rate%
8.55 -7.5% 7.91 perf-stat.overall.cpi
8374 -22.2% 6517 perf-stat.overall.cycles-between-cache-misses
0.12 +8.1% 0.13 perf-stat.overall.ipc
1.655e+10 +6.2% 1.758e+10 perf-stat.ps.branch-instructions
82996740 ± 3% +17.8% 97762479 ± 5% perf-stat.ps.branch-misses
73065238 +27.7% 93297913 perf-stat.ps.cache-misses
7.509e+08 ± 3% +26.3% 9.487e+08 ± 4% perf-stat.ps.cache-references
491567 +32.0% 649035 perf-stat.ps.context-switches
110242 ± 2% +23.6% 136250 ± 4% perf-stat.ps.cpu-migrations
7.159e+10 +7.4% 7.69e+10 perf-stat.ps.instructions
11850 ± 2% +6.0% 12559 ± 3% perf-stat.ps.minor-faults
11850 ± 2% +6.0% 12559 ± 3% perf-stat.ps.page-faults
4.334e+12 +8.1% 4.684e+12 perf-stat.total.instructions
0.55 ± 10% -29.3% 0.39 ± 13% perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.80 ± 3% -31.4% 0.55 ± 6% perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.94 ± 3% -31.1% 0.65 ± 2% perf-sched.sch_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
0.30 ± 2% -14.5% 0.26 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
0.37 -28.9% 0.27 perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
0.81 ± 12% -28.8% 0.58 ± 10% perf-sched.sch_delay.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.76 ± 4% -43.4% 0.43 ± 3% perf-sched.sch_delay.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
0.42 ± 16% -45.4% 0.23 ± 15% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.81 -38.6% 0.50 ± 5% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_pid.copy_process.kernel_clone
0.92 -31.7% 0.63 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
0.87 ± 3% -33.4% 0.58 ± 8% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
0.86 ± 8% -32.5% 0.58 ± 7% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
0.96 ± 5% -36.0% 0.61 ± 4% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
0.85 -38.0% 0.53 ± 3% perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
0.34 ± 33% -57.1% 0.15 ± 82% perf-sched.sch_delay.avg.ms.__cond_resched.kvfree_rcu_drain_ready.kfree_rcu_monitor.process_one_work.worker_thread
0.04 ± 3% -20.9% 0.04 ± 6% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.17 ± 9% -31.5% 0.11 ± 16% perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
0.23 -18.1% 0.19 perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
0.30 -20.7% 0.24 ± 2% perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.10 ± 6% -18.2% 0.08 ± 5% perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.ret_from_fork_asm.[unknown].[unknown]
0.13 -18.4% 0.11 ± 2% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
1.64 ± 33% -34.6% 1.07 ± 20% perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.43 ± 28% -41.7% 0.25 ± 31% perf-sched.sch_delay.max.ms.__cond_resched.mmput.exit_mm.do_exit.__x64_sys_exit
0.78 ± 19% -42.2% 0.45 ± 25% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
0.13 -20.3% 0.10 perf-sched.total_sch_delay.average.ms
59.45 ± 12% -21.3% 46.77 ± 9% perf-sched.total_sch_delay.max.ms
2.32 -18.5% 1.89 perf-sched.total_wait_and_delay.average.ms
1656374 +26.0% 2087010 perf-sched.total_wait_and_delay.count.ms
2.20 -18.4% 1.79 perf-sched.total_wait_time.average.ms
0.90 -26.7% 0.66 perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
20.62 ± 6% -43.0% 11.74 ± 2% perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.17 ± 2% -18.4% 0.14 ± 5% perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
60.43 ± 19% +76.4% 106.62 ± 33% perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
0.65 -18.1% 0.53 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
56.03 ± 3% -45.1% 30.75 ± 2% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.89 ± 3% -17.5% 0.73 ± 7% perf-sched.wait_and_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.82 -15.3% 9.17 perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
33654 -9.5% 30471 perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
1689 ± 8% +168.2% 4529 ± 8% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
59.50 ± 6% +39.5% 83.00 ± 11% perf-sched.wait_and_delay.count.__cond_resched.vunmap_p4d_range.__vunmap_range_noflush.remove_vm_area.vfree
675414 +24.7% 842197 perf-sched.wait_and_delay.count.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
69934 ± 4% +46.4% 102383 ± 6% perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
1118 ± 19% -36.7% 708.00 ± 28% perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
652564 +25.8% 821118 perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
36347 ± 3% +89.4% 68847 ± 2% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
62439 +16.9% 72971 perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
104431 +18.2% 123395 perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
3.18 ±183% -87.1% 0.41 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.83 ± 3% -30.1% 0.58 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
1.28 ± 57% -47.5% 0.67 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
0.52 -25.1% 0.39 perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
0.85 ± 12% -34.9% 0.55 ± 17% perf-sched.wait_time.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
0.80 ± 5% -37.6% 0.50 perf-sched.wait_time.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
0.79 ± 26% -37.0% 0.50 ± 19% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range_noprof
0.51 ± 9% -42.1% 0.30 ± 12% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
0.94 -31.8% 0.64 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
0.90 ± 2% -32.1% 0.61 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
0.89 ± 8% -31.4% 0.61 ± 6% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
0.96 ± 2% -33.2% 0.64 ± 4% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
0.88 -34.6% 0.57 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
20.58 ± 6% -43.1% 11.71 ± 2% perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.13 ± 3% -17.6% 0.11 ± 4% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
60.37 ± 19% +76.5% 106.54 ± 33% perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
0.41 -17.9% 0.34 perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
55.91 ± 3% -45.2% 30.65 ± 2% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.58 ± 6% -15.8% 0.49 ± 11% perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
10.69 -15.3% 9.06 perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
1.25 -23.3% 0.96 ± 13% perf-sched.wait_time.max.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
1.65 ± 34% -34.4% 1.08 ± 19% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
44.32 ± 19% -26.5% 32.59 ± 11% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki