Message-ID: <20201227145952.GA22566@xsang-OptiPlex-9020>
Date: Sun, 27 Dec 2020 22:59:52 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Feng Tang <feng.tang@...el.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
kernel test robot <rong.a.chen@...el.com>,
Waiman Long <longman@...hat.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [mm] 4df910620b: will-it-scale.per_process_ops 37.7% improvement
Greetings,

FYI, we noticed a 37.7% improvement of will-it-scale.per_process_ops due to commit:
commit: 4df910620bebb5cfe234af16ac8f6474b60215fd ("mm: memcg: relayout structure mem_cgroup to avoid cache interference")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with following parameters:
nr_task: 50%
mode: process
test: page_fault2
cpufreq_governor: performance
ucode: 0x16
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process-based and thread-based variants of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
In addition, the commit also has a significant impact on the following tests:
+------------------+------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 2.0% improvement |
| test machine | 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=50% |
| | test=page_fault2 |
| | ucode=0x42e |
+------------------+------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 12.1% improvement |
| test machine | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=100% |
| | test=page_fault2 |
| | ucode=0x16 |
+------------------+------------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/page_fault2/will-it-scale/0x16
commit:
fa02fcd94b ("Merge tag 'media/v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media")
4df910620b ("mm: memcg: relayout structure mem_cgroup to avoid cache interference")
fa02fcd94b0c8dff 4df910620bebb5cfe234af16ac8
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
3:4 34% 4:4 perf-profile.calltrace.cycles-pp.error_entry.testcase
3:4 36% 5:4 perf-profile.children.cycles-pp.error_entry
3:4 30% 4:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
10516193 +37.7% 14476896 will-it-scale.72.processes
146057 +37.7% 201067 will-it-scale.per_process_ops
10516193 +37.7% 14476896 will-it-scale.workload
59980 +10.8% 66467 meminfo.max_used_kB
0.08 ± 2% +0.0 0.09 ± 4% mpstat.cpu.all.soft%
29092 ± 3% +23.6% 35961 ± 8% cpuidle.POLL.time
8949 ± 6% +25.1% 11194 ± 8% cpuidle.POLL.usage
5203 ± 10% -17.3% 4300 ± 8% numa-meminfo.node0.PageTables
68326 ± 12% -14.9% 58153 ± 9% numa-meminfo.node0.SUnreclaim
1675674 ± 3% +27.7% 2140554 ± 10% numa-meminfo.node1.AnonPages.max
36696 ± 11% -28.2% 26346 ± 12% numa-meminfo.node1.KReclaimable
36696 ± 11% -28.2% 26346 ± 12% numa-meminfo.node1.SReclaimable
4212 ± 5% +14.0% 4802 ± 9% numa-meminfo.node2.PageTables
4683 +1.9% 4772 proc-vmstat.nr_page_table_pages
3.168e+09 +37.6% 4.359e+09 proc-vmstat.numa_hit
3.168e+09 +37.6% 4.359e+09 proc-vmstat.numa_local
3.171e+09 +37.5% 4.362e+09 proc-vmstat.pgalloc_normal
3.161e+09 +37.6% 4.349e+09 proc-vmstat.pgfault
3.168e+09 +37.6% 4.358e+09 proc-vmstat.pgfree
8.176e+08 +33.8% 1.094e+09 numa-numastat.node0.local_node
8.177e+08 +33.8% 1.094e+09 numa-numastat.node0.numa_hit
7.905e+08 +37.8% 1.089e+09 numa-numastat.node1.local_node
7.906e+08 +37.8% 1.09e+09 numa-numastat.node1.numa_hit
15636 ± 82% +111.0% 32997 ± 16% numa-numastat.node1.other_node
7.865e+08 +38.5% 1.09e+09 numa-numastat.node2.local_node
7.866e+08 +38.5% 1.09e+09 numa-numastat.node2.numa_hit
7.764e+08 +40.4% 1.09e+09 numa-numastat.node3.local_node
7.764e+08 +40.4% 1.09e+09 numa-numastat.node3.numa_hit
11.53 ± 6% +66.7% 19.22 sched_debug.cfs_rq:/.nr_spread_over.avg
30.62 ± 9% +89.9% 58.17 ± 39% sched_debug.cfs_rq:/.nr_spread_over.max
1.17 ± 41% +128.6% 2.67 ± 25% sched_debug.cfs_rq:/.nr_spread_over.min
6.13 ± 15% +54.3% 9.46 ± 15% sched_debug.cfs_rq:/.nr_spread_over.stddev
-4142704 -17.9% -3401780 sched_debug.cfs_rq:/.spread0.min
0.00 ± 4% +115.1% 0.00 ± 25% sched_debug.cpu.next_balance.stddev
2904 ± 5% -12.2% 2550 ± 3% sched_debug.cpu.nr_switches.stddev
2487 ± 8% -11.9% 2191 ± 6% sched_debug.cpu.sched_count.stddev
1261 ± 8% -11.9% 1111 ± 6% sched_debug.cpu.sched_goidle.stddev
317588 ± 4% -6.8% 296039 ± 5% numa-vmstat.node0.nr_anon_pages
1300 ± 10% -17.4% 1074 ± 8% numa-vmstat.node0.nr_page_table_pages
17081 ± 12% -14.9% 14537 ± 9% numa-vmstat.node0.nr_slab_unreclaimable
4.069e+08 +33.8% 5.443e+08 numa-vmstat.node0.numa_hit
4.069e+08 +33.8% 5.443e+08 numa-vmstat.node0.numa_local
9173 ± 11% -28.2% 6586 ± 12% numa-vmstat.node1.nr_slab_reclaimable
3.941e+08 +37.4% 5.417e+08 numa-vmstat.node1.numa_hit
3.94e+08 +37.5% 5.415e+08 numa-vmstat.node1.numa_local
104864 ± 11% +17.4% 123112 ± 4% numa-vmstat.node1.numa_other
1054 ± 5% +13.7% 1199 ± 9% numa-vmstat.node2.nr_page_table_pages
3.92e+08 +38.3% 5.42e+08 numa-vmstat.node2.numa_hit
3.919e+08 +38.3% 5.419e+08 numa-vmstat.node2.numa_local
1222 ± 12% +19.1% 1456 ± 12% numa-vmstat.node3.nr_page_table_pages
3.868e+08 +40.4% 5.433e+08 numa-vmstat.node3.numa_hit
3.867e+08 +40.5% 5.432e+08 numa-vmstat.node3.numa_local
22655 ± 4% +11.5% 25263 ± 6% softirqs.CPU0.SCHED
10045 ± 2% -9.0% 9139 ± 6% softirqs.CPU1.RCU
9348 ± 6% -10.4% 8377 softirqs.CPU12.RCU
9601 ± 6% -12.6% 8396 ± 2% softirqs.CPU18.RCU
22521 ± 4% -22.2% 17522 ± 12% softirqs.CPU2.SCHED
9465 ± 5% -9.1% 8601 ± 5% softirqs.CPU25.RCU
9393 ± 5% -10.6% 8398 ± 3% softirqs.CPU26.RCU
9708 ± 4% -13.6% 8388 ± 2% softirqs.CPU27.RCU
9338 ± 7% -10.6% 8351 ± 2% softirqs.CPU28.RCU
9386 ± 5% -14.2% 8055 ± 8% softirqs.CPU31.RCU
9057 ± 7% -12.3% 7948 ± 6% softirqs.CPU36.RCU
18746 ± 20% -24.2% 14204 ± 22% softirqs.CPU4.SCHED
13616 ± 8% +22.3% 16654 ± 6% softirqs.CPU44.SCHED
13697 ± 16% +23.2% 16879 ± 12% softirqs.CPU48.SCHED
10234 ± 6% -14.8% 8719 softirqs.CPU5.RCU
12718 ± 21% +48.8% 18923 ± 23% softirqs.CPU58.SCHED
14544 ± 11% +11.4% 16196 ± 11% softirqs.CPU61.SCHED
9145 ± 7% -12.6% 7988 ± 3% softirqs.CPU70.RCU
21831 ± 6% +21.9% 26619 ± 9% softirqs.CPU74.SCHED
11254 ± 26% -28.0% 8101 ± 6% softirqs.CPU89.RCU
47398543 ± 12% -6.4e+06 41043074 ± 6% syscalls.sys_close.noise.100%
60480226 ± 8% -6e+06 54474054 ± 4% syscalls.sys_close.noise.2%
58842094 ± 9% -6.2e+06 52672651 ± 4% syscalls.sys_close.noise.25%
60405154 ± 8% -6e+06 54392442 ± 4% syscalls.sys_close.noise.5%
55525400 ± 10% -6.4e+06 49123129 ± 4% syscalls.sys_close.noise.50%
51399889 ± 10% -6.4e+06 45005108 ± 5% syscalls.sys_close.noise.75%
18095 ± 2% +11.9% 20244 syscalls.sys_mmap.med
4739 ± 6% -12.2% 4160 ± 3% syscalls.sys_mmap.min
1.337e+09 ± 2% +1.6e+08 1.501e+09 syscalls.sys_mmap.noise.100%
1.437e+09 +1.6e+08 1.592e+09 syscalls.sys_mmap.noise.2%
1.421e+09 +1.6e+08 1.58e+09 syscalls.sys_mmap.noise.25%
1.436e+09 +1.6e+08 1.592e+09 syscalls.sys_mmap.noise.5%
1.395e+09 ± 2% +1.7e+08 1.56e+09 syscalls.sys_mmap.noise.50%
1.368e+09 ± 2% +1.7e+08 1.538e+09 syscalls.sys_mmap.noise.75%
2934931 ± 3% -19.9% 2350376 ± 4% syscalls.sys_write.max
2.862e+09 ± 6% -6.2e+08 2.246e+09 ± 10% syscalls.sys_write.noise.100%
2.88e+09 ± 6% -6.1e+08 2.266e+09 ± 10% syscalls.sys_write.noise.2%
2.878e+09 ± 6% -6.1e+08 2.263e+09 ± 10% syscalls.sys_write.noise.25%
2.88e+09 ± 6% -6.1e+08 2.266e+09 ± 10% syscalls.sys_write.noise.5%
2.873e+09 ± 6% -6.1e+08 2.259e+09 ± 10% syscalls.sys_write.noise.50%
2.867e+09 ± 6% -6.1e+08 2.253e+09 ± 10% syscalls.sys_write.noise.75%
3191 ± 15% +68.3% 5370 ± 36% interrupts.CPU102.NMI:Non-maskable_interrupts
3191 ± 15% +68.3% 5370 ± 36% interrupts.CPU102.PMI:Performance_monitoring_interrupts
3249 ± 18% +97.1% 6404 ± 23% interrupts.CPU11.NMI:Non-maskable_interrupts
3249 ± 18% +97.1% 6404 ± 23% interrupts.CPU11.PMI:Performance_monitoring_interrupts
6027 ± 15% -46.9% 3197 ± 56% interrupts.CPU115.NMI:Non-maskable_interrupts
6027 ± 15% -46.9% 3197 ± 56% interrupts.CPU115.PMI:Performance_monitoring_interrupts
32.50 ± 9% +141.5% 78.50 ± 79% interrupts.CPU119.RES:Rescheduling_interrupts
5795 ± 31% -36.3% 3689 ± 7% interrupts.CPU12.NMI:Non-maskable_interrupts
5795 ± 31% -36.3% 3689 ± 7% interrupts.CPU12.PMI:Performance_monitoring_interrupts
31.25 ± 13% +244.0% 107.50 ± 79% interrupts.CPU127.RES:Rescheduling_interrupts
6072 ± 16% -32.6% 4094 ± 37% interrupts.CPU133.NMI:Non-maskable_interrupts
6072 ± 16% -32.6% 4094 ± 37% interrupts.CPU133.PMI:Performance_monitoring_interrupts
29.75 ± 13% +97.5% 58.75 ± 43% interrupts.CPU141.RES:Rescheduling_interrupts
3232 ± 22% +69.4% 5475 ± 28% interrupts.CPU25.NMI:Non-maskable_interrupts
3232 ± 22% +69.4% 5475 ± 28% interrupts.CPU25.PMI:Performance_monitoring_interrupts
5761 ± 20% +32.2% 7615 ± 4% interrupts.CPU29.NMI:Non-maskable_interrupts
5761 ± 20% +32.2% 7615 ± 4% interrupts.CPU29.PMI:Performance_monitoring_interrupts
5831 ± 24% -38.8% 3566 ± 32% interrupts.CPU3.NMI:Non-maskable_interrupts
5831 ± 24% -38.8% 3566 ± 32% interrupts.CPU3.PMI:Performance_monitoring_interrupts
4257 ± 14% +48.0% 6301 ± 10% interrupts.CPU36.NMI:Non-maskable_interrupts
4257 ± 14% +48.0% 6301 ± 10% interrupts.CPU36.PMI:Performance_monitoring_interrupts
6022 ± 26% -41.0% 3553 ± 12% interrupts.CPU48.NMI:Non-maskable_interrupts
6022 ± 26% -41.0% 3553 ± 12% interrupts.CPU48.PMI:Performance_monitoring_interrupts
4806 ± 16% -35.7% 3091 ± 12% interrupts.CPU99.NMI:Non-maskable_interrupts
4806 ± 16% -35.7% 3091 ± 12% interrupts.CPU99.PMI:Performance_monitoring_interrupts
17.42 -11.8% 15.36 perf-stat.i.MPKI
1.005e+10 +44.7% 1.455e+10 perf-stat.i.branch-instructions
0.52 -0.0 0.48 perf-stat.i.branch-miss-rate%
50729291 +33.0% 67453908 perf-stat.i.branch-misses
35.51 +2.5 37.97 perf-stat.i.cache-miss-rate%
3.216e+08 +34.8% 4.335e+08 perf-stat.i.cache-misses
9.03e+08 +26.0% 1.138e+09 perf-stat.i.cache-references
4.05 -30.3% 2.82 perf-stat.i.cpi
675.62 -25.2% 505.55 perf-stat.i.cycles-between-cache-misses
14707774 ± 9% +34.4% 19774179 ± 2% perf-stat.i.dTLB-load-misses
1.565e+10 +42.2% 2.225e+10 perf-stat.i.dTLB-loads
73922631 +37.6% 1.017e+08 perf-stat.i.dTLB-store-misses
9.737e+09 +36.5% 1.329e+10 perf-stat.i.dTLB-stores
93.39 +1.8 95.21 perf-stat.i.iTLB-load-miss-rate%
31412146 +37.9% 43313640 perf-stat.i.iTLB-load-misses
2165412 -3.2% 2096990 perf-stat.i.iTLB-loads
5.18e+10 +43.0% 7.408e+10 perf-stat.i.instructions
1657 +4.0% 1724 perf-stat.i.instructions-per-iTLB-miss
0.25 +43.4% 0.35 perf-stat.i.ipc
255.15 +40.9% 359.51 perf-stat.i.metric.M/sec
10465904 +37.7% 14408416 perf-stat.i.minor-faults
6.04 -2.6 3.48 ± 2% perf-stat.i.node-load-miss-rate%
15615698 -25.0% 11707799 perf-stat.i.node-load-misses
2.551e+08 +37.9% 3.516e+08 perf-stat.i.node-loads
8900850 +39.9% 12451889 perf-stat.i.node-store-misses
39327623 +40.3% 55186081 perf-stat.i.node-stores
10465907 +37.7% 14408418 perf-stat.i.page-faults
17.43 -11.9% 15.36 perf-stat.overall.MPKI
0.50 -0.0 0.46 perf-stat.overall.branch-miss-rate%
35.62 +2.5 38.09 perf-stat.overall.cache-miss-rate%
4.05 -30.3% 2.82 perf-stat.overall.cpi
652.44 -26.0% 482.69 perf-stat.overall.cycles-between-cache-misses
93.55 +1.8 95.38 perf-stat.overall.iTLB-load-miss-rate%
1649 +3.7% 1710 perf-stat.overall.instructions-per-iTLB-miss
0.25 +43.4% 0.35 perf-stat.overall.ipc
5.77 -2.5 3.22 perf-stat.overall.node-load-miss-rate%
1484194 +3.8% 1540211 perf-stat.overall.path-length
1.002e+10 +44.7% 1.449e+10 perf-stat.ps.branch-instructions
50561382 +32.9% 67185149 perf-stat.ps.branch-misses
3.205e+08 +34.8% 4.319e+08 perf-stat.ps.cache-misses
8.999e+08 +26.0% 1.134e+09 perf-stat.ps.cache-references
14656335 ± 9% +34.4% 19698887 ± 2% perf-stat.ps.dTLB-load-misses
1.559e+10 +42.2% 2.217e+10 perf-stat.ps.dTLB-loads
73657440 +37.6% 1.013e+08 perf-stat.ps.dTLB-store-misses
9.703e+09 +36.5% 1.324e+10 perf-stat.ps.dTLB-stores
31302020 +37.9% 43150913 perf-stat.ps.iTLB-load-misses
2157706 -3.2% 2088881 perf-stat.ps.iTLB-loads
5.163e+10 +43.0% 7.38e+10 perf-stat.ps.instructions
10429014 +37.6% 14354103 perf-stat.ps.minor-faults
15559965 -25.0% 11664101 perf-stat.ps.node-load-misses
2.542e+08 +37.8% 3.503e+08 perf-stat.ps.node-loads
8869865 +39.9% 12405459 perf-stat.ps.node-store-misses
39189951 +40.3% 54978655 perf-stat.ps.node-stores
10429016 +37.6% 14354105 perf-stat.ps.page-faults
1.561e+13 +42.9% 2.23e+13 perf-stat.total.instructions
16.08 ± 14% -11.3 4.75 ± 11% perf-profile.calltrace.cycles-pp.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
6.45 ± 15% -5.0 1.43 ± 13% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
1.22 ± 10% -0.6 0.64 ± 9% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.page_add_new_anon_rmap.alloc_set_pte.finish_fault.do_fault
1.38 ± 11% -0.5 0.83 ± 9% perf-profile.calltrace.cycles-pp.page_add_new_anon_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
0.68 ± 14% +0.2 0.92 ± 9% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
0.84 ± 13% +0.3 1.13 ± 9% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
0.77 ± 13% +0.3 1.07 ± 11% perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
0.77 ± 13% +0.3 1.07 ± 13% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask
0.61 ± 13% +0.3 0.92 ± 12% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
0.63 ± 13% +0.3 0.95 ± 12% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
0.63 ± 13% +0.3 0.96 ± 12% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
0.97 ± 13% +0.3 1.30 ± 9% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
1.07 ± 13% +0.4 1.52 ± 10% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.17 ± 13% +0.5 1.64 ± 10% perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.10 ± 12% +0.6 1.73 ± 12% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range
0.00 +0.6 0.65 ± 10% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas
0.81 ± 10% +0.7 1.48 ± 9% perf-profile.calltrace.cycles-pp.page_counter_try_charge.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault
1.29 ± 12% +0.7 1.98 ± 12% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range
1.27 ± 13% +0.8 2.08 ± 12% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma
0.00 +0.8 0.84 ± 12% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.rmqueue_bulk.rmqueue.get_page_from_freelist
0.00 +0.8 0.85 ± 12% perf-profile.calltrace.cycles-pp._raw_spin_lock.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask
1.66 ± 13% +0.9 2.58 ± 12% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault
1.17 ± 10% +0.9 2.10 ± 9% perf-profile.calltrace.cycles-pp.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
1.86 ± 13% +1.0 2.85 ± 12% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault
2.06 ± 13% +1.1 3.12 ± 12% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault
2.37 ± 13% +1.2 3.53 ± 12% perf-profile.calltrace.cycles-pp.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.03 ± 14% +1.7 3.73 ± 14% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.zap_pte_range
2.04 ± 14% +1.7 3.75 ± 14% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range
4.69 ± 13% +2.5 7.17 ± 13% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas
5.01 ± 13% +2.6 7.63 ± 12% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
3.28 ± 11% +3.1 6.34 ± 9% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte
3.32 ± 10% +3.1 6.40 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault
7.62 ± 12% +3.3 10.96 ± 12% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
7.62 ± 12% +3.4 10.97 ± 12% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
7.62 ± 12% +3.4 10.98 ± 12% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
4.91 ± 11% +3.4 8.30 ± 9% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault.do_fault
8.80 ± 11% +3.4 12.22 ± 10% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
8.83 ± 11% +3.4 12.26 ± 10% perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
5.01 ± 11% +3.4 8.45 ± 9% perf-profile.calltrace.cycles-pp.lru_cache_add.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
8.27 ± 12% +3.7 11.95 ± 12% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
8.27 ± 12% +3.7 11.95 ± 12% perf-profile.calltrace.cycles-pp.__munmap
8.27 ± 12% +3.7 11.95 ± 12% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
16.11 ± 14% -11.3 4.76 ± 11% perf-profile.children.cycles-pp.mem_cgroup_charge
6.48 ± 15% -5.0 1.45 ± 13% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
2.04 ± 12% -1.3 0.74 ± 10% perf-profile.children.cycles-pp.__count_memcg_events
2.30 ± 11% -0.9 1.43 ± 10% perf-profile.children.cycles-pp.__mod_memcg_state
2.30 ± 16% -0.8 1.51 ± 10% perf-profile.children.cycles-pp.native_irq_return_iret
1.38 ± 11% -0.5 0.83 ± 10% perf-profile.children.cycles-pp.page_add_new_anon_rmap
0.90 ± 12% -0.5 0.37 ± 10% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
0.28 ± 16% -0.2 0.09 ± 13% perf-profile.children.cycles-pp.uncharge_page
0.40 ± 14% -0.2 0.23 ± 10% perf-profile.children.cycles-pp.mem_cgroup_uncharge_list
0.06 ± 14% +0.0 0.08 ± 6% perf-profile.children.cycles-pp.pte_alloc_one
0.07 ± 7% +0.0 0.09 ± 9% perf-profile.children.cycles-pp.find_vma
0.06 ± 14% +0.0 0.08 ± 12% perf-profile.children.cycles-pp.get_task_policy
0.07 ± 7% +0.0 0.09 ± 7% perf-profile.children.cycles-pp.__might_sleep
0.06 ± 13% +0.0 0.09 ± 9% perf-profile.children.cycles-pp._cond_resched
0.08 ± 6% +0.0 0.10 ± 12% perf-profile.children.cycles-pp.up_read
0.07 ± 11% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.unlock_page
0.07 ± 14% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.page_mapping
0.04 ± 58% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.vmacache_find
0.07 ± 7% +0.0 0.10 ± 13% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.06 ± 9% +0.0 0.09 ± 15% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.09 ± 14% +0.0 0.13 ± 10% perf-profile.children.cycles-pp.free_unref_page_commit
0.09 ± 14% +0.0 0.13 ± 14% perf-profile.children.cycles-pp.perf_swevent_get_recursion_context
0.01 ±173% +0.0 0.06 ± 9% perf-profile.children.cycles-pp.prep_new_page
0.09 ± 8% +0.0 0.14 ± 11% perf-profile.children.cycles-pp.__mod_zone_page_state
0.12 ± 16% +0.0 0.16 ± 10% perf-profile.children.cycles-pp.__list_add_valid
0.12 ± 14% +0.0 0.17 ± 9% perf-profile.children.cycles-pp.___might_sleep
0.08 ± 14% +0.0 0.12 ± 16% perf-profile.children.cycles-pp.cgroup_throttle_swaprate
0.03 ±100% +0.0 0.07 ± 11% perf-profile.children.cycles-pp.page_counter_uncharge
0.03 ±100% +0.0 0.07 ± 11% perf-profile.children.cycles-pp.page_counter_cancel
0.09 ± 8% +0.1 0.15 ± 14% perf-profile.children.cycles-pp.propagate_protected_usage
0.17 ± 8% +0.1 0.23 ± 11% perf-profile.children.cycles-pp.__mod_node_page_state
0.16 ± 13% +0.1 0.22 ± 12% perf-profile.children.cycles-pp.sync_regs
0.20 ± 14% +0.1 0.27 ± 11% perf-profile.children.cycles-pp.___perf_sw_event
0.24 ± 10% +0.1 0.32 ± 12% perf-profile.children.cycles-pp.__mod_lruvec_state
0.30 ± 14% +0.1 0.41 ± 8% perf-profile.children.cycles-pp.xas_load
0.34 ± 12% +0.1 0.47 ± 12% perf-profile.children.cycles-pp.__perf_sw_event
0.33 ± 13% +0.2 0.48 ± 10% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.68 ± 13% +0.2 0.92 ± 9% perf-profile.children.cycles-pp.find_get_entry
0.85 ± 13% +0.3 1.15 ± 9% perf-profile.children.cycles-pp.find_lock_entry
0.77 ± 13% +0.3 1.07 ± 11% perf-profile.children.cycles-pp.__irqentry_text_end
0.64 ± 13% +0.3 0.96 ± 12% perf-profile.children.cycles-pp.tlb_finish_mmu
0.98 ± 13% +0.3 1.31 ± 9% perf-profile.children.cycles-pp.shmem_getpage_gfp
1.08 ± 13% +0.4 1.52 ± 10% perf-profile.children.cycles-pp.shmem_fault
1.17 ± 12% +0.4 1.61 ± 12% perf-profile.children.cycles-pp.__list_del_entry_valid
1.17 ± 13% +0.5 1.64 ± 10% perf-profile.children.cycles-pp.__do_fault
0.82 ± 10% +0.7 1.50 ± 9% perf-profile.children.cycles-pp.page_counter_try_charge
1.25 ± 12% +0.7 1.96 ± 12% perf-profile.children.cycles-pp.free_pcppages_bulk
1.48 ± 12% +0.8 2.26 ± 12% perf-profile.children.cycles-pp.free_unref_page_list
1.29 ± 13% +0.8 2.10 ± 12% perf-profile.children.cycles-pp.rmqueue_bulk
1.68 ± 13% +0.9 2.61 ± 12% perf-profile.children.cycles-pp.rmqueue
1.17 ± 10% +0.9 2.12 ± 9% perf-profile.children.cycles-pp.try_charge
1.90 ± 13% +1.0 2.90 ± 12% perf-profile.children.cycles-pp.get_page_from_freelist
2.14 ± 13% +1.1 3.22 ± 12% perf-profile.children.cycles-pp.__alloc_pages_nodemask
2.38 ± 13% +1.2 3.54 ± 12% perf-profile.children.cycles-pp.alloc_pages_vma
2.97 ± 11% +1.3 4.23 ± 11% perf-profile.children.cycles-pp._raw_spin_lock
5.43 ± 13% +2.8 8.27 ± 12% perf-profile.children.cycles-pp.release_pages
5.64 ± 13% +2.9 8.59 ± 12% perf-profile.children.cycles-pp.tlb_flush_mmu
7.62 ± 12% +3.3 10.97 ± 12% perf-profile.children.cycles-pp.zap_pte_range
7.63 ± 12% +3.4 10.98 ± 12% perf-profile.children.cycles-pp.unmap_vmas
7.63 ± 12% +3.4 10.98 ± 12% perf-profile.children.cycles-pp.unmap_page_range
4.92 ± 11% +3.4 8.32 ± 9% perf-profile.children.cycles-pp.pagevec_lru_move_fn
8.81 ± 11% +3.4 12.22 ± 10% perf-profile.children.cycles-pp.alloc_set_pte
8.84 ± 11% +3.4 12.26 ± 10% perf-profile.children.cycles-pp.finish_fault
5.01 ± 11% +3.4 8.45 ± 9% perf-profile.children.cycles-pp.lru_cache_add
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.children.cycles-pp.__vm_munmap
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.children.cycles-pp.__x64_sys_munmap
8.27 ± 12% +3.7 11.94 ± 12% perf-profile.children.cycles-pp.unmap_region
8.27 ± 12% +3.7 11.95 ± 12% perf-profile.children.cycles-pp.__do_munmap
8.37 ± 12% +3.7 12.04 ± 12% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
8.36 ± 12% +3.7 12.04 ± 12% perf-profile.children.cycles-pp.do_syscall_64
8.27 ± 12% +3.7 11.95 ± 12% perf-profile.children.cycles-pp.__munmap
5.65 ± 12% +5.0 10.64 ± 11% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
6.27 ± 12% +5.7 12.01 ± 11% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
7.44 ± 15% -6.7 0.74 ± 13% perf-profile.self.cycles-pp.mem_cgroup_charge
6.43 ± 15% -5.0 1.43 ± 13% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
2.04 ± 12% -1.3 0.73 ± 10% perf-profile.self.cycles-pp.__count_memcg_events
2.29 ± 11% -0.9 1.42 ± 10% perf-profile.self.cycles-pp.__mod_memcg_state
2.30 ± 16% -0.8 1.51 ± 10% perf-profile.self.cycles-pp.native_irq_return_iret
0.28 ± 16% -0.2 0.08 ± 15% perf-profile.self.cycles-pp.uncharge_page
0.06 ± 9% +0.0 0.08 ± 10% perf-profile.self.cycles-pp.__might_sleep
0.07 ± 11% +0.0 0.10 ± 8% perf-profile.self.cycles-pp.up_read
0.06 ± 6% +0.0 0.09 ± 14% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.07 ± 7% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.__perf_sw_event
0.07 ± 7% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.__mod_lruvec_state
0.06 ± 9% +0.0 0.08 ± 12% perf-profile.self.cycles-pp.mem_cgroup_page_lruvec
0.04 ± 58% +0.0 0.07 ± 7% perf-profile.self.cycles-pp.vmacache_find
0.08 ± 14% +0.0 0.10 ± 12% perf-profile.self.cycles-pp.alloc_pages_vma
0.07 ± 14% +0.0 0.10 ± 10% perf-profile.self.cycles-pp.unlock_page
0.07 ± 14% +0.0 0.10 ± 10% perf-profile.self.cycles-pp.page_mapping
0.06 ± 11% +0.0 0.09 ± 11% perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.10 ± 12% +0.0 0.14 ± 11% perf-profile.self.cycles-pp.exc_page_fault
0.04 ± 59% +0.0 0.08 ± 10% perf-profile.self.cycles-pp.get_task_policy
0.10 ± 11% +0.0 0.13 ± 12% perf-profile.self.cycles-pp.find_lock_entry
0.09 ± 8% +0.0 0.13 ± 14% perf-profile.self.cycles-pp.__mod_zone_page_state
0.08 ± 13% +0.0 0.12 ± 13% perf-profile.self.cycles-pp.perf_swevent_get_recursion_context
0.13 ± 9% +0.0 0.17 ± 13% perf-profile.self.cycles-pp.do_user_addr_fault
0.10 ± 15% +0.0 0.14 ± 11% perf-profile.self.cycles-pp.lru_cache_add
0.12 ± 10% +0.0 0.16 ± 9% perf-profile.self.cycles-pp.alloc_set_pte
0.06 ± 13% +0.0 0.10 ± 14% perf-profile.self.cycles-pp.cgroup_throttle_swaprate
0.01 ±173% +0.0 0.06 ± 9% perf-profile.self.cycles-pp.free_unref_page_prepare
0.12 ± 13% +0.0 0.16 ± 9% perf-profile.self.cycles-pp.___might_sleep
0.14 ± 12% +0.0 0.18 ± 12% perf-profile.self.cycles-pp.___perf_sw_event
0.10 ± 15% +0.0 0.15 ± 10% perf-profile.self.cycles-pp.__list_add_valid
0.14 ± 11% +0.0 0.18 ± 9% perf-profile.self.cycles-pp.__alloc_pages_nodemask
0.01 ±173% +0.0 0.06 ± 11% perf-profile.self.cycles-pp.page_counter_cancel
0.09 ± 7% +0.1 0.14 ± 12% perf-profile.self.cycles-pp.propagate_protected_usage
0.17 ± 9% +0.1 0.22 ± 11% perf-profile.self.cycles-pp.__mod_node_page_state
0.03 ±100% +0.1 0.08 ± 10% perf-profile.self.cycles-pp.memcg_check_events
0.14 ± 11% +0.1 0.19 ± 11% perf-profile.self.cycles-pp.sync_regs
0.18 ± 11% +0.1 0.25 ± 11% perf-profile.self.cycles-pp.do_fault
0.23 ± 15% +0.1 0.31 ± 12% perf-profile.self.cycles-pp.rmqueue
0.22 ± 11% +0.1 0.31 ± 12% perf-profile.self.cycles-pp.handle_mm_fault
0.18 ± 11% +0.1 0.27 ± 10% perf-profile.self.cycles-pp.page_remove_rmap
0.11 ± 12% +0.1 0.20 ± 12% perf-profile.self.cycles-pp.shmem_fault
0.26 ± 12% +0.1 0.36 ± 8% perf-profile.self.cycles-pp.xas_load
0.38 ± 14% +0.1 0.50 ± 10% perf-profile.self.cycles-pp.find_get_entry
0.35 ± 12% +0.1 0.48 ± 9% perf-profile.self.cycles-pp.__handle_mm_fault
0.35 ± 11% +0.1 0.49 ± 11% perf-profile.self.cycles-pp.release_pages
0.43 ± 13% +0.1 0.58 ± 7% perf-profile.self.cycles-pp.__pagevec_lru_add_fn
0.32 ± 12% +0.2 0.48 ± 11% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.35 ± 10% +0.3 0.62 ± 9% perf-profile.self.cycles-pp.try_charge
0.77 ± 13% +0.3 1.07 ± 11% perf-profile.self.cycles-pp.__irqentry_text_end
0.81 ± 12% +0.3 1.14 ± 12% perf-profile.self.cycles-pp.free_pcppages_bulk
1.16 ± 12% +0.4 1.60 ± 12% perf-profile.self.cycles-pp.__list_del_entry_valid
0.62 ± 8% +0.5 1.10 ± 9% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.74 ± 10% +0.6 1.36 ± 9% perf-profile.self.cycles-pp.page_counter_try_charge
1.47 ± 10% +0.7 2.22 ± 10% perf-profile.self.cycles-pp.zap_pte_range
2.21 ± 12% +0.8 3.06 ± 10% perf-profile.self.cycles-pp.testcase
6.27 ± 12% +5.7 12.01 ± 11% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
will-it-scale.72.processes
1.5e+07 +----------------------------------------------------------------+
1.45e+07 |-+ O O O O O O |
| O O O O O O |
1.4e+07 |-+ O |
1.35e+07 |-+ |
| |
1.3e+07 |-+ |
1.25e+07 |-+ |
1.2e+07 |-+ |
| |
1.15e+07 |-+ |
1.1e+07 |-+ |
| ..+... ..+.... |
1.05e+07 |....+...+....+....+...+....+....+...+.. +.. + |
1e+07 +----------------------------------------------------------------+
will-it-scale.per_process_ops
210000 +------------------------------------------------------------------+
| O |
200000 |-+ O O O O O O O O O O O |
| O |
190000 |-+ |
| |
180000 |-+ |
| |
170000 |-+ |
| |
160000 |-+ |
| |
150000 |-+ |
|....+....+...+....+....+....+....+...+....+....+....+...+ |
140000 +------------------------------------------------------------------+
will-it-scale.workload
1.5e+07 +----------------------------------------------------------------+
1.45e+07 |-+ O O O O O O |
| O O O O O O |
1.4e+07 |-+ O |
1.35e+07 |-+ |
| |
1.3e+07 |-+ |
1.25e+07 |-+ |
1.2e+07 |-+ |
| |
1.15e+07 |-+ |
1.1e+07 |-+ |
| ..+... ..+.... |
1.05e+07 |....+...+....+....+...+....+....+...+.. +.. + |
1e+07 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-ivb-2ep1: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/page_fault2/will-it-scale/0x42e
commit:
fa02fcd94b ("Merge tag 'media/v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media")
4df910620b ("mm: memcg: relayout structure mem_cgroup to avoid cache interference")
fa02fcd94b0c8dff 4df910620bebb5cfe234af16ac8
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
5:4 18% 6:4 perf-profile.calltrace.cycles-pp.error_entry.testcase
5:4 18% 6:4 perf-profile.children.cycles-pp.error_entry
4:4 15% 5:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
0.96 ± 16% -0.3 0.71 ± 14% perf-profile.calltrace.cycles-pp.page_add_new_anon_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
0.96 ± 15% -0.5 0.47 ± 6% perf-profile.children.cycles-pp.__count_memcg_events
1.42 ± 14% -0.4 1.05 ± 10% perf-profile.children.cycles-pp.__mod_memcg_state
0.97 ± 16% -0.3 0.72 ± 14% perf-profile.children.cycles-pp.page_add_new_anon_rmap
0.43 ± 10% -0.2 0.26 ± 5% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
0.61 ± 12% -0.2 0.44 ± 10% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
0.06 ± 11% +0.0 0.08 ± 10% perf-profile.children.cycles-pp.worker_thread
0.07 ± 7% +0.1 0.12 ± 32% perf-profile.children.cycles-pp.ret_from_fork
0.07 ± 7% +0.1 0.12 ± 32% perf-profile.children.cycles-pp.kthread
0.52 ± 12% +0.1 0.60 ± 10% perf-profile.children.cycles-pp.xas_load
0.95 ± 14% -0.5 0.47 ± 7% perf-profile.self.cycles-pp.__count_memcg_events
1.40 ± 15% -0.4 1.04 ± 10% perf-profile.self.cycles-pp.__mod_memcg_state
0.60 ± 12% -0.2 0.42 ± 10% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
0.41 ± 9% -0.1 0.34 ± 12% perf-profile.self.cycles-pp.mem_cgroup_charge
0.29 ± 13% +0.1 0.35 ± 9% perf-profile.self.cycles-pp.handle_mm_fault
0.43 ± 11% +0.1 0.51 ± 9% perf-profile.self.cycles-pp.xas_load
1.06 ± 11% +0.2 1.29 ± 10% perf-profile.self.cycles-pp.free_pages_and_swap_cache
1.67 ± 11% +0.3 1.94 ± 10% perf-profile.self.cycles-pp.free_pcppages_bulk
0.32 ± 25% +0.7 1.02 ± 71% perf-profile.self.cycles-pp.alloc_set_pte
5158061 +2.0% 5263503 will-it-scale.24.processes
214918 +2.0% 219312 will-it-scale.per_process_ops
5158061 +2.0% 5263503 will-it-scale.workload
31247 ± 57% -64.6% 11065 ±169% numa-numastat.node1.other_node
14458 ± 5% -10.2% 12989 ± 7% numa-vmstat.node0.nr_slab_unreclaimable
11137 ± 4% +8.3% 12057 ± 3% numa-vmstat.node1.nr_slab_reclaimable
12240 ± 8% +18.3% 14486 ± 5% numa-vmstat.node1.nr_slab_unreclaimable
2484583 ± 4% +20.9% 3003853 ± 10% sched_debug.cfs_rq:/.min_vruntime.max
1911501 ± 14% +33.8% 2557320 ± 13% sched_debug.cfs_rq:/.spread0.max
3425 ± 11% -11.9% 3018 ± 2% sched_debug.cpu.nr_switches.stddev
1.557e+09 +2.0% 1.589e+09 proc-vmstat.numa_hit
1.557e+09 +2.0% 1.589e+09 proc-vmstat.numa_local
1.557e+09 +2.0% 1.589e+09 proc-vmstat.pgalloc_normal
1.553e+09 +2.0% 1.584e+09 proc-vmstat.pgfault
1.557e+09 +2.0% 1.589e+09 proc-vmstat.pgfree
57832 ± 5% -10.2% 51959 ± 7% numa-meminfo.node0.SUnreclaim
101264 ± 4% -8.8% 92360 ± 5% numa-meminfo.node0.Slab
44549 ± 4% +8.3% 48229 ± 3% numa-meminfo.node1.KReclaimable
44549 ± 4% +8.3% 48229 ± 3% numa-meminfo.node1.SReclaimable
48963 ± 8% +18.3% 57947 ± 5% numa-meminfo.node1.SUnreclaim
93513 ± 6% +13.5% 106178 ± 4% numa-meminfo.node1.Slab
798.75 ± 6% +12.5% 898.75 ± 7% slabinfo.file_lock_cache.active_objs
798.75 ± 6% +12.5% 898.75 ± 7% slabinfo.file_lock_cache.num_objs
2208 ± 2% +14.7% 2533 ± 8% slabinfo.fsnotify_mark_connector.active_objs
2208 ± 2% +14.7% 2533 ± 8% slabinfo.fsnotify_mark_connector.num_objs
1848 ± 4% +9.0% 2014 ± 5% slabinfo.kmalloc-rcl-96.active_objs
1848 ± 4% +9.0% 2014 ± 5% slabinfo.kmalloc-rcl-96.num_objs
0.03 ± 5% +38.6% 0.04 ± 7% perf-sched.sch_delay.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
0.04 ± 7% +29.9% 0.06 ± 12% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
0.04 ± 7% +46.9% 0.05 ± 17% perf-sched.sch_delay.max.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
0.80 ±158% -94.0% 0.05 ± 10% perf-sched.sch_delay.max.ms.preempt_schedule_common._cond_resched.stop_one_cpu.__set_cpus_allowed_ptr.sched_setaffinity
0.04 ± 15% +34.5% 0.05 ± 5% perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork
3290 ± 50% -60.9% 1286 ± 33% perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork
0.37 ± 48% -44.6% 0.21 ± 5% perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
3290 ± 50% -60.9% 1286 ± 33% perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork
15251 ± 17% -20.0% 12195 ± 19% softirqs.CPU0.RCU
25042 ± 17% -38.4% 15437 ± 28% softirqs.CPU2.SCHED
23971 ± 11% -19.1% 19385 ± 4% softirqs.CPU21.SCHED
25139 ± 7% -12.0% 22110 ± 10% softirqs.CPU22.SCHED
10852 ± 13% -24.4% 8203 ± 11% softirqs.CPU24.SCHED
20926 ± 20% +44.8% 30292 ± 12% softirqs.CPU26.SCHED
10419 ± 6% -13.7% 8991 softirqs.CPU34.RCU
21252 ± 12% +23.3% 26201 ± 4% softirqs.CPU45.SCHED
4558 ± 35% -39.5% 2760 ± 60% interrupts.CPU0.NMI:Non-maskable_interrupts
4558 ± 35% -39.5% 2760 ± 60% interrupts.CPU0.PMI:Performance_monitoring_interrupts
171.50 ± 17% +28.1% 219.75 ± 12% interrupts.CPU2.RES:Rescheduling_interrupts
150.50 ± 4% +33.9% 201.50 ± 7% interrupts.CPU21.RES:Rescheduling_interrupts
944.00 ± 8% +51.4% 1429 ± 31% interrupts.CPU22.CAL:Function_call_interrupts
143.50 ± 14% +31.9% 189.25 ± 7% interrupts.CPU22.RES:Rescheduling_interrupts
187.75 ± 13% -30.2% 131.00 ± 18% interrupts.CPU26.RES:Rescheduling_interrupts
6516 ± 24% -52.8% 3074 ± 7% interrupts.CPU31.NMI:Non-maskable_interrupts
6516 ± 24% -52.8% 3074 ± 7% interrupts.CPU31.PMI:Performance_monitoring_interrupts
6557 ± 23% -37.7% 4084 ± 26% interrupts.CPU36.NMI:Non-maskable_interrupts
6557 ± 23% -37.7% 4084 ± 26% interrupts.CPU36.PMI:Performance_monitoring_interrupts
819.25 ± 18% +42.9% 1170 ± 15% interrupts.CPU40.CAL:Function_call_interrupts
935.25 ± 5% +12.2% 1049 ± 7% interrupts.CPU42.CAL:Function_call_interrupts
5249 ± 28% -46.8% 2793 ± 41% interrupts.CPU5.NMI:Non-maskable_interrupts
5249 ± 28% -46.8% 2793 ± 41% interrupts.CPU5.PMI:Performance_monitoring_interrupts
1099 ± 4% -15.9% 924.50 ± 6% interrupts.CPU8.CAL:Function_call_interrupts
27.32 -1.2% 27.00 perf-stat.i.MPKI
4.61e+09 +2.1% 4.705e+09 perf-stat.i.branch-instructions
66.43 +0.7 67.16 perf-stat.i.cache-miss-rate%
4.384e+08 +2.0% 4.471e+08 perf-stat.i.cache-misses
6.595e+08 +0.9% 6.655e+08 perf-stat.i.cache-references
2.99 -1.9% 2.93 perf-stat.i.cpi
165.29 -2.0% 162.05 perf-stat.i.cycles-between-cache-misses
8.043e+09 +1.8% 8.187e+09 perf-stat.i.dTLB-loads
5.408e+09 +2.1% 5.524e+09 perf-stat.i.dTLB-stores
5985395 +1.7% 6086521 perf-stat.i.iTLB-load-misses
2.413e+10 +2.1% 2.464e+10 perf-stat.i.instructions
0.34 +1.9% 0.34 perf-stat.i.ipc
403.98 +1.9% 411.70 perf-stat.i.metric.M/sec
5141091 +2.1% 5246543 perf-stat.i.minor-faults
4.65 -0.6 4.07 perf-stat.i.node-load-miss-rate%
13596718 -11.5% 12039508 perf-stat.i.node-load-misses
2.839e+08 +1.7% 2.887e+08 perf-stat.i.node-loads
3.06 -0.5 2.57 perf-stat.i.node-store-miss-rate%
9836492 -14.4% 8417873 perf-stat.i.node-store-misses
3.192e+08 +2.3% 3.266e+08 perf-stat.i.node-stores
5141100 +2.1% 5246556 perf-stat.i.page-faults
27.32 -1.2% 27.00 perf-stat.overall.MPKI
66.48 +0.7 67.19 perf-stat.overall.cache-miss-rate%
2.99 -1.9% 2.93 perf-stat.overall.cpi
164.46 -1.8% 161.48 perf-stat.overall.cycles-between-cache-misses
0.33 +2.0% 0.34 perf-stat.overall.ipc
4.57 -0.6 4.00 perf-stat.overall.node-load-miss-rate%
2.99 -0.5 2.51 perf-stat.overall.node-store-miss-rate%
4.594e+09 +2.1% 4.69e+09 perf-stat.ps.branch-instructions
4.369e+08 +2.0% 4.456e+08 perf-stat.ps.cache-misses
6.572e+08 +0.9% 6.632e+08 perf-stat.ps.cache-references
8.016e+09 +1.8% 8.159e+09 perf-stat.ps.dTLB-loads
5.39e+09 +2.1% 5.505e+09 perf-stat.ps.dTLB-stores
5965014 +1.7% 6065670 perf-stat.ps.iTLB-load-misses
2.405e+10 +2.1% 2.456e+10 perf-stat.ps.instructions
5123634 +2.0% 5228497 perf-stat.ps.minor-faults
13550037 -11.5% 11998434 perf-stat.ps.node-load-misses
2.829e+08 +1.7% 2.877e+08 perf-stat.ps.node-loads
9802170 -14.4% 8389142 perf-stat.ps.node-store-misses
3.182e+08 +2.3% 3.255e+08 perf-stat.ps.node-stores
5123644 +2.0% 5228510 perf-stat.ps.page-faults
7.271e+12 +2.1% 7.426e+12 perf-stat.total.instructions
***************************************************************************************************
lkp-hsw-4ex1: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/page_fault2/will-it-scale/0x16
commit:
fa02fcd94b ("Merge tag 'media/v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media")
4df910620b ("mm: memcg: relayout structure mem_cgroup to avoid cache interference")
fa02fcd94b0c8dff 4df910620bebb5cfe234af16ac8
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
3:4 5% 3:4 perf-profile.calltrace.cycles-pp.error_entry.testcase
3:4 6% 3:4 perf-profile.children.cycles-pp.error_entry
2:4 5% 3:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
10826219 +12.1% 12136182 will-it-scale.144.processes
75181 +12.1% 84278 will-it-scale.per_process_ops
6427 -0.8% 6374 will-it-scale.time.minor_page_faults
10826219 +12.1% 12136182 will-it-scale.workload
10304 ± 3% +12.7% 11616 ± 3% slabinfo.skbuff_head_cache.active_objs
10320 ± 3% +12.6% 11616 ± 3% slabinfo.skbuff_head_cache.num_objs
3952279 ± 6% -10.7% 3528860 ± 5% numa-meminfo.node0.AnonPages.max
3835803 ± 4% -10.6% 3428691 ± 3% numa-meminfo.node1.AnonPages.max
40533 ± 17% -25.1% 30339 ± 10% numa-meminfo.node3.KReclaimable
40533 ± 17% -25.1% 30339 ± 10% numa-meminfo.node3.SReclaimable
52626 ± 15% -21.2% 41452 ± 2% numa-meminfo.node3.SUnreclaim
93160 ± 15% -22.9% 71791 ± 4% numa-meminfo.node3.Slab
22237 ± 2% +3.7% 23055 proc-vmstat.nr_active_anon
22237 ± 2% +3.7% 23055 proc-vmstat.nr_zone_active_anon
3.267e+09 +12.2% 3.664e+09 proc-vmstat.numa_hit
3.267e+09 +12.2% 3.664e+09 proc-vmstat.numa_local
3.268e+09 +12.2% 3.667e+09 proc-vmstat.pgalloc_normal
3.255e+09 +12.2% 3.653e+09 proc-vmstat.pgfault
3.268e+09 +12.2% 3.667e+09 proc-vmstat.pgfree
8.355e+08 +9.7% 9.168e+08 numa-numastat.node0.local_node
8.355e+08 +9.7% 9.168e+08 numa-numastat.node0.numa_hit
8.106e+08 +13.1% 9.166e+08 numa-numastat.node1.local_node
8.106e+08 +13.1% 9.166e+08 numa-numastat.node1.numa_hit
8.092e+08 +13.4% 9.173e+08 numa-numastat.node2.local_node
8.092e+08 +13.4% 9.174e+08 numa-numastat.node2.numa_hit
8.124e+08 +12.8% 9.163e+08 numa-numastat.node3.local_node
8.124e+08 +12.8% 9.163e+08 numa-numastat.node3.numa_hit
4.171e+08 +10.0% 4.587e+08 numa-vmstat.node0.numa_hit
4.171e+08 +10.0% 4.587e+08 numa-vmstat.node0.numa_local
2026 +43.5% 2908 ± 25% numa-vmstat.node1.nr_mapped
4.054e+08 +13.3% 4.593e+08 numa-vmstat.node1.numa_hit
4.053e+08 +13.3% 4.592e+08 numa-vmstat.node1.numa_local
4.052e+08 +13.5% 4.598e+08 numa-vmstat.node2.numa_hit
4.051e+08 +13.5% 4.597e+08 numa-vmstat.node2.numa_local
10134 ± 17% -25.2% 7584 ± 10% numa-vmstat.node3.nr_slab_reclaimable
13156 ± 15% -21.2% 10362 ± 2% numa-vmstat.node3.nr_slab_unreclaimable
4.07e+08 +12.6% 4.582e+08 numa-vmstat.node3.numa_hit
4.068e+08 +12.6% 4.581e+08 numa-vmstat.node3.numa_local
585.75 ± 9% -21.5% 459.71 ± 3% sched_debug.cfs_rq:/.nr_spread_over.max
77.50 ± 8% -17.3% 64.13 ± 6% sched_debug.cfs_rq:/.nr_spread_over.stddev
608344 ± 41% +64.1% 998173 ± 12% sched_debug.cfs_rq:/.spread0.avg
811885 ± 31% +47.7% 1199394 ± 10% sched_debug.cfs_rq:/.spread0.max
-554391 -65.3% -192546 sched_debug.cfs_rq:/.spread0.min
2652761 ± 24% -28.7% 1890128 ± 22% sched_debug.cpu.avg_idle.max
233778 ± 17% -21.8% 182852 ± 22% sched_debug.cpu.avg_idle.stddev
21294 ± 37% -37.9% 13231 ± 43% sched_debug.cpu.max_idle_balance_cost.stddev
18116 ± 11% -19.0% 14665 ± 8% sched_debug.cpu.nr_switches.max
2627 ± 3% -11.9% 2315 ± 5% sched_debug.cpu.nr_switches.stddev
-21.46 -35.9% -13.75 sched_debug.cpu.nr_uninterruptible.min
14236 ± 14% -21.1% 11229 ± 8% sched_debug.cpu.sched_count.max
2290 ± 3% -13.4% 1983 ± 3% sched_debug.cpu.sched_count.stddev
7163 ± 7% -29.0% 5084 ± 6% sched_debug.cpu.ttwu_count.max
1060 ± 3% -17.4% 876.21 ± 3% sched_debug.cpu.ttwu_count.stddev
799.30 ± 8% -20.7% 634.07 ± 6% sched_debug.cpu.ttwu_local.stddev
56345 ± 20% -32.8% 37847 ± 10% syscalls.sys_close.max
10214940 ±129% -96.2% 386927 ± 29% syscalls.sys_mmap.max
3.599e+09 ± 4% -6.7e+08 2.933e+09 ± 4% syscalls.sys_mmap.noise.100%
3.664e+09 ± 4% -6.5e+08 3.019e+09 ± 4% syscalls.sys_mmap.noise.2%
3.655e+09 ± 4% -6.4e+08 3.011e+09 ± 4% syscalls.sys_mmap.noise.25%
3.664e+09 ± 4% -6.5e+08 3.019e+09 ± 4% syscalls.sys_mmap.noise.5%
3.637e+09 ± 4% -6.5e+08 2.988e+09 ± 4% syscalls.sys_mmap.noise.50%
3.616e+09 ± 4% -6.6e+08 2.96e+09 ± 4% syscalls.sys_mmap.noise.75%
3.562e+08 ± 4% +1.2e+08 4.787e+08 ± 31% syscalls.sys_openat.noise.100%
2119808 ± 57% -60.4% 840502 ± 24% syscalls.sys_write.max
2.979e+09 ± 44% -1.3e+09 1.709e+09 ± 5% syscalls.sys_write.noise.100%
3.002e+09 ± 44% -1.3e+09 1.735e+09 ± 5% syscalls.sys_write.noise.2%
2.999e+09 ± 44% -1.3e+09 1.733e+09 ± 5% syscalls.sys_write.noise.25%
3.002e+09 ± 44% -1.3e+09 1.735e+09 ± 5% syscalls.sys_write.noise.5%
2.995e+09 ± 44% -1.3e+09 1.728e+09 ± 5% syscalls.sys_write.noise.50%
2.989e+09 ± 44% -1.3e+09 1.723e+09 ± 5% syscalls.sys_write.noise.75%
134582 -16.9% 111883 ± 30% interrupts.CAL:Function_call_interrupts
5042 ± 34% +45.0% 7311 ± 18% interrupts.CPU0.NMI:Non-maskable_interrupts
5042 ± 34% +45.0% 7311 ± 18% interrupts.CPU0.PMI:Performance_monitoring_interrupts
276.00 ± 4% +21.8% 336.25 ± 12% interrupts.CPU0.RES:Rescheduling_interrupts
4063 +40.9% 5726 ± 28% interrupts.CPU105.NMI:Non-maskable_interrupts
4063 +40.9% 5726 ± 28% interrupts.CPU105.PMI:Performance_monitoring_interrupts
6629 ± 23% -39.0% 4045 ± 2% interrupts.CPU108.NMI:Non-maskable_interrupts
6629 ± 23% -39.0% 4045 ± 2% interrupts.CPU108.PMI:Performance_monitoring_interrupts
61.25 ± 94% -75.1% 15.25 ± 46% interrupts.CPU116.RES:Rescheduling_interrupts
168.50 ± 13% -43.8% 94.75 ± 18% interrupts.CPU126.RES:Rescheduling_interrupts
1522 ± 29% -58.1% 637.75 ± 30% interrupts.CPU129.CAL:Function_call_interrupts
79.25 ± 41% -60.9% 31.00 ± 31% interrupts.CPU131.RES:Rescheduling_interrupts
5031 ± 33% +53.7% 7733 ± 8% interrupts.CPU132.NMI:Non-maskable_interrupts
5031 ± 33% +53.7% 7733 ± 8% interrupts.CPU132.PMI:Performance_monitoring_interrupts
4054 +52.6% 6187 ± 27% interrupts.CPU137.NMI:Non-maskable_interrupts
4054 +52.6% 6187 ± 27% interrupts.CPU137.PMI:Performance_monitoring_interrupts
8008 -21.8% 6263 ± 21% interrupts.CPU21.NMI:Non-maskable_interrupts
8008 -21.8% 6263 ± 21% interrupts.CPU21.PMI:Performance_monitoring_interrupts
5988 ± 18% -25.2% 4479 ± 17% interrupts.CPU25.NMI:Non-maskable_interrupts
5988 ± 18% -25.2% 4479 ± 17% interrupts.CPU25.PMI:Performance_monitoring_interrupts
167.00 +13.8% 190.00 ± 3% interrupts.CPU30.RES:Rescheduling_interrupts
170.25 ± 8% +16.6% 198.50 ± 10% interrupts.CPU31.RES:Rescheduling_interrupts
908.75 ± 14% -31.4% 623.50 ± 45% interrupts.CPU39.CAL:Function_call_interrupts
4024 +44.1% 5799 ± 27% interrupts.CPU45.NMI:Non-maskable_interrupts
4024 +44.1% 5799 ± 27% interrupts.CPU45.PMI:Performance_monitoring_interrupts
4612 ± 21% +52.6% 7039 ± 23% interrupts.CPU54.NMI:Non-maskable_interrupts
4612 ± 21% +52.6% 7039 ± 23% interrupts.CPU54.PMI:Performance_monitoring_interrupts
175.75 ± 14% -43.5% 99.25 ± 27% interrupts.CPU54.RES:Rescheduling_interrupts
304.25 ± 82% -65.6% 104.75 ± 25% interrupts.CPU55.RES:Rescheduling_interrupts
6694 ± 19% -30.1% 4676 ± 23% interrupts.CPU58.NMI:Non-maskable_interrupts
6694 ± 19% -30.1% 4676 ± 23% interrupts.CPU58.PMI:Performance_monitoring_interrupts
1982 ± 26% -46.9% 1053 ± 19% interrupts.CPU73.CAL:Function_call_interrupts
5503 ± 27% -26.9% 4022 interrupts.CPU80.NMI:Non-maskable_interrupts
5503 ± 27% -26.9% 4022 interrupts.CPU80.PMI:Performance_monitoring_interrupts
6543 ± 23% -23.3% 5020 ± 34% interrupts.CPU88.NMI:Non-maskable_interrupts
6543 ± 23% -23.3% 5020 ± 34% interrupts.CPU88.PMI:Performance_monitoring_interrupts
6059 ± 33% -33.8% 4011 interrupts.CPU90.NMI:Non-maskable_interrupts
6059 ± 33% -33.8% 4011 interrupts.CPU90.PMI:Performance_monitoring_interrupts
16073 +14.0% 18329 ± 4% interrupts.RES:Rescheduling_interrupts
2.276e+10 +5.6% 2.403e+10 perf-stat.i.branch-instructions
57785027 +9.2% 63125578 perf-stat.i.branch-misses
33.59 +1.3 34.86 perf-stat.i.cache-miss-rate%
2.99e+08 +11.5% 3.332e+08 perf-stat.i.cache-misses
8.857e+08 +7.5% 9.52e+08 perf-stat.i.cache-references
4.00 -6.1% 3.76 perf-stat.i.cpi
1425 -10.3% 1277 perf-stat.i.cycles-between-cache-misses
38840111 ± 19% +45.7% 56590709 ± 18% perf-stat.i.dTLB-load-misses
2.805e+10 +6.9% 2.999e+10 perf-stat.i.dTLB-loads
86765075 ± 2% +17.0% 1.015e+08 perf-stat.i.dTLB-store-misses
9.928e+09 +12.1% 1.113e+10 perf-stat.i.dTLB-stores
32364704 +12.3% 36335103 perf-stat.i.iTLB-load-misses
1.029e+11 +6.3% 1.094e+11 perf-stat.i.instructions
3187 -5.5% 3013 perf-stat.i.instructions-per-iTLB-miss
0.25 +6.2% 0.27 perf-stat.i.ipc
429.12 +7.4% 460.92 perf-stat.i.metric.M/sec
10735651 +12.3% 12059205 perf-stat.i.minor-faults
5.53 -1.9 3.59 perf-stat.i.node-load-miss-rate%
12242331 -30.4% 8517490 perf-stat.i.node-load-misses
2.309e+08 +12.7% 2.603e+08 perf-stat.i.node-loads
14.81 -0.7 14.14 perf-stat.i.node-store-miss-rate%
8057655 +10.5% 8902557 perf-stat.i.node-store-misses
46809246 +16.6% 54566564 perf-stat.i.node-stores
10735652 +12.3% 12059206 perf-stat.i.page-faults
0.25 +0.0 0.26 perf-stat.overall.branch-miss-rate%
33.79 +1.2 35.02 perf-stat.overall.cache-miss-rate%
4.00 -6.0% 3.76 perf-stat.overall.cpi
1375 -10.3% 1233 perf-stat.overall.cycles-between-cache-misses
3182 -5.3% 3012 perf-stat.overall.instructions-per-iTLB-miss
0.25 +6.3% 0.27 perf-stat.overall.ipc
5.01 -1.9 3.16 perf-stat.overall.node-load-miss-rate%
14.69 -0.7 14.03 perf-stat.overall.node-store-miss-rate%
2874609 -5.2% 2726260 perf-stat.overall.path-length
2.266e+10 +5.5% 2.391e+10 perf-stat.ps.branch-instructions
57209629 +9.3% 62547538 perf-stat.ps.branch-misses
2.977e+08 +11.4% 3.318e+08 perf-stat.ps.cache-misses
8.813e+08 +7.5% 9.475e+08 perf-stat.ps.cache-references
38567680 ± 19% +45.8% 56225398 ± 18% perf-stat.ps.dTLB-load-misses
2.792e+10 +6.8% 2.983e+10 perf-stat.ps.dTLB-loads
86358963 ± 2% +17.0% 1.01e+08 perf-stat.ps.dTLB-store-misses
9.868e+09 +12.1% 1.106e+10 perf-stat.ps.dTLB-stores
32193889 +12.3% 36145969 perf-stat.ps.iTLB-load-misses
1.025e+11 +6.3% 1.089e+11 perf-stat.ps.instructions
10680325 +12.3% 11997386 perf-stat.ps.minor-faults
12145565 -30.4% 8458562 perf-stat.ps.node-load-misses
2.302e+08 +12.7% 2.593e+08 perf-stat.ps.node-loads
8004247 +10.6% 8849146 perf-stat.ps.node-store-misses
46470686 +16.7% 54225133 perf-stat.ps.node-stores
10680326 +12.3% 11997387 perf-stat.ps.page-faults
3.112e+13 +6.3% 3.308e+13 perf-stat.total.instructions
41.00 -6.5 34.52 ± 5% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
41.07 -6.5 34.60 ± 5% perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
38.25 -6.1 32.14 ± 6% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault.do_fault
38.38 -6.1 32.28 ± 6% perf-profile.calltrace.cycles-pp.lru_cache_add.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
36.53 -6.0 30.50 ± 6% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte
36.57 -6.0 30.54 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault
4.88 -2.8 2.11 ± 3% perf-profile.calltrace.cycles-pp.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
64.74 -1.7 63.03 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
65.63 -1.6 63.99 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
65.78 -1.6 64.15 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
72.33 -1.5 70.84 perf-profile.calltrace.cycles-pp.testcase
71.88 -1.3 70.54 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
63.02 -1.3 61.72 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
63.45 -1.3 62.17 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
1.83 -1.2 0.58 perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
1.17 ± 2% -0.4 0.73 perf-profile.calltrace.cycles-pp.page_add_new_anon_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
0.75 ± 4% -0.2 0.58 ± 3% perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
2.23 -0.1 2.15 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region
2.22 -0.1 2.15 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.tlb_finish_mmu
1.21 -0.1 1.14 ± 2% perf-profile.calltrace.cycles-pp.__pagevec_lru_add_fn.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault
0.77 +0.0 0.80 perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
0.81 +0.1 0.87 ± 3% perf-profile.calltrace.cycles-pp.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
1.28 +0.1 1.37 ± 2% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
1.10 +0.1 1.19 ± 2% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
1.45 +0.1 1.56 ± 2% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
1.64 +0.1 1.77 ± 2% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.73 +0.1 1.88 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.70 +0.1 2.85 perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
0.78 +0.2 0.94 ± 4% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask
2.74 +0.2 2.90 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
2.74 +0.2 2.90 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
0.38 ± 57% +0.2 0.54 ± 4% perf-profile.calltrace.cycles-pp.page_counter_try_charge.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault
11.34 +0.6 11.92 perf-profile.calltrace.cycles-pp.copy_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.76 ± 3% +1.1 1.90 ± 11% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_pcppages_bulk.free_unref_page_list.release_pages
0.77 ± 4% +1.1 1.91 ± 11% perf-profile.calltrace.cycles-pp._raw_spin_lock.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu
21.43 +1.2 22.63 perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas
23.92 +1.2 25.14 perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
23.93 +1.2 25.15 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
23.93 +1.2 25.15 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
21.76 +1.2 22.99 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
1.88 ± 2% +1.3 3.16 ± 7% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range
2.15 +1.3 3.46 ± 7% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range
26.70 +1.4 28.07 perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
26.70 +1.4 28.07 perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
26.72 +1.4 28.09 perf-profile.calltrace.cycles-pp.__munmap
26.70 +1.4 28.07 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
26.70 +1.4 28.07 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
26.72 +1.4 28.09 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
26.70 +1.4 28.08 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
1.21 ± 9% +6.8 8.03 ± 18% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.rmqueue_bulk.rmqueue.get_page_from_freelist
1.21 ± 8% +6.8 8.04 ± 18% perf-profile.calltrace.cycles-pp._raw_spin_lock.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask
2.13 ± 4% +7.0 9.12 ± 16% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma
2.58 ± 3% +7.1 9.65 ± 16% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault
2.77 ± 3% +7.1 9.84 ± 15% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault
3.11 ± 3% +7.1 10.21 ± 15% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault
3.45 ± 3% +7.1 10.55 ± 14% perf-profile.calltrace.cycles-pp.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
41.03 -6.5 34.56 ± 5% perf-profile.children.cycles-pp.alloc_set_pte
41.08 -6.5 34.61 ± 5% perf-profile.children.cycles-pp.finish_fault
56.64 -6.1 50.52 ± 3% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
38.30 -6.1 32.18 ± 6% perf-profile.children.cycles-pp.pagevec_lru_move_fn
38.39 -6.1 32.30 ± 6% perf-profile.children.cycles-pp.lru_cache_add
4.90 -2.8 2.12 ± 3% perf-profile.children.cycles-pp.mem_cgroup_charge
64.76 -1.7 63.05 perf-profile.children.cycles-pp.handle_mm_fault
65.67 -1.6 64.03 perf-profile.children.cycles-pp.do_user_addr_fault
65.80 -1.6 64.17 perf-profile.children.cycles-pp.exc_page_fault
70.09 -1.5 68.63 perf-profile.children.cycles-pp.asm_exc_page_fault
73.11 -1.4 71.74 perf-profile.children.cycles-pp.testcase
63.05 -1.3 61.74 perf-profile.children.cycles-pp.do_fault
63.47 -1.3 62.19 perf-profile.children.cycles-pp.__handle_mm_fault
1.84 -1.2 0.59 perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
1.18 ± 4% -0.8 0.37 ± 4% perf-profile.children.cycles-pp.__count_memcg_events
1.53 ± 2% -0.8 0.72 ± 5% perf-profile.children.cycles-pp.__mod_memcg_state
2.07 ± 2% -0.7 1.33 ± 4% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.17 ± 2% -0.4 0.73 ± 2% perf-profile.children.cycles-pp.page_add_new_anon_rmap
0.56 ± 4% -0.3 0.23 ± 4% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
0.75 ± 4% -0.2 0.59 ± 3% perf-profile.children.cycles-pp.page_remove_rmap
1.22 -0.1 1.16 ± 2% perf-profile.children.cycles-pp.__pagevec_lru_add_fn
0.13 ± 3% -0.1 0.07 ± 5% perf-profile.children.cycles-pp.uncharge_page
0.29 -0.1 0.23 perf-profile.children.cycles-pp.mem_cgroup_uncharge_list
0.30 ± 3% -0.0 0.27 ± 4% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.05 +0.0 0.06 perf-profile.children.cycles-pp.rcu_all_qs
0.05 ± 8% +0.0 0.07 ± 7% perf-profile.children.cycles-pp.perf_exclude_event
0.05 +0.0 0.06 ± 6% perf-profile.children.cycles-pp.pte_alloc_one
0.14 ± 3% +0.0 0.16 ± 2% perf-profile.children.cycles-pp.scheduler_tick
0.07 ± 5% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.mem_cgroup_update_lru_size
0.19 ± 3% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.update_process_times
0.17 ± 2% +0.0 0.19 ± 3% perf-profile.children.cycles-pp.__mod_zone_page_state
0.12 ± 5% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.cgroup_throttle_swaprate
0.21 ± 4% +0.0 0.23 ± 2% perf-profile.children.cycles-pp.tick_sched_timer
0.19 ± 2% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.tick_sched_handle
0.27 +0.0 0.29 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.28 ± 2% +0.0 0.30 perf-profile.children.cycles-pp.__mod_node_page_state
0.78 +0.0 0.80 perf-profile.children.cycles-pp.__irqentry_text_end
0.36 ± 2% +0.0 0.39 perf-profile.children.cycles-pp.__mod_lruvec_state
0.35 ± 2% +0.0 0.38 ± 4% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.42 +0.0 0.46 perf-profile.children.cycles-pp.___perf_sw_event
0.00 +0.1 0.05 perf-profile.children.cycles-pp.__tlb_remove_page_size
0.82 +0.1 0.87 ± 4% perf-profile.children.cycles-pp.try_charge
0.61 +0.1 0.68 perf-profile.children.cycles-pp.__perf_sw_event
2.40 +0.1 2.46 perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
1.11 +0.1 1.19 ± 2% perf-profile.children.cycles-pp.find_get_entry
1.30 +0.1 1.39 ± 2% perf-profile.children.cycles-pp.find_lock_entry
1.47 +0.1 1.58 ± 2% perf-profile.children.cycles-pp.shmem_getpage_gfp
1.64 +0.1 1.77 perf-profile.children.cycles-pp.shmem_fault
1.74 +0.1 1.88 perf-profile.children.cycles-pp.__do_fault
2.75 +0.2 2.91 perf-profile.children.cycles-pp.tlb_finish_mmu
1.54 ± 2% +0.2 1.76 ± 2% perf-profile.children.cycles-pp.__list_del_entry_valid
11.37 +0.6 11.96 perf-profile.children.cycles-pp.copy_page
23.93 +1.2 25.15 perf-profile.children.cycles-pp.unmap_vmas
23.93 +1.2 25.15 perf-profile.children.cycles-pp.unmap_page_range
23.93 +1.2 25.15 perf-profile.children.cycles-pp.zap_pte_range
24.27 +1.4 25.63 perf-profile.children.cycles-pp.release_pages
26.71 +1.4 28.08 perf-profile.children.cycles-pp.__do_munmap
26.71 +1.4 28.08 perf-profile.children.cycles-pp.unmap_region
26.80 +1.4 28.17 perf-profile.children.cycles-pp.do_syscall_64
26.82 +1.4 28.19 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
26.72 +1.4 28.09 perf-profile.children.cycles-pp.__munmap
26.70 +1.4 28.08 perf-profile.children.cycles-pp.__vm_munmap
26.70 +1.4 28.08 perf-profile.children.cycles-pp.__x64_sys_munmap
24.51 +1.4 25.90 perf-profile.children.cycles-pp.tlb_flush_mmu
2.14 +1.5 3.65 ± 7% perf-profile.children.cycles-pp.free_pcppages_bulk
2.45 +1.6 4.01 ± 7% perf-profile.children.cycles-pp.free_unref_page_list
58.69 +2.1 60.75 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
2.14 ± 4% +7.0 9.16 ± 16% perf-profile.children.cycles-pp.rmqueue_bulk
2.61 ± 4% +7.1 9.70 ± 16% perf-profile.children.cycles-pp.rmqueue
2.82 ± 3% +7.1 9.92 ± 15% perf-profile.children.cycles-pp.get_page_from_freelist
3.46 ± 3% +7.1 10.57 ± 14% perf-profile.children.cycles-pp.alloc_pages_vma
3.19 ± 3% +7.1 10.30 ± 15% perf-profile.children.cycles-pp.__alloc_pages_nodemask
3.29 ± 4% +8.2 11.52 ± 13% perf-profile.children.cycles-pp._raw_spin_lock
1.82 -1.2 0.57 perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
1.60 ± 3% -1.2 0.37 ± 4% perf-profile.self.cycles-pp.mem_cgroup_charge
1.18 ± 3% -0.8 0.36 ± 4% perf-profile.self.cycles-pp.__count_memcg_events
1.52 ± 3% -0.8 0.71 ± 5% perf-profile.self.cycles-pp.__mod_memcg_state
0.13 ± 3% -0.1 0.07 ± 5% perf-profile.self.cycles-pp.uncharge_page
0.10 +0.0 0.11 ± 3% perf-profile.self.cycles-pp.perf_swevent_get_recursion_context
0.07 ± 6% +0.0 0.08 perf-profile.self.cycles-pp.mem_cgroup_update_lru_size
0.25 +0.0 0.26 perf-profile.self.cycles-pp.rmqueue
0.12 ± 4% +0.0 0.14 perf-profile.self.cycles-pp.lru_cache_add
0.18 ± 2% +0.0 0.20 ± 2% perf-profile.self.cycles-pp.shmem_fault
0.16 +0.0 0.18 ± 2% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.10 +0.0 0.12 ± 5% perf-profile.self.cycles-pp.free_unref_page_list
0.28 +0.0 0.30 ± 3% perf-profile.self.cycles-pp.handle_mm_fault
0.16 +0.0 0.18 ± 2% perf-profile.self.cycles-pp.page_remove_rmap
0.41 +0.0 0.44 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.30 ± 2% +0.0 0.33 ± 3% perf-profile.self.cycles-pp.try_charge
0.78 +0.0 0.80 perf-profile.self.cycles-pp.__irqentry_text_end
0.26 ± 3% +0.0 0.29 perf-profile.self.cycles-pp.__mod_node_page_state
0.31 ± 2% +0.0 0.34 perf-profile.self.cycles-pp.___perf_sw_event
0.27 ± 3% +0.0 0.30 ± 3% perf-profile.self.cycles-pp.alloc_set_pte
0.34 ± 2% +0.0 0.38 ± 3% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.38 +0.0 0.41 ± 2% perf-profile.self.cycles-pp.release_pages
0.47 +0.0 0.52 ± 4% perf-profile.self.cycles-pp.page_counter_try_charge
0.74 +0.0 0.78 perf-profile.self.cycles-pp.find_get_entry
0.42 +0.0 0.47 ± 2% perf-profile.self.cycles-pp.__pagevec_lru_add_fn
2.30 +0.1 2.37 perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
0.31 ± 22% +0.1 0.38 ± 21% perf-profile.self.cycles-pp.do_fault
0.54 +0.1 0.62 ± 3% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
2.10 +0.1 2.23 perf-profile.self.cycles-pp.testcase
1.02 +0.1 1.14 ± 3% perf-profile.self.cycles-pp.free_pcppages_bulk
1.30 +0.1 1.43 ± 3% perf-profile.self.cycles-pp.zap_pte_range
1.53 ± 2% +0.2 1.75 ± 2% perf-profile.self.cycles-pp.__list_del_entry_valid
11.31 +0.6 11.90 perf-profile.self.cycles-pp.copy_page
58.69 +2.1 60.75 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.10.0-rc5-00023-g4df910620beb" of type "text/plain" (170131 bytes)
View attachment "job-script" of type "text/plain" (7861 bytes)
View attachment "job.yaml" of type "text/plain" (5201 bytes)
View attachment "reproduce" of type "text/plain" (343 bytes)