Message-ID: <20201102091543.GM31092@shao2-debian>
Date: Mon, 2 Nov 2020 17:15:43 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Waiman Long <longman@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Shakeel Butt <shakeelb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Chris Down <chris@...isdown.name>,
Johannes Weiner <hannes@...xchg.org>,
Roman Gushchin <guro@...com>, Tejun Heo <tj@...nel.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Yafang Shao <laoar.shao@...il.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops -22.7%
regression
Greetings,
FYI, we noticed a -22.7% regression of will-it-scale.per_process_ops due to commit:
commit: bd0b230fe14554bfffbae54e19038716f96f5a41 ("mm/memcg: unify swap and memsw page counters")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
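For context, the commit collapses two per-memcg page counters into one:
cgroup v1 accounts memory+swap in a "memsw" counter while cgroup v2 uses
a separate "swap" counter, and since the two controller versions are
never active at the same time a single counter can serve both roles. A
rough standalone sketch of the structural idea (field set and types
simplified for illustration; see the commit itself for the real change):

/* Standalone sketch; kernel types approximated so this compiles. */
typedef long atomic_long_t;

struct page_counter {
	atomic_long_t usage;
	unsigned long max;
	struct page_counter *parent;
	/* ... min/low protections, watermark, failcnt ... */
};

/* Before bd0b230fe1 (simplified): */
struct mem_cgroup_before {
	struct page_counter memory;	/* v1 and v2 */
	struct page_counter swap;	/* v2 only */
	struct page_counter memsw;	/* v1 only: memory + swap */
	struct page_counter kmem;	/* v1 only */
	struct page_counter tcpmem;	/* v1 only */
};

/* After (simplified): "memsw" is gone and "swap" covers both roles. */
struct mem_cgroup_after {
	struct page_counter memory;	/* v1 and v2 */
	struct page_counter swap;	/* v2: swap only; v1: memory + swap */
	struct page_counter kmem;	/* v1 only */
	struct page_counter tcpmem;	/* v1 only */
};

Functionally this is meant to be a no-op, but dropping a field shifts
the offsets of every field declared after it, which can change how hot
data lands on cache lines. Whether that is the mechanism here is not
established by this report; the profile below only shows the extra
cycles concentrated in mem_cgroup_charge() and get_mem_cgroup_from_mm().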
in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:
nr_task: 50%
mode: process
test: page_fault2
cpufreq_governor: performance
ucode: 0x16
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a threads-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
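For readers who do not want to dig into the repo: each page_fault2 task
repeatedly maps a private copy of a temporary file, writes once to every
page so that each write takes a copy-on-write fault, then unmaps. Below
is a minimal self-contained approximation of one task's loop (the region
size, path, and iteration count are illustrative, not the harness's
exact values; the real test loops until stopped and reports faulted
pages per second, and the shmem_* entries in the profile suggest /tmp
here is tmpfs):

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

#define MEMSIZE (128UL * 1024 * 1024)	/* illustrative region size */

int main(void)
{
	char path[] = "/tmp/willitscale.XXXXXX";
	long pagesize = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	int iter;

	assert(fd >= 0);
	unlink(path);			/* file lives on only via fd */
	assert(ftruncate(fd, MEMSIZE) == 0);

	for (iter = 0; iter < 100; iter++) {	/* real test loops forever */
		char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE, fd, 0);
		unsigned long off;

		assert(c != MAP_FAILED);
		/*
		 * Each first write takes a COW fault: the kernel pulls
		 * the file page from tmpfs (shmem_fault), allocates a
		 * private copy (alloc_pages_vma), charges it to the
		 * memcg (mem_cgroup_charge), and maps it (finish_fault).
		 * munmap then tears it all down (zap_pte_range,
		 * release_pages).
		 */
		for (off = 0; off < MEMSIZE; off += pagesize)
			c[off] = 0;
		munmap(c, MEMSIZE);
	}
	return 0;
}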
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/page_fault2/will-it-scale/0x16
commit:
8d387a5f17 ("mm/memcg: simplify mem_cgroup_get_max()")
bd0b230fe1 ("mm/memcg: unify swap and memsw page counters")
8d387a5f172f26ff bd0b230fe14554bfffbae54e190
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
5:5 -23% 4:5 perf-profile.calltrace.cycles-pp.error_entry.testcase
5:5 -23% 4:5 perf-profile.children.cycles-pp.error_entry
5:5 -19% 4:5 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
187666 -22.7% 145157 will-it-scale.per_process_ops
13511995 -22.7% 10451324 will-it-scale.workload
1895023 ±196% -98.7% 24188 ± 4% cpuidle.POLL.time
280275 ±192% -97.3% 7540 ± 3% cpuidle.POLL.usage
34107 ± 26% -27.5% 24740 ± 10% numa-meminfo.node1.KReclaimable
4234 ± 11% -15.6% 3573 ± 2% numa-meminfo.node1.PageTables
34107 ± 26% -27.5% 24740 ± 10% numa-meminfo.node1.SReclaimable
1195173 +1.2% 1209077 proc-vmstat.nr_anon_pages
4325 -2.3% 4223 proc-vmstat.nr_page_table_pages
4.07e+09 -22.6% 3.149e+09 proc-vmstat.numa_hit
4.069e+09 -22.6% 3.149e+09 proc-vmstat.numa_local
4.072e+09 -22.6% 3.151e+09 proc-vmstat.pgalloc_normal
4.059e+09 -22.6% 3.141e+09 proc-vmstat.pgfault
4.069e+09 -22.6% 3.147e+09 proc-vmstat.pgfree
1.395e+09 -9.9e+07 1.296e+09 ± 5% syscalls.sys_mmap.noise.75%
2.512e+09 ± 17% +1.2e+09 3.667e+09 ± 13% syscalls.sys_write.noise.100%
2.526e+09 ± 17% +1.2e+09 3.684e+09 ± 13% syscalls.sys_write.noise.2%
2.523e+09 ± 17% +1.2e+09 3.681e+09 ± 13% syscalls.sys_write.noise.25%
2.525e+09 ± 17% +1.2e+09 3.684e+09 ± 13% syscalls.sys_write.noise.5%
2.52e+09 ± 17% +1.2e+09 3.677e+09 ± 13% syscalls.sys_write.noise.50%
2.516e+09 ± 17% +1.2e+09 3.672e+09 ± 13% syscalls.sys_write.noise.75%
1.029e+09 -21.7% 8.052e+08 ± 2% numa-numastat.node0.local_node
1.029e+09 -21.7% 8.053e+08 ± 2% numa-numastat.node0.numa_hit
1.02e+09 -23.0% 7.853e+08 numa-numastat.node1.local_node
1.02e+09 -23.0% 7.853e+08 numa-numastat.node1.numa_hit
1.013e+09 -22.8% 7.817e+08 numa-numastat.node2.local_node
1.013e+09 -22.8% 7.818e+08 numa-numastat.node2.numa_hit
1.011e+09 -23.1% 7.771e+08 ± 2% numa-numastat.node3.local_node
1.011e+09 -23.1% 7.772e+08 ± 2% numa-numastat.node3.numa_hit
9210 ± 8% +12.5% 10362 ± 8% softirqs.CPU13.RCU
20302 ± 8% +31.3% 26656 ± 10% softirqs.CPU142.SCHED
16688 ± 16% +52.8% 25498 ± 29% softirqs.CPU15.SCHED
17137 ± 23% +62.3% 27811 ± 12% softirqs.CPU20.SCHED
23421 ± 21% -40.4% 13969 ± 33% softirqs.CPU36.SCHED
23782 ± 7% -29.0% 16876 ± 17% softirqs.CPU70.SCHED
27401 ± 9% -34.4% 17978 ± 42% softirqs.CPU87.SCHED
25692 ± 13% -44.6% 14223 ± 20% softirqs.CPU92.SCHED
5.114e+08 -21.6% 4.012e+08 ± 2% numa-vmstat.node0.numa_hit
5.114e+08 -21.6% 4.012e+08 ± 2% numa-vmstat.node0.numa_local
1058 ± 11% -15.6% 893.00 ± 2% numa-vmstat.node1.nr_page_table_pages
8526 ± 26% -27.5% 6184 ± 10% numa-vmstat.node1.nr_slab_reclaimable
5.074e+08 -22.9% 3.91e+08 numa-vmstat.node1.numa_hit
5.073e+08 -22.9% 3.909e+08 numa-vmstat.node1.numa_local
5.04e+08 -22.7% 3.895e+08 numa-vmstat.node2.numa_hit
5.039e+08 -22.7% 3.894e+08 numa-vmstat.node2.numa_local
5.029e+08 -23.0% 3.87e+08 ± 2% numa-vmstat.node3.numa_hit
5.028e+08 -23.1% 3.869e+08 ± 2% numa-vmstat.node3.numa_local
6778 ± 54% -98.0% 135.96 ± 62% sched_debug.cfs_rq:/.exec_clock.min
29780 ± 6% +52.2% 45320 ± 6% sched_debug.cfs_rq:/.exec_clock.stddev
528699 ± 51% -94.3% 30214 ± 19% sched_debug.cfs_rq:/.min_vruntime.min
2205697 ± 6% +52.3% 3359526 ± 6% sched_debug.cfs_rq:/.min_vruntime.stddev
17.95 ± 5% -37.5% 11.22 ± 2% sched_debug.cfs_rq:/.nr_spread_over.avg
59.80 ± 28% -40.5% 35.60 ± 24% sched_debug.cfs_rq:/.nr_spread_over.max
1.80 ± 38% -100.0% 0.00 sched_debug.cfs_rq:/.nr_spread_over.min
8.78 ± 5% -16.2% 7.36 ± 7% sched_debug.cfs_rq:/.nr_spread_over.stddev
2205731 ± 6% +52.3% 3359553 ± 6% sched_debug.cfs_rq:/.spread0.stddev
23138 ± 2% +39.2% 32199 ± 18% sched_debug.cpu.nr_switches.max
3068 ± 3% +19.9% 3680 ± 10% sched_debug.cpu.nr_switches.stddev
19891 ± 4% +47.3% 29292 ± 20% sched_debug.cpu.sched_count.max
694.57 ± 4% -12.4% 608.67 ± 3% sched_debug.cpu.sched_count.min
2602 ± 4% +28.2% 3335 ± 10% sched_debug.cpu.sched_count.stddev
9769 ± 4% +48.7% 14531 ± 21% sched_debug.cpu.sched_goidle.max
25.50 ± 22% -51.2% 12.43 ± 10% sched_debug.cpu.sched_goidle.min
1315 ± 4% +29.2% 1699 ± 9% sched_debug.cpu.sched_goidle.stddev
259.50 ± 2% -16.7% 216.20 ± 5% sched_debug.cpu.ttwu_count.min
233.47 -12.9% 203.27 ± 3% sched_debug.cpu.ttwu_local.min
136.40 ± 22% -41.3% 80.00 ± 35% interrupts.CPU1.RES:Rescheduling_interrupts
445.80 ± 49% +54.3% 688.00 interrupts.CPU116.CAL:Function_call_interrupts
2384 ± 39% +77.2% 4225 ± 26% interrupts.CPU116.NMI:Non-maskable_interrupts
2384 ± 39% +77.2% 4225 ± 26% interrupts.CPU116.PMI:Performance_monitoring_interrupts
3140 ± 25% +82.5% 5732 ± 36% interrupts.CPU12.NMI:Non-maskable_interrupts
3140 ± 25% +82.5% 5732 ± 36% interrupts.CPU12.PMI:Performance_monitoring_interrupts
6641 ± 17% -48.0% 3452 ± 40% interrupts.CPU128.NMI:Non-maskable_interrupts
6641 ± 17% -48.0% 3452 ± 40% interrupts.CPU128.PMI:Performance_monitoring_interrupts
6211 ± 25% -41.3% 3643 ± 40% interrupts.CPU14.NMI:Non-maskable_interrupts
6211 ± 25% -41.3% 3643 ± 40% interrupts.CPU14.PMI:Performance_monitoring_interrupts
156.20 ± 19% -60.4% 61.80 ± 45% interrupts.CPU15.RES:Rescheduling_interrupts
450.60 ± 61% +314.9% 1869 ±110% interrupts.CPU17.CAL:Function_call_interrupts
401.20 ± 62% +105.1% 823.00 ± 28% interrupts.CPU2.CAL:Function_call_interrupts
3781 ± 29% +91.4% 7239 ± 22% interrupts.CPU23.NMI:Non-maskable_interrupts
3781 ± 29% +91.4% 7239 ± 22% interrupts.CPU23.PMI:Performance_monitoring_interrupts
131.20 ± 16% -38.3% 81.00 ± 37% interrupts.CPU23.RES:Rescheduling_interrupts
6565 ± 27% -47.8% 3430 ± 57% interrupts.CPU30.NMI:Non-maskable_interrupts
6565 ± 27% -47.8% 3430 ± 57% interrupts.CPU30.PMI:Performance_monitoring_interrupts
2524 ± 33% +53.1% 3866 ± 18% interrupts.CPU39.NMI:Non-maskable_interrupts
2524 ± 33% +53.1% 3866 ± 18% interrupts.CPU39.PMI:Performance_monitoring_interrupts
5453 ± 26% -34.5% 3569 ± 27% interrupts.CPU43.NMI:Non-maskable_interrupts
5453 ± 26% -34.5% 3569 ± 27% interrupts.CPU43.PMI:Performance_monitoring_interrupts
2524 ± 36% +65.3% 4172 ± 28% interrupts.CPU48.NMI:Non-maskable_interrupts
2524 ± 36% +65.3% 4172 ± 28% interrupts.CPU48.PMI:Performance_monitoring_interrupts
487.40 ± 58% +124.0% 1092 ± 55% interrupts.CPU79.CAL:Function_call_interrupts
585.20 ± 40% +61.9% 947.20 ± 23% interrupts.CPU80.CAL:Function_call_interrupts
487.60 ± 54% +67.9% 818.80 ± 16% interrupts.CPU81.CAL:Function_call_interrupts
15.97 +11.6% 17.83 perf-stat.i.MPKI
1.333e+10 -27.3% 9.69e+09 perf-stat.i.branch-instructions
0.48 +0.1 0.53 perf-stat.i.branch-miss-rate%
62963150 -20.1% 50310834 perf-stat.i.branch-misses
37.28 -1.9 35.40 perf-stat.i.cache-miss-rate%
4.08e+08 -21.8% 3.189e+08 perf-stat.i.cache-misses
1.092e+09 -17.7% 8.983e+08 perf-stat.i.cache-references
3.06 +35.8% 4.16 perf-stat.i.cpi
143.56 -1.4% 141.50 perf-stat.i.cpu-migrations
523.77 +28.4% 672.47 perf-stat.i.cycles-between-cache-misses
2.068e+10 -25.8% 1.534e+10 perf-stat.i.dTLB-loads
95643337 -22.7% 73905709 perf-stat.i.dTLB-store-misses
1.225e+10 -22.1% 9.544e+09 perf-stat.i.dTLB-stores
94.47 -2.1 92.36 perf-stat.i.iTLB-load-miss-rate%
40594103 -23.3% 31146350 perf-stat.i.iTLB-load-misses
2304247 ± 11% +9.1% 2513651 perf-stat.i.iTLB-loads
6.823e+10 -26.2% 5.033e+10 perf-stat.i.instructions
1692 -3.6% 1632 perf-stat.i.instructions-per-iTLB-miss
0.33 -26.3% 0.24 perf-stat.i.ipc
332.55 -25.1% 249.20 perf-stat.i.metric.M/sec
13449087 -22.7% 10398888 perf-stat.i.minor-faults
4.30 +1.7 6.01 perf-stat.i.node-load-miss-rate%
13864893 +11.3% 15438022 perf-stat.i.node-load-misses
3.294e+08 -22.9% 2.539e+08 perf-stat.i.node-loads
18.37 +0.4 18.79 perf-stat.i.node-store-miss-rate%
11546036 -22.7% 8923297 perf-stat.i.node-store-misses
51372725 -24.8% 38626751 perf-stat.i.node-stores
13449089 -22.7% 10398890 perf-stat.i.page-faults
16.00 +11.5% 17.85 perf-stat.overall.MPKI
0.47 +0.0 0.52 perf-stat.overall.branch-miss-rate%
37.36 -1.9 35.50 perf-stat.overall.cache-miss-rate%
3.06 +35.9% 4.16 perf-stat.overall.cpi
512.20 +28.3% 656.96 perf-stat.overall.cycles-between-cache-misses
94.64 -2.1 92.54 perf-stat.overall.iTLB-load-miss-rate%
1680 -3.9% 1616 perf-stat.overall.instructions-per-iTLB-miss
0.33 -26.4% 0.24 perf-stat.overall.ipc
4.04 +1.7 5.73 perf-stat.overall.node-load-miss-rate%
18.35 +0.4 18.77 perf-stat.overall.node-store-miss-rate%
1519587 -4.6% 1449452 perf-stat.overall.path-length
1.329e+10 -27.3% 9.658e+09 perf-stat.ps.branch-instructions
62761890 -20.1% 50132370 perf-stat.ps.branch-misses
4.065e+08 -21.8% 3.178e+08 perf-stat.ps.cache-misses
1.088e+09 -17.7% 8.953e+08 perf-stat.ps.cache-references
142.33 -1.6% 140.11 perf-stat.ps.cpu-migrations
2.061e+10 -25.8% 1.529e+10 perf-stat.ps.dTLB-loads
95301115 -22.7% 73646080 perf-stat.ps.dTLB-store-misses
1.221e+10 -22.1% 9.511e+09 perf-stat.ps.dTLB-stores
40451925 -23.3% 31039602 perf-stat.ps.iTLB-load-misses
2295342 ± 11% +9.1% 2503862 perf-stat.ps.iTLB-loads
6.799e+10 -26.2% 5.016e+10 perf-stat.ps.instructions
13401817 -22.7% 10363171 perf-stat.ps.minor-faults
13816941 +11.3% 15384985 perf-stat.ps.node-load-misses
3.282e+08 -22.9% 2.53e+08 perf-stat.ps.node-loads
11506624 -22.7% 8893312 perf-stat.ps.node-store-misses
51195452 -24.8% 38495647 perf-stat.ps.node-stores
13401819 -22.7% 10363173 perf-stat.ps.page-faults
2.053e+13 -26.2% 1.515e+13 perf-stat.total.instructions
11.81 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.__munmap
11.81 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
11.80 ± 11% -3.5 8.25 ± 10% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
12.30 ± 10% -3.4 8.86 ± 10% perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
12.25 ± 10% -3.4 8.83 ± 10% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
8.17 ± 11% -3.3 4.90 ± 10% perf-profile.calltrace.cycles-pp.lru_cache_add.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
8.04 ± 11% -3.2 4.79 ± 10% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault.do_fault
10.87 ± 11% -3.2 7.63 ± 10% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
10.87 ± 11% -3.2 7.63 ± 10% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
10.86 ± 11% -3.2 7.62 ± 10% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
6.14 ± 11% -3.0 3.18 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte.finish_fault
6.08 ± 11% -2.9 3.14 ± 9% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.alloc_set_pte
7.57 ± 12% -2.7 4.88 ± 9% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
7.14 ± 13% -2.6 4.56 ± 9% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas
3.95 ± 16% -2.1 1.86 ± 7% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range
3.93 ± 17% -2.1 1.85 ± 7% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.zap_pte_range
3.25 ± 10% -0.8 2.44 ± 10% perf-profile.calltrace.cycles-pp.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.86 ± 10% -0.7 2.13 ± 9% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault.handle_mm_fault
2.62 ± 10% -0.7 1.93 ± 10% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault.__handle_mm_fault
2.36 ± 11% -0.7 1.71 ± 9% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_fault
1.83 ± 11% -0.6 1.23 ± 11% perf-profile.calltrace.cycles-pp.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
1.89 ± 11% -0.6 1.32 ± 9% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma
1.76 ± 9% -0.5 1.30 ± 10% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range.unmap_page_range
1.30 ± 12% -0.4 0.86 ± 11% perf-profile.calltrace.cycles-pp.page_counter_try_charge.try_charge.mem_cgroup_charge.do_fault.__handle_mm_fault
1.53 ± 9% -0.4 1.11 ± 10% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu.zap_pte_range
0.92 ± 13% -0.3 0.62 ± 10% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
0.92 ± 13% -0.3 0.62 ± 10% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
0.89 ± 13% -0.3 0.60 ± 10% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
1.02 ± 11% -0.2 0.78 ± 9% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask
1.03 ± 9% -0.2 0.79 ± 10% perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
0.98 ± 12% +0.3 1.27 ± 13% perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.page_add_new_anon_rmap.alloc_set_pte.finish_fault.do_fault
1.42 ± 9% +0.3 1.75 ± 9% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.70 ± 13% +0.4 1.10 ± 14% perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_memcg_lruvec_state.page_add_new_anon_rmap.alloc_set_pte.finish_fault
1.23 ± 9% +0.4 1.64 ± 10% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
1.05 ± 9% +0.5 1.51 ± 9% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
0.71 ± 12% +0.5 1.18 ± 13% perf-profile.calltrace.cycles-pp.__count_memcg_events.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
0.83 ± 9% +0.5 1.34 ± 10% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
0.37 ± 81% +0.6 0.93 ± 15% perf-profile.calltrace.cycles-pp.mem_cgroup_charge_statistics.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
0.36 ± 81% +0.6 0.91 ± 15% perf-profile.calltrace.cycles-pp.__count_memcg_events.mem_cgroup_charge_statistics.mem_cgroup_charge.do_fault.__handle_mm_fault
2.69 ± 9% +4.3 6.95 ± 14% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault
7.04 ± 9% +10.2 17.26 ± 13% perf-profile.calltrace.cycles-pp.mem_cgroup_charge.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
11.69 ± 13% -5.8 5.91 ± 8% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
10.58 ± 13% -5.3 5.29 ± 8% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
11.81 ± 11% -3.5 8.26 ± 10% perf-profile.children.cycles-pp.__munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.children.cycles-pp.__do_munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.children.cycles-pp.__x64_sys_munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.children.cycles-pp.__vm_munmap
11.80 ± 11% -3.5 8.26 ± 10% perf-profile.children.cycles-pp.unmap_region
11.88 ± 11% -3.5 8.34 ± 10% perf-profile.children.cycles-pp.do_syscall_64
11.89 ± 11% -3.5 8.35 ± 10% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
12.30 ± 10% -3.4 8.86 ± 10% perf-profile.children.cycles-pp.finish_fault
12.26 ± 10% -3.4 8.83 ± 10% perf-profile.children.cycles-pp.alloc_set_pte
8.18 ± 11% -3.3 4.90 ± 10% perf-profile.children.cycles-pp.lru_cache_add
8.05 ± 11% -3.2 4.80 ± 9% perf-profile.children.cycles-pp.pagevec_lru_move_fn
10.87 ± 11% -3.2 7.63 ± 10% perf-profile.children.cycles-pp.unmap_vmas
10.87 ± 11% -3.2 7.63 ± 10% perf-profile.children.cycles-pp.unmap_page_range
10.87 ± 11% -3.2 7.63 ± 10% perf-profile.children.cycles-pp.zap_pte_range
8.49 ± 12% -3.0 5.51 ± 9% perf-profile.children.cycles-pp.tlb_flush_mmu
8.20 ± 13% -2.9 5.29 ± 9% perf-profile.children.cycles-pp.release_pages
3.91 ± 10% -0.9 3.06 ± 9% perf-profile.children.cycles-pp._raw_spin_lock
3.27 ± 10% -0.8 2.45 ± 10% perf-profile.children.cycles-pp.alloc_pages_vma
2.95 ± 10% -0.7 2.20 ± 10% perf-profile.children.cycles-pp.__alloc_pages_nodemask
2.67 ± 10% -0.7 1.98 ± 10% perf-profile.children.cycles-pp.get_page_from_freelist
2.39 ± 11% -0.7 1.73 ± 9% perf-profile.children.cycles-pp.rmqueue
1.84 ± 11% -0.6 1.23 ± 11% perf-profile.children.cycles-pp.try_charge
1.90 ± 11% -0.6 1.33 ± 9% perf-profile.children.cycles-pp.rmqueue_bulk
2.01 ± 9% -0.5 1.49 ± 10% perf-profile.children.cycles-pp.free_unref_page_list
1.74 ± 9% -0.5 1.27 ± 10% perf-profile.children.cycles-pp.free_pcppages_bulk
1.32 ± 12% -0.4 0.87 ± 12% perf-profile.children.cycles-pp.page_counter_try_charge
1.54 ± 10% -0.3 1.22 ± 9% perf-profile.children.cycles-pp.__list_del_entry_valid
0.93 ± 13% -0.3 0.62 ± 10% perf-profile.children.cycles-pp.tlb_finish_mmu
1.03 ± 9% -0.2 0.79 ± 10% perf-profile.children.cycles-pp.__irqentry_text_end
0.43 ± 7% -0.1 0.33 ± 10% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.41 ± 7% -0.1 0.33 ± 10% perf-profile.children.cycles-pp.__perf_sw_event
0.41 ± 9% -0.1 0.33 ± 8% perf-profile.children.cycles-pp.xas_load
0.31 ± 9% -0.1 0.23 ± 6% perf-profile.children.cycles-pp.__mod_lruvec_state
0.25 ± 8% -0.1 0.19 ± 10% perf-profile.children.cycles-pp.___perf_sw_event
0.22 ± 10% -0.1 0.16 ± 6% perf-profile.children.cycles-pp.__mod_node_page_state
0.09 ± 21% -0.0 0.05 ± 50% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.22 ± 8% -0.0 0.17 ± 9% perf-profile.children.cycles-pp.sync_regs
0.15 ± 10% -0.0 0.11 ± 8% perf-profile.children.cycles-pp.__list_add_valid
0.13 ± 12% -0.0 0.10 ± 8% perf-profile.children.cycles-pp.memcg_check_events
0.09 ± 8% -0.0 0.06 ± 12% perf-profile.children.cycles-pp.mem_cgroup_page_lruvec
0.10 ± 9% -0.0 0.08 ± 14% perf-profile.children.cycles-pp.up_read
0.11 ± 10% -0.0 0.09 ± 12% perf-profile.children.cycles-pp.shmem_get_policy
0.10 ± 5% -0.0 0.08 ± 13% perf-profile.children.cycles-pp.unlock_page
0.09 ± 11% -0.0 0.07 ± 14% perf-profile.children.cycles-pp._cond_resched
0.09 ± 5% -0.0 0.07 ± 17% perf-profile.children.cycles-pp.find_vma
0.25 ± 8% +0.2 0.42 ± 14% perf-profile.children.cycles-pp.mem_cgroup_uncharge_list
0.11 ± 8% +0.2 0.31 ± 15% perf-profile.children.cycles-pp.uncharge_page
0.15 ± 10% +0.2 0.38 ± 12% perf-profile.children.cycles-pp.lock_page_memcg
1.42 ± 9% +0.3 1.75 ± 9% perf-profile.children.cycles-pp.shmem_fault
0.57 ± 11% +0.4 0.93 ± 15% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
1.24 ± 9% +0.4 1.65 ± 9% perf-profile.children.cycles-pp.shmem_getpage_gfp
1.07 ± 9% +0.5 1.52 ± 10% perf-profile.children.cycles-pp.find_lock_entry
0.84 ± 9% +0.5 1.35 ± 10% perf-profile.children.cycles-pp.find_get_entry
1.70 ± 11% +0.7 2.39 ± 9% perf-profile.children.cycles-pp.native_irq_return_iret
1.28 ± 11% +0.8 2.11 ± 14% perf-profile.children.cycles-pp.__count_memcg_events
2.69 ± 9% +4.3 6.98 ± 14% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
7.06 ± 9% +10.2 17.28 ± 13% perf-profile.children.cycles-pp.mem_cgroup_charge
11.69 ± 13% -5.8 5.91 ± 8% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
2.90 ± 9% -0.6 2.29 ± 10% perf-profile.self.cycles-pp.testcase
2.03 ± 8% -0.5 1.55 ± 11% perf-profile.self.cycles-pp.zap_pte_range
1.07 ± 11% -0.4 0.63 ± 11% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
1.21 ± 12% -0.4 0.79 ± 11% perf-profile.self.cycles-pp.page_counter_try_charge
1.53 ± 10% -0.3 1.21 ± 9% perf-profile.self.cycles-pp.__list_del_entry_valid
1.03 ± 9% -0.2 0.79 ± 10% perf-profile.self.cycles-pp.__irqentry_text_end
1.05 ± 8% -0.2 0.82 ± 10% perf-profile.self.cycles-pp.free_pcppages_bulk
0.53 ± 10% -0.2 0.37 ± 12% perf-profile.self.cycles-pp.try_charge
0.43 ± 8% -0.1 0.33 ± 10% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.46 ± 10% -0.1 0.36 ± 10% perf-profile.self.cycles-pp.release_pages
0.46 ± 11% -0.1 0.36 ± 8% perf-profile.self.cycles-pp.__handle_mm_fault
0.36 ± 8% -0.1 0.29 ± 9% perf-profile.self.cycles-pp.xas_load
0.18 ± 10% -0.1 0.11 ± 11% perf-profile.self.cycles-pp.shmem_fault
0.27 ± 9% -0.1 0.20 ± 11% perf-profile.self.cycles-pp.page_remove_rmap
0.29 ± 9% -0.1 0.23 ± 9% perf-profile.self.cycles-pp.handle_mm_fault
0.14 ± 10% -0.1 0.09 ± 11% perf-profile.self.cycles-pp.page_add_new_anon_rmap
0.21 ± 8% -0.1 0.16 ± 4% perf-profile.self.cycles-pp.__mod_node_page_state
0.08 ± 17% -0.0 0.03 ± 82% perf-profile.self.cycles-pp.memcg_check_events
0.16 ± 9% -0.0 0.12 ± 9% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.19 ± 8% -0.0 0.15 ± 9% perf-profile.self.cycles-pp.sync_regs
0.19 ± 8% -0.0 0.14 ± 11% perf-profile.self.cycles-pp.___perf_sw_event
0.17 ± 9% -0.0 0.13 ± 10% perf-profile.self.cycles-pp.do_user_addr_fault
0.12 ± 9% -0.0 0.09 ± 10% perf-profile.self.cycles-pp.find_lock_entry
0.08 ± 11% -0.0 0.05 ± 7% perf-profile.self.cycles-pp.mem_cgroup_page_lruvec
0.09 ± 25% -0.0 0.06 ± 12% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.08 ± 6% -0.0 0.06 ± 12% perf-profile.self.cycles-pp.get_task_policy
0.11 ± 8% +0.2 0.31 ± 15% perf-profile.self.cycles-pp.uncharge_page
0.15 ± 10% +0.2 0.38 ± 13% perf-profile.self.cycles-pp.lock_page_memcg
0.43 ± 9% +0.6 1.01 ± 10% perf-profile.self.cycles-pp.find_get_entry
1.69 ± 11% +0.7 2.39 ± 9% perf-profile.self.cycles-pp.native_irq_return_iret
1.28 ± 11% +0.8 2.11 ± 14% perf-profile.self.cycles-pp.__count_memcg_events
2.68 ± 9% +4.3 6.93 ± 14% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
1.84 ± 10% +6.2 8.01 ± 13% perf-profile.self.cycles-pp.mem_cgroup_charge
will-it-scale.per_process_ops
195000 +------------------------------------------------------------------+
190000 |-+ .+.+. .+.+. +.+. .+. .|
| +. + +.+ : +. .+ +. .+ |
185000 |-+ : : : +. + |
180000 |-+ : : : |
175000 |-+ : : : |
170000 |-+ : : : |
|.+. .+.+ +.+. .+.+.+.+. .+.+ |
165000 |-+ + +. + |
160000 |-+ |
155000 |-+ |
150000 |-+ |
| O O O O O O O O O O O O |
145000 |-O O O O O O O O O O O O O O |
140000 +------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.9.0-02768-gbd0b230fe14554" of type "text/plain" (170569 bytes)
View attachment "job-script" of type "text/plain" (7859 bytes)
View attachment "job.yaml" of type "text/plain" (5344 bytes)
View attachment "reproduce" of type "text/plain" (343 bytes)