Message-ID: <20190620015813.GM7221@shao2-debian>
Date: Thu, 20 Jun 2019 09:58:13 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
kernel test robot <rong.a.chen@...el.com>,
Michal Hocko <mhocko@...nel.org>,
Shakeel Butt <shakeelb@...gle.com>,
Roman Gushchin <guro@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [mm] 815744d751: will-it-scale.per_process_ops 43.3% improvement
Greetings,
FYI, we noticed a 43.3% improvement of will-it-scale.per_process_ops due to commit:
commit: 815744d75152078cde5391fc1e3c2d4424323fb6 ("mm: memcontrol: don't batch updates of local VM stats and events")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with following parameters:
nr_task: 100%
mode: process
test: page_fault3
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process- and thread-based tests in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/page_fault3/will-it-scale
commit:
c11fb13a11 ("Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid")
815744d751 ("mm: memcontrol: don't batch updates of local VM stats and events")
c11fb13a117e5a67 815744d75152078cde5391fc1e3
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
16:4 183% 23:4 perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
18:4 204% 26:4 perf-profile.calltrace.cycles-pp.error_entry.testcase
0:4 3% 0:4 perf-profile.children.cycles-pp.error_exit
19:4 216% 28:4 perf-profile.children.cycles-pp.error_entry
0:4 2% 0:4 perf-profile.self.cycles-pp.error_exit
2:4 24% 3:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
470022 +43.3% 673338 will-it-scale.per_process_ops
90244291 +43.3% 1.293e+08 will-it-scale.workload
33.13 -2.1% 32.44 boot-time.dhcp
13.47 ± 6% -1.0 12.47 mpstat.cpu.all.usr%
0.14 ± 10% +76.8% 0.25 ± 7% turbostat.CPU%c1
609.72 +9.7% 668.75 turbostat.PkgWatt
22201 ± 2% -11.1% 19733 ± 3% numa-meminfo.node2.Inactive
22201 ± 2% -12.0% 19545 ± 4% numa-meminfo.node2.Inactive(anon)
6060 ± 4% +15.5% 7000 ± 3% numa-meminfo.node2.KernelStack
30791 ± 32% -47.1% 16275 ± 22% numa-meminfo.node3.Inactive
30675 ± 33% -47.5% 16090 ± 22% numa-meminfo.node3.Inactive(anon)
15906 -1.4% 15683 proc-vmstat.nr_page_table_pages
60975063 +38.1% 84216668 proc-vmstat.numa_hit
60881605 +38.2% 84123351 proc-vmstat.numa_local
61181699 +38.1% 84469061 proc-vmstat.pgalloc_normal
2.713e+10 +43.3% 3.887e+10 proc-vmstat.pgfault
58818843 ± 5% +38.1% 81212599 ± 4% proc-vmstat.pgfree
14513924 ± 2% +41.8% 20585689 ± 2% numa-numastat.node0.local_node
14537287 ± 2% +41.7% 20601299 ± 2% numa-numastat.node0.numa_hit
15413984 ± 2% +37.3% 21168399 numa-numastat.node1.local_node
15437340 ± 2% +37.3% 21191634 numa-numastat.node1.numa_hit
15397026 +38.9% 21394060 numa-numastat.node2.local_node
15428108 +38.8% 21417417 numa-numastat.node2.numa_hit
15632711 +35.1% 21119563 numa-numastat.node3.local_node
15648357 +35.2% 21150655 numa-numastat.node3.numa_hit
444.19 ± 12% +41.5% 628.34 ± 3% sched_debug.cfs_rq:/.exec_clock.stddev
172846 ± 7% +19.1% 205841 sched_debug.cfs_rq:/.min_vruntime.stddev
0.05 ± 8% -22.1% 0.04 ± 5% sched_debug.cfs_rq:/.nr_running.stddev
171875 ± 7% +19.2% 204919 sched_debug.cfs_rq:/.spread0.stddev
280585 ± 89% -93.9% 17176 ± 64% sched_debug.cpu.avg_idle.min
24193 ± 31% +61.1% 38975 ± 24% sched_debug.cpu.nr_switches.max
1136 ± 13% +34.4% 1527 ± 16% sched_debug.cpu.ttwu_count.stddev
1018 ± 15% +44.2% 1467 ± 18% sched_debug.cpu.ttwu_local.stddev
8487551 ± 2% +35.9% 11538420 ± 2% numa-vmstat.node0.numa_hit
8464705 ± 2% +36.1% 11523132 ± 2% numa-vmstat.node0.numa_local
8798431 +32.5% 11661199 numa-vmstat.node1.numa_hit
8689509 ± 2% +32.9% 11552294 numa-vmstat.node1.numa_local
5568 ± 2% -12.9% 4847 ± 5% numa-vmstat.node2.nr_inactive_anon
6057 ± 4% +15.6% 7001 ± 3% numa-vmstat.node2.nr_kernel_stack
5571 ± 2% -12.9% 4851 ± 5% numa-vmstat.node2.nr_zone_inactive_anon
8648618 +35.6% 11724115 numa-vmstat.node2.numa_hit
8532135 +36.1% 11615170 numa-vmstat.node2.numa_local
7672 ± 33% -48.1% 3980 ± 17% numa-vmstat.node3.nr_inactive_anon
7673 ± 33% -48.1% 3983 ± 17% numa-vmstat.node3.nr_zone_inactive_anon
8891497 +30.8% 11626904 numa-vmstat.node3.numa_hit
8789979 +30.9% 11510256 numa-vmstat.node3.numa_local
130.25 ±165% -98.7% 1.75 ± 74% interrupts.CPU109.RES:Rescheduling_interrupts
131.00 ±137% -97.7% 3.00 ± 97% interrupts.CPU118.RES:Rescheduling_interrupts
598.75 ±121% -93.5% 38.75 ± 62% interrupts.CPU13.RES:Rescheduling_interrupts
10.25 ±156% +5961.0% 621.25 ±167% interrupts.CPU145.RES:Rescheduling_interrupts
688.50 ±129% -95.4% 31.50 ±107% interrupts.CPU16.RES:Rescheduling_interrupts
1.25 ± 34% +10960.0% 138.25 ±113% interrupts.CPU161.RES:Rescheduling_interrupts
779.50 ±149% -94.9% 39.75 ± 92% interrupts.CPU17.RES:Rescheduling_interrupts
104.25 ± 69% -93.8% 6.50 ± 35% interrupts.CPU177.RES:Rescheduling_interrupts
100.25 ± 90% -89.3% 10.75 ±118% interrupts.CPU182.RES:Rescheduling_interrupts
494.25 ± 60% -79.0% 103.75 ± 16% interrupts.CPU2.RES:Rescheduling_interrupts
4480 ± 16% -40.5% 2665 ± 55% interrupts.CPU24.CAL:Function_call_interrupts
19.50 ±152% +928.2% 200.50 ± 91% interrupts.CPU29.RES:Rescheduling_interrupts
5259 ± 34% +59.7% 8397 interrupts.CPU39.NMI:Non-maskable_interrupts
5259 ± 34% +59.7% 8397 interrupts.CPU39.PMI:Performance_monitoring_interrupts
5258 ± 34% +59.6% 8390 interrupts.CPU42.NMI:Non-maskable_interrupts
5258 ± 34% +59.6% 8390 interrupts.CPU42.PMI:Performance_monitoring_interrupts
5253 ± 34% +59.8% 8393 interrupts.CPU43.NMI:Non-maskable_interrupts
5253 ± 34% +59.8% 8393 interrupts.CPU43.PMI:Performance_monitoring_interrupts
5248 ± 34% +59.9% 8394 interrupts.CPU44.NMI:Non-maskable_interrupts
5248 ± 34% +59.9% 8394 interrupts.CPU44.PMI:Performance_monitoring_interrupts
5261 ± 34% +60.0% 8419 interrupts.CPU57.NMI:Non-maskable_interrupts
5261 ± 34% +60.0% 8419 interrupts.CPU57.PMI:Performance_monitoring_interrupts
7874 +20.1% 9459 ± 11% interrupts.CPU95.RES:Rescheduling_interrupts
217.00 ± 49% -80.2% 43.00 ± 93% interrupts.CPU96.RES:Rescheduling_interrupts
53003 ± 5% -13.3% 45949 ± 6% interrupts.RES:Rescheduling_interrupts
2.68 -31.4% 1.84 perf-stat.i.MPKI
4.694e+10 +43.1% 6.715e+10 perf-stat.i.branch-instructions
1.238e+08 +43.4% 1.775e+08 ± 2% perf-stat.i.branch-misses
59.57 +9.1 68.63 perf-stat.i.cache-miss-rate%
3.661e+08 +14.0% 4.173e+08 perf-stat.i.cache-misses
2.49 -31.0% 1.72 perf-stat.i.cpi
1557 -12.7% 1360 perf-stat.i.cycles-between-cache-misses
0.02 ±109% -0.0 0.00 ± 13% perf-stat.i.dTLB-load-miss-rate%
6.494e+10 +44.9% 9.411e+10 perf-stat.i.dTLB-loads
1.59e+09 +43.4% 2.28e+09 perf-stat.i.dTLB-store-misses
3.43e+10 +44.8% 4.965e+10 perf-stat.i.dTLB-stores
2.29e+11 +44.2% 3.301e+11 perf-stat.i.instructions
0.40 +44.8% 0.58 perf-stat.i.ipc
90220636 +43.3% 1.293e+08 perf-stat.i.minor-faults
6.99 ± 2% -6.0 0.96 ± 17% perf-stat.i.node-load-miss-rate%
2563994 ± 2% -67.3% 839546 ± 12% perf-stat.i.node-load-misses
34196685 ± 2% +154.9% 87152794 ± 5% perf-stat.i.node-loads
14.00 -7.5 6.48 perf-stat.i.node-store-miss-rate%
14863715 ± 2% -38.3% 9165413 ± 2% perf-stat.i.node-store-misses
91337940 +44.8% 1.323e+08 perf-stat.i.node-stores
90221535 +43.3% 1.293e+08 perf-stat.i.page-faults
2.68 -31.4% 1.84 perf-stat.overall.MPKI
59.57 +9.1 68.63 perf-stat.overall.cache-miss-rate%
2.49 -31.0% 1.72 perf-stat.overall.cpi
1557 -12.7% 1359 perf-stat.overall.cycles-between-cache-misses
0.02 ±109% -0.0 0.00 ± 13% perf-stat.overall.dTLB-load-miss-rate%
0.40 +44.8% 0.58 perf-stat.overall.ipc
6.98 ± 2% -6.0 0.96 ± 17% perf-stat.overall.node-load-miss-rate%
14.00 -7.5 6.48 perf-stat.overall.node-store-miss-rate%
4.677e+10 +43.1% 6.692e+10 perf-stat.ps.branch-instructions
1.234e+08 +43.4% 1.769e+08 ± 2% perf-stat.ps.branch-misses
3.647e+08 +14.0% 4.158e+08 perf-stat.ps.cache-misses
6.471e+10 +45.0% 9.379e+10 perf-stat.ps.dTLB-loads
1.584e+09 +43.5% 2.273e+09 perf-stat.ps.dTLB-store-misses
3.417e+10 +44.8% 4.949e+10 perf-stat.ps.dTLB-stores
2.281e+11 +44.2% 3.29e+11 perf-stat.ps.instructions
89897978 +43.3% 1.288e+08 perf-stat.ps.minor-faults
2554791 ± 2% -67.2% 836695 ± 12% perf-stat.ps.node-load-misses
34073965 ± 2% +154.9% 86855995 ± 5% perf-stat.ps.node-loads
14810461 ± 2% -38.3% 9134179 ± 2% perf-stat.ps.node-store-misses
91010555 +44.9% 1.318e+08 perf-stat.ps.node-stores
89898192 +43.3% 1.288e+08 perf-stat.ps.page-faults
6.812e+13 +44.2% 9.822e+13 perf-stat.total.instructions
11934 ± 6% +35.0% 16111 ± 12% softirqs.CPU120.RCU
11745 ± 4% +14.5% 13450 ± 5% softirqs.CPU122.RCU
11990 ± 7% +11.6% 13378 ± 4% softirqs.CPU124.RCU
11979 ± 5% +12.4% 13466 ± 4% softirqs.CPU126.RCU
11997 ± 6% +11.4% 13370 ± 5% softirqs.CPU127.RCU
12165 ± 3% +12.4% 13677 ± 4% softirqs.CPU128.RCU
12213 ± 4% +10.0% 13431 ± 4% softirqs.CPU129.RCU
11827 ± 4% +16.9% 13831 ± 5% softirqs.CPU130.RCU
11469 ± 9% +19.7% 13725 ± 4% softirqs.CPU131.RCU
11869 ± 5% +15.5% 13711 ± 3% softirqs.CPU132.RCU
11751 ± 4% +15.7% 13596 ± 4% softirqs.CPU134.RCU
11675 ± 5% +16.6% 13615 ± 6% softirqs.CPU135.RCU
11900 ± 5% +15.0% 13687 ± 3% softirqs.CPU136.RCU
11959 ± 5% +14.0% 13636 ± 4% softirqs.CPU137.RCU
11940 ± 5% +13.7% 13576 ± 3% softirqs.CPU138.RCU
11905 ± 6% +15.9% 13804 ± 5% softirqs.CPU139.RCU
12342 ± 5% +11.4% 13750 ± 6% softirqs.CPU140.RCU
11828 ± 4% +13.3% 13401 ± 3% softirqs.CPU141.RCU
11823 ± 4% +14.5% 13536 ± 3% softirqs.CPU142.RCU
11658 ± 7% +15.6% 13472 ± 3% softirqs.CPU143.RCU
133947 ± 19% -23.1% 102992 ± 3% softirqs.CPU143.TIMER
12294 ± 3% +6.5% 13090 ± 3% softirqs.CPU145.RCU
12118 ± 3% +7.3% 12999 softirqs.CPU146.RCU
12079 ± 3% +9.6% 13240 ± 2% softirqs.CPU149.RCU
11937 ± 3% +11.0% 13256 ± 2% softirqs.CPU155.RCU
12003 ± 3% +11.2% 13348 ± 4% softirqs.CPU156.RCU
11979 ± 6% +9.1% 13075 ± 4% softirqs.CPU158.RCU
11992 ± 3% +9.6% 13148 ± 4% softirqs.CPU159.RCU
12283 ± 5% +14.0% 13997 ± 9% softirqs.CPU167.RCU
11803 +12.4% 13267 ± 3% softirqs.CPU180.RCU
12018 ± 5% +6.8% 12838 ± 4% softirqs.CPU187.RCU
12493 ± 5% +13.6% 14192 ± 4% softirqs.CPU27.RCU
12587 ± 6% +13.8% 14328 ± 6% softirqs.CPU30.RCU
12864 ± 3% +9.6% 14103 ± 4% softirqs.CPU33.RCU
12555 ± 4% +12.9% 14181 ± 6% softirqs.CPU34.RCU
12422 ± 4% +17.1% 14545 ± 4% softirqs.CPU35.RCU
12235 +17.1% 14328 ± 2% softirqs.CPU36.RCU
12710 ± 5% +10.1% 13989 ± 3% softirqs.CPU37.RCU
12441 ± 4% +14.6% 14262 ± 3% softirqs.CPU38.RCU
12457 ± 4% +12.4% 14000 ± 3% softirqs.CPU39.RCU
12503 ± 5% +13.5% 14188 ± 2% softirqs.CPU40.RCU
12430 ± 4% +15.9% 14408 ± 3% softirqs.CPU41.RCU
12494 ± 5% +14.5% 14310 ± 3% softirqs.CPU42.RCU
12776 ± 4% +13.9% 14547 ± 5% softirqs.CPU43.RCU
12466 ± 4% +12.6% 14040 ± 2% softirqs.CPU45.RCU
12361 ± 5% +15.8% 14313 ± 3% softirqs.CPU46.RCU
11235 ± 8% +26.8% 14244 ± 2% softirqs.CPU47.RCU
118604 ± 8% -12.5% 103803 ± 2% softirqs.CPU47.TIMER
12832 ± 4% +8.0% 13857 ± 5% softirqs.CPU62.RCU
12303 ± 4% +7.4% 13208 ± 7% softirqs.CPU73.RCU
12603 ± 5% +19.3% 15040 ± 9% softirqs.CPU8.RCU
10873 ± 18% +24.4% 13522 ± 3% softirqs.CPU81.RCU
12259 +9.4% 13412 ± 3% softirqs.CPU83.RCU
12381 ± 2% +9.2% 13517 ± 4% softirqs.CPU87.RCU
12363 ± 2% +8.7% 13440 ± 3% softirqs.CPU88.RCU
12498 ± 2% +9.3% 13656 ± 3% softirqs.CPU89.RCU
12202 ± 2% +9.5% 13359 ± 5% softirqs.CPU92.RCU
12291 ± 3% +10.9% 13629 ± 3% softirqs.CPU94.RCU
15.61 -9.2 6.40 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
15.89 -9.1 6.81 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
40.86 -8.9 31.99 perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.testcase
12.32 -8.8 3.57 ± 3% perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
47.45 -8.1 39.30 perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.testcase
49.07 -7.6 41.47 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.testcase
78.70 -7.5 71.19 perf-profile.calltrace.cycles-pp.testcase
9.92 -6.2 3.72 ± 16% perf-profile.calltrace.cycles-pp.page_remove_rmap.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
6.82 ± 5% -5.5 1.31 ± 8% perf-profile.calltrace.cycles-pp.__count_memcg_events.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
6.45 ± 3% -5.2 1.23 ± 25% perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_remove_rmap.unmap_page_range.unmap_vmas
6.23 ± 4% -5.2 1.06 ± 3% perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
7.19 ± 2% -5.0 2.14 ± 18% perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_remove_rmap.unmap_page_range.unmap_vmas.unmap_region
7.00 ± 3% -5.0 2.04 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault
32.31 -4.0 28.28 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
14.76 -3.8 10.98 ± 9% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
14.80 -3.8 11.04 ± 9% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
14.83 -3.8 11.07 ± 9% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
14.83 -3.8 11.07 ± 9% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
14.83 -3.8 11.07 ± 9% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
14.83 -3.8 11.08 ± 9% perf-profile.calltrace.cycles-pp.munmap
14.82 -3.8 11.07 ± 9% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
14.83 -3.8 11.08 ± 9% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
14.83 -3.8 11.08 ± 9% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
3.96 ± 6% -3.7 0.27 ±100% perf-profile.calltrace.cycles-pp.lock_page_memcg.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault
1.72 -0.9 0.81 perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
1.30 ± 3% -0.6 0.69 perf-profile.calltrace.cycles-pp.down_read_trylock.__do_page_fault.do_page_fault.page_fault.testcase
1.31 -0.6 0.71 perf-profile.calltrace.cycles-pp.up_read.__do_page_fault.do_page_fault.page_fault.testcase
0.84 -0.2 0.64 perf-profile.calltrace.cycles-pp.unlock_page.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
0.55 +0.1 0.67 ± 4% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region
0.53 ± 3% +0.2 0.72 ± 2% perf-profile.calltrace.cycles-pp.current_time.file_update_time.__handle_mm_fault.handle_mm_fault.__do_page_fault
0.70 +0.2 0.90 ± 3% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
1.91 +0.3 2.18 perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.53 ± 2% +0.3 0.82 ± 4% perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.__do_page_fault.do_page_fault.page_fault
0.92 +0.3 1.24 ± 3% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_page_fault.page_fault.testcase
0.60 ± 2% +0.3 0.93 ± 4% perf-profile.calltrace.cycles-pp.find_vma.__do_page_fault.do_page_fault.page_fault.testcase
0.92 +0.3 1.25 perf-profile.calltrace.cycles-pp.file_update_time.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.41 ± 57% +0.4 0.77 ± 3% perf-profile.calltrace.cycles-pp.set_page_dirty.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
0.96 +0.4 1.34 perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode.testcase
1.37 +0.4 1.80 perf-profile.calltrace.cycles-pp.__perf_sw_event.do_page_fault.page_fault.testcase
0.84 ± 3% +0.5 1.30 ± 2% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.__do_page_fault.do_page_fault.page_fault
0.00 +0.6 0.55 ± 2% perf-profile.calltrace.cycles-pp.set_page_dirty.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
0.00 +0.6 0.56 ± 4% perf-profile.calltrace.cycles-pp.__mod_node_page_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
0.00 +0.6 0.56 ± 2% perf-profile.calltrace.cycles-pp.page_mapping.set_page_dirty.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault
1.42 ± 2% +0.7 2.12 perf-profile.calltrace.cycles-pp.__perf_sw_event.__do_page_fault.do_page_fault.page_fault.testcase
1.74 +0.8 2.55 ± 2% perf-profile.calltrace.cycles-pp.xas_load.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault
8.78 +2.7 11.48 ± 3% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault
9.54 +2.9 12.42 ± 3% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault
10.25 +3.1 13.40 ± 3% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
10.57 +3.2 13.80 ± 3% perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
4.94 ± 2% +4.4 9.29 ± 4% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
84.06 +12.0 96.02 perf-profile.calltrace.cycles-pp.page_fault.testcase
12.70 ± 3% -10.4 2.31 ± 13% perf-profile.children.cycles-pp.__mod_memcg_state
14.21 ± 2% -10.0 4.23 ± 10% perf-profile.children.cycles-pp.__mod_lruvec_state
15.70 -9.2 6.54 perf-profile.children.cycles-pp.alloc_set_pte
15.93 -9.1 6.86 perf-profile.children.cycles-pp.finish_fault
40.99 -8.8 32.19 perf-profile.children.cycles-pp.handle_mm_fault
12.36 -8.8 3.60 ± 3% perf-profile.children.cycles-pp.page_add_file_rmap
47.57 -8.1 39.47 perf-profile.children.cycles-pp.__do_page_fault
49.12 -7.6 41.53 perf-profile.children.cycles-pp.do_page_fault
9.96 -6.2 3.77 ± 16% perf-profile.children.cycles-pp.page_remove_rmap
6.82 ± 5% -5.5 1.32 ± 8% perf-profile.children.cycles-pp.__count_memcg_events
5.18 ± 6% -4.4 0.77 ± 14% perf-profile.children.cycles-pp.lock_page_memcg
32.44 -4.0 28.45 perf-profile.children.cycles-pp.__handle_mm_fault
14.80 -3.8 11.04 ± 9% perf-profile.children.cycles-pp.unmap_vmas
14.80 -3.8 11.04 ± 9% perf-profile.children.cycles-pp.unmap_page_range
14.88 -3.8 11.12 ± 9% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
14.88 -3.8 11.12 ± 9% perf-profile.children.cycles-pp.do_syscall_64
14.83 -3.8 11.07 ± 9% perf-profile.children.cycles-pp.__x64_sys_munmap
14.83 -3.8 11.07 ± 9% perf-profile.children.cycles-pp.__do_munmap
14.83 -3.8 11.07 ± 9% perf-profile.children.cycles-pp.__vm_munmap
14.82 -3.8 11.07 ± 9% perf-profile.children.cycles-pp.unmap_region
14.83 -3.8 11.08 ± 9% perf-profile.children.cycles-pp.munmap
10.73 ± 8% -1.6 9.09 perf-profile.children.cycles-pp.native_irq_return_iret
1.76 -0.9 0.86 perf-profile.children.cycles-pp._raw_spin_lock
1.31 -0.6 0.71 perf-profile.children.cycles-pp.up_read
1.31 ± 3% -0.6 0.72 perf-profile.children.cycles-pp.down_read_trylock
0.84 -0.2 0.64 perf-profile.children.cycles-pp.unlock_page
0.37 ± 5% -0.2 0.18 ± 4% perf-profile.children.cycles-pp.__unlock_page_memcg
0.05 ± 9% +0.0 0.07 perf-profile.children.cycles-pp.p4d_offset
0.06 ± 6% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.get_page_from_freelist
0.07 +0.0 0.10 ± 4% perf-profile.children.cycles-pp.pte_alloc_one
0.07 ± 6% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.__alloc_pages_nodemask
0.07 ± 5% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.04 ± 57% +0.0 0.07 ± 6% perf-profile.children.cycles-pp.prep_new_page
0.03 ±100% +0.0 0.07 ± 7% perf-profile.children.cycles-pp.clear_page_erms
0.10 ± 4% +0.0 0.14 ± 10% perf-profile.children.cycles-pp.task_tick_fair
0.01 ±173% +0.0 0.05 ± 9% perf-profile.children.cycles-pp.unlock_page_memcg
0.03 ±100% +0.0 0.07 ± 10% perf-profile.children.cycles-pp.native_set_pte_at
0.10 +0.0 0.15 ± 3% perf-profile.children.cycles-pp.PageHuge
0.12 ± 6% +0.0 0.17 ± 14% perf-profile.children.cycles-pp.scheduler_tick
0.15 ± 8% +0.0 0.20 ± 2% perf-profile.children.cycles-pp._vm_normal_page
0.13 ± 3% +0.1 0.18 perf-profile.children.cycles-pp.page_rmapping
0.14 ± 3% +0.1 0.19 ± 8% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.16 ± 2% +0.1 0.21 ± 4% perf-profile.children.cycles-pp.fpregs_assert_state_consistent
0.12 ± 4% +0.1 0.18 ± 2% perf-profile.children.cycles-pp.rcu_all_qs
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.hrtimer_active
0.17 ± 6% +0.1 0.22 ± 15% perf-profile.children.cycles-pp.tick_sched_handle
0.13 ± 3% +0.1 0.19 ± 2% perf-profile.children.cycles-pp.pmd_pfn
0.16 ± 6% +0.1 0.22 ± 14% perf-profile.children.cycles-pp.update_process_times
0.17 ± 2% +0.1 0.24 perf-profile.children.cycles-pp.pmd_page_vaddr
0.21 ± 8% +0.1 0.27 ± 2% perf-profile.children.cycles-pp.perf_exclude_event
0.11 ± 23% +0.1 0.17 ± 11% perf-profile.children.cycles-pp.timespec64_trunc
0.09 ± 4% +0.1 0.16 ± 32% perf-profile.children.cycles-pp.mem_cgroup_from_task
0.22 ± 7% +0.1 0.30 ± 15% perf-profile.children.cycles-pp.tick_sched_timer
0.15 ± 3% +0.1 0.24 perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.20 ± 2% +0.1 0.29 ± 2% perf-profile.children.cycles-pp.mark_page_accessed
0.20 ± 14% +0.1 0.29 ± 15% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.25 ± 4% +0.1 0.34 perf-profile.children.cycles-pp.__might_sleep
0.12 ± 4% +0.1 0.23 ± 2% perf-profile.children.cycles-pp.__tlb_remove_page_size
0.27 ± 4% +0.1 0.37 ± 12% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.23 ± 2% +0.1 0.34 ± 2% perf-profile.children.cycles-pp._cond_resched
0.36 ± 2% +0.1 0.48 ± 2% perf-profile.children.cycles-pp.prepare_exit_to_usermode
0.31 ± 4% +0.1 0.43 ± 4% perf-profile.children.cycles-pp.xas_start
0.56 +0.1 0.69 ± 3% perf-profile.children.cycles-pp.release_pages
0.66 +0.1 0.79 perf-profile.children.cycles-pp.___might_sleep
0.38 +0.1 0.51 perf-profile.children.cycles-pp.__set_page_dirty_no_writeback
0.69 ± 4% +0.1 0.83 ± 11% perf-profile.children.cycles-pp.apic_timer_interrupt
0.58 ± 4% +0.2 0.75 ± 12% perf-profile.children.cycles-pp.hrtimer_interrupt
0.61 ± 4% +0.2 0.79 ± 11% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
0.56 ± 2% +0.2 0.75 ± 2% perf-profile.children.cycles-pp.current_time
0.59 ± 4% +0.2 0.78 ± 5% perf-profile.children.cycles-pp.pmd_devmap_trans_unstable
0.71 +0.2 0.92 ± 3% perf-profile.children.cycles-pp.tlb_flush_mmu
1.97 +0.3 2.27 perf-profile.children.cycles-pp.fault_dirty_shared_page
0.55 ± 2% +0.3 0.85 ± 4% perf-profile.children.cycles-pp.vmacache_find
1.17 ± 2% +0.3 1.48 perf-profile.children.cycles-pp.page_mapping
0.94 +0.3 1.28 perf-profile.children.cycles-pp.file_update_time
0.72 ± 2% +0.3 1.06 ± 9% perf-profile.children.cycles-pp.__mod_node_page_state
0.63 ± 2% +0.3 0.97 ± 3% perf-profile.children.cycles-pp.find_vma
0.96 +0.4 1.34 perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.93 ± 2% +0.4 1.38 ± 2% perf-profile.children.cycles-pp.set_page_dirty
1.89 ± 2% +0.8 2.68 perf-profile.children.cycles-pp.___perf_sw_event
1.79 +0.8 2.61 perf-profile.children.cycles-pp.xas_load
2.81 +1.1 3.94 perf-profile.children.cycles-pp.__perf_sw_event
4.28 +1.9 6.15 perf-profile.children.cycles-pp.sync_regs
67.10 +2.4 69.48 perf-profile.children.cycles-pp.page_fault
8.85 +2.7 11.58 ± 3% perf-profile.children.cycles-pp.find_lock_entry
9.56 +2.9 12.45 ± 3% perf-profile.children.cycles-pp.shmem_getpage_gfp
10.27 +3.1 13.42 ± 3% perf-profile.children.cycles-pp.shmem_fault
10.58 +3.2 13.81 ± 3% perf-profile.children.cycles-pp.__do_fault
85.11 +3.8 88.87 perf-profile.children.cycles-pp.testcase
4.97 ± 2% +4.4 9.34 ± 4% perf-profile.children.cycles-pp.find_get_entry
12.60 ± 3% -10.3 2.25 ± 14% perf-profile.self.cycles-pp.__mod_memcg_state
6.80 ± 5% -5.5 1.29 ± 8% perf-profile.self.cycles-pp.__count_memcg_events
5.11 ± 6% -4.4 0.71 ± 14% perf-profile.self.cycles-pp.lock_page_memcg
2.84 ± 2% -1.7 1.09 perf-profile.self.cycles-pp.find_lock_entry
10.73 ± 8% -1.6 9.08 perf-profile.self.cycles-pp.native_irq_return_iret
1.73 -0.9 0.83 perf-profile.self.cycles-pp._raw_spin_lock
1.29 ± 4% -0.6 0.68 perf-profile.self.cycles-pp.down_read_trylock
1.30 -0.6 0.70 perf-profile.self.cycles-pp.up_read
1.37 -0.3 1.05 ± 3% perf-profile.self.cycles-pp.page_add_file_rmap
0.82 -0.2 0.61 perf-profile.self.cycles-pp.unlock_page
0.35 ± 4% -0.2 0.17 ± 3% perf-profile.self.cycles-pp.__unlock_page_memcg
0.05 +0.0 0.07 ± 5% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.14 ± 6% +0.0 0.16 ± 10% perf-profile.self.cycles-pp.perf_swevent_event
0.08 +0.0 0.11 ± 3% perf-profile.self.cycles-pp.PageHuge
0.11 ± 4% +0.0 0.14 perf-profile.self.cycles-pp.page_rmapping
0.09 ± 4% +0.0 0.13 perf-profile.self.cycles-pp.find_vma
0.03 ±100% +0.0 0.07 ± 7% perf-profile.self.cycles-pp.clear_page_erms
0.22 ± 6% +0.0 0.26 perf-profile.self.cycles-pp.__do_fault
0.09 +0.0 0.13 ± 3% perf-profile.self.cycles-pp.rcu_all_qs
0.16 ± 2% +0.0 0.21 ± 3% perf-profile.self.cycles-pp.prepare_exit_to_usermode
0.01 ±173% +0.0 0.06 ± 14% perf-profile.self.cycles-pp.native_set_pte_at
0.13 ± 8% +0.0 0.18 ± 3% perf-profile.self.cycles-pp._vm_normal_page
0.13 ± 5% +0.0 0.18 ± 9% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.17 ± 9% +0.0 0.22 perf-profile.self.cycles-pp.perf_exclude_event
0.11 +0.1 0.16 ± 4% perf-profile.self.cycles-pp._cond_resched
0.00 +0.1 0.05 perf-profile.self.cycles-pp.unlock_page_memcg
0.12 ± 3% +0.1 0.17 ± 2% perf-profile.self.cycles-pp.pmd_pfn
0.00 +0.1 0.05 ± 8% perf-profile.self.cycles-pp.pmd_devmap
0.15 ± 5% +0.1 0.21 ± 4% perf-profile.self.cycles-pp.fpregs_assert_state_consistent
0.16 ± 2% +0.1 0.22 perf-profile.self.cycles-pp.pmd_page_vaddr
0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.hrtimer_active
0.10 ± 19% +0.1 0.16 ± 9% perf-profile.self.cycles-pp.timespec64_trunc
0.08 ± 10% +0.1 0.15 ± 33% perf-profile.self.cycles-pp.mem_cgroup_from_task
0.80 +0.1 0.87 perf-profile.self.cycles-pp.__mod_lruvec_state
0.15 ± 2% +0.1 0.23 ± 3% perf-profile.self.cycles-pp.free_pages_and_swap_cache
0.19 ± 2% +0.1 0.27 ± 3% perf-profile.self.cycles-pp.mark_page_accessed
0.23 ± 5% +0.1 0.32 perf-profile.self.cycles-pp.__might_sleep
0.18 ± 15% +0.1 0.28 ± 16% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.20 ± 3% +0.1 0.29 perf-profile.self.cycles-pp.do_page_fault
0.23 +0.1 0.33 perf-profile.self.cycles-pp.finish_fault
0.11 ± 4% +0.1 0.21 ± 4% perf-profile.self.cycles-pp.__tlb_remove_page_size
0.28 ± 4% +0.1 0.39 ± 3% perf-profile.self.cycles-pp.xas_start
0.21 ± 17% +0.1 0.32 ± 6% perf-profile.self.cycles-pp.fault_dirty_shared_page
0.54 +0.1 0.66 ± 4% perf-profile.self.cycles-pp.release_pages
0.34 +0.1 0.46 perf-profile.self.cycles-pp.__set_page_dirty_no_writeback
0.65 +0.1 0.78 perf-profile.self.cycles-pp.___might_sleep
0.39 +0.1 0.53 ± 2% perf-profile.self.cycles-pp.file_update_time
0.29 +0.2 0.44 perf-profile.self.cycles-pp.set_page_dirty
0.70 ± 6% +0.2 0.86 ± 3% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.56 ± 4% +0.2 0.74 ± 5% perf-profile.self.cycles-pp.pmd_devmap_trans_unstable
0.71 ± 2% +0.3 0.97 perf-profile.self.cycles-pp.shmem_fault
0.59 +0.3 0.85 perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.73 +0.3 1.00 perf-profile.self.cycles-pp.page_fault
1.12 ± 2% +0.3 1.41 perf-profile.self.cycles-pp.page_mapping
0.52 ± 2% +0.3 0.82 ± 4% perf-profile.self.cycles-pp.vmacache_find
0.90 +0.3 1.22 ± 2% perf-profile.self.cycles-pp.__perf_sw_event
0.71 ± 2% +0.3 1.05 ± 9% perf-profile.self.cycles-pp.__mod_node_page_state
0.91 +0.4 1.29 perf-profile.self.cycles-pp.alloc_set_pte
1.29 +0.6 1.88 perf-profile.self.cycles-pp.__do_page_fault
1.57 +0.7 2.23 ± 2% perf-profile.self.cycles-pp.handle_mm_fault
1.48 +0.7 2.17 ± 2% perf-profile.self.cycles-pp.xas_load
1.56 ± 3% +0.7 2.26 ± 2% perf-profile.self.cycles-pp.___perf_sw_event
2.55 ± 4% +1.1 3.64 perf-profile.self.cycles-pp.__handle_mm_fault
2.99 +1.8 4.83 ± 8% perf-profile.self.cycles-pp.unmap_page_range
4.27 +1.9 6.13 perf-profile.self.cycles-pp.sync_regs
3.14 ± 3% +3.5 6.62 ± 6% perf-profile.self.cycles-pp.find_get_entry
18.54 +10.1 28.67 perf-profile.self.cycles-pp.testcase
will-it-scale.per_process_ops
700000 +-+---------------------O------------------------------------------+
| O O O O O O O O O O O O O |
O O O O O O O |
650000 +-+ |
| |
| |
600000 +-+ |
| |
550000 +-+ |
| |
| |
500000 +-+ |
| .+.. .+.. +.. ..+..+..+..+.. .+.. .+.. |
|..+..+. +. +.. .. +..+. +..+. +..+. .|
450000 +-+----------------------------------------------------------------+
will-it-scale.workload
1.35e+08 +-+--------------------------------------------------------------+
1.3e+08 +-+ O O O O O O O O O O O O O |
O O O O O O O O |
1.25e+08 +-+ |
1.2e+08 +-+ |
| |
1.15e+08 +-+ |
1.1e+08 +-+ |
1.05e+08 +-+ |
| |
1e+08 +-+ |
9.5e+07 +-+ |
|..+.. .+.. .+.. .+.. .+.+..+..+.. .+.. .+.. .|
9e+07 +-+ +. +. +..+. +..+. +..+. +..+. +. |
8.5e+07 +-+--------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.2.0-rc4-00046-g815744d7" of type "text/plain" (196393 bytes)
View attachment "job-script" of type "text/plain" (7331 bytes)
View attachment "job.yaml" of type "text/plain" (4949 bytes)
View attachment "reproduce" of type "text/plain" (316 bytes)