Message-ID: <20210304074339.GA17830@xsang-OptiPlex-9020>
Date: Thu, 4 Mar 2021 15:43:39 +0800
From: kernel test robot <oliver.sang@...el.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Will Deacon <will@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [mm] f9ce0be71d: vm-scalability.throughput 2.2% improvement
Greetings,
FYI, we noticed a 2.2% improvement of vm-scalability.throughput due to commit:
commit: f9ce0be71d1fbb038ada15ced83474b0e63f264d ("mm: Cleanup faultaround and finish_fault() codepaths")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:
runtime: 300s
size: 2T
test: shm-xread-seq-mt
cpufreq_governor: performance
ucode: 0x5003006
test-description: The motivation behind this suite is to exercise functions and regions of mm/ in the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
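The shm-xread-seq-mt workload reads shared memory sequentially, so it leans heavily on the fault-around path this commit reworks: on one minor fault the kernel maps a window of already-uptodate page-cache pages around the faulting address, and later accesses inside the window hit mapped PTEs without faulting. A minimal Python sketch of that idea (the window size and page count here are illustrative only, not the kernel's actual faultaround values):

```python
def count_minor_faults(num_pages, fault_around_pages):
    """Simulate a sequential read over num_pages pages.

    Each fault maps the faulting page plus the rest of its
    fault-around window, so later accesses inside the window
    find an already-mapped PTE and take no fault.
    """
    mapped = set()
    faults = 0
    for page in range(num_pages):
        if page not in mapped:
            faults += 1
            start = (page // fault_around_pages) * fault_around_pages
            for p in range(start, min(start + fault_around_pages, num_pages)):
                mapped.add(p)
    return faults

# One fault per 16-page window instead of one per page.
print(count_minor_faults(1024, 16))   # 64
print(count_minor_faults(1024, 1))    # 1024
```

This is why the minor-fault counters below move opposite to throughput: fewer faults per window of pages, more useful work per second.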
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/2T/lkp-csl-2ap4/shm-xread-seq-mt/vm-scalability/0x5003006
commit:
v5.11-rc4
f9ce0be71d ("mm: Cleanup faultaround and finish_fault() codepaths")
v5.11-rc4 f9ce0be71d1fbb038ada15ced83
---------------- ---------------------------
%stddev %change %stddev
\ | \
173458 +2.2% 177341 vm-scalability.median
33304080 +2.2% 34049512 vm-scalability.throughput
48910 +3.6% 50687 vm-scalability.time.involuntary_context_switches
39787122 +2.9% 40925278 vm-scalability.time.maximum_resident_set_size
2.062e+08 -13.8% 1.777e+08 vm-scalability.time.minor_page_faults
6097 -1.3% 6015 vm-scalability.time.system_time
3482 +2.4% 3565 vm-scalability.time.user_time
1.99e+08 +15.0% 2.288e+08 vm-scalability.time.voluntary_context_switches
1.001e+10 +2.2% 1.023e+10 vm-scalability.workload
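For reference, the %change column is the relative difference between the v5.11-rc4 base run (left column) and the patched run (right column). A quick check against the throughput and minor-fault rows above:

```python
def pct_change(base, patched):
    """Relative change, as a percentage of the base value."""
    return (patched - base) / base * 100

# vm-scalability.throughput: 33304080 -> 34049512
print(round(pct_change(33304080, 34049512), 1))   # 2.2
# vm-scalability.time.minor_page_faults: 2.062e8 -> 1.777e8
print(round(pct_change(2.062e8, 1.777e8), 1))     # -13.8
```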
1.026e+10 ± 5% +20.6% 1.238e+10 cpuidle.C1.time
261208 ± 7% -6.3% 244703 ± 3% numa-meminfo.node3.Unevictable
1288500 +15.0% 1481820 vmstat.system.cs
1700497 +16.2% 1975167 meminfo.Active
1699451 +16.2% 1974120 meminfo.Active(anon)
65301 ± 7% -6.3% 61175 ± 3% numa-vmstat.node3.nr_unevictable
65301 ± 7% -6.3% 61175 ± 3% numa-vmstat.node3.nr_zone_unevictable
24025 ± 11% +17.8% 28292 ± 16% softirqs.CPU0.RCU
21799 ± 13% +18.2% 25762 ± 14% softirqs.CPU32.RCU
91559 ± 5% -12.5% 80098 ± 2% sched_debug.cpu.avg_idle.min
1027173 +15.0% 1181071 sched_debug.cpu.nr_switches.avg
1045282 +15.1% 1203225 sched_debug.cpu.nr_switches.max
963018 +15.2% 1109090 sched_debug.cpu.nr_switches.min
1005 ± 4% +9.8% 1104 ± 3% sched_debug.cpu.nr_uninterruptible.max
424979 +16.2% 493705 proc-vmstat.nr_active_anon
5625206 +3.7% 5835573 proc-vmstat.nr_file_pages
5016493 +2.8% 5157613 proc-vmstat.nr_inactive_anon
4863940 +3.0% 5007893 proc-vmstat.nr_mapped
10901 +2.4% 11167 proc-vmstat.nr_page_table_pages
5373920 +3.9% 5584284 proc-vmstat.nr_shmem
38336 +1.2% 38812 proc-vmstat.nr_slab_reclaimable
424979 +16.2% 493705 proc-vmstat.nr_zone_active_anon
5016493 +2.8% 5157613 proc-vmstat.nr_zone_inactive_anon
20657870 +2.1% 21091358 proc-vmstat.numa_hit
20397835 +2.1% 20831238 proc-vmstat.numa_local
20771666 +2.2% 21222215 proc-vmstat.pgalloc_normal
2.08e+08 -13.7% 1.795e+08 proc-vmstat.pgfault
0.01 ± 14% +88.9% 0.03 ± 45% perf-sched.sch_delay.max.ms.pipe_read.new_sync_read.vfs_read.ksys_read
1.30 -14.1% 1.12 perf-sched.total_wait_and_delay.average.ms
4252932 +16.4% 4948640 perf-sched.total_wait_and_delay.count.ms
1.29 -14.2% 1.11 perf-sched.total_wait_time.average.ms
0.39 -14.1% 0.34 perf-sched.wait_and_delay.avg.ms.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp
4239058 +16.4% 4933941 perf-sched.wait_and_delay.count.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp
0.54 ± 2% -20.1% 0.43 ± 3% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
0.59 ± 6% -19.1% 0.48 ± 6% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
0.62 ± 6% -16.3% 0.52 ± 4% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
0.63 ± 4% -20.0% 0.50 ± 4% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
0.39 -14.2% 0.33 perf-sched.wait_time.avg.ms.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp
0.50 ± 5% -13.6% 0.43 ± 5% perf-sched.wait_time.avg.ms.preempt_schedule_common._cond_resched.find_lock_entry.shmem_getpage_gfp.shmem_fault
0.65 -15.7% 0.55 perf-sched.wait_time.avg.ms.preempt_schedule_common._cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate
0.65 ± 3% -15.9% 0.55 ± 4% perf-sched.wait_time.avg.ms.preempt_schedule_common._cond_resched.wait_for_completion.stop_two_cpus.migrate_swap
0.01 ± 17% -70.8% 0.00 ±141% perf-sched.wait_time.max.ms.schedule_timeout.wait_for_completion.stop_one_cpu.affine_move_task
0.05 ±180% +626.6% 0.40 ± 73% perf-sched.wait_time.max.ms.schedule_timeout.wait_for_completion.stop_two_cpus.migrate_swap
3.165e+10 +2.9% 3.257e+10 perf-stat.i.branch-instructions
39566694 +6.2% 42012793 perf-stat.i.branch-misses
21.12 +0.5 21.62 perf-stat.i.cache-miss-rate%
79986940 ± 2% +7.2% 85718194 perf-stat.i.cache-misses
3.813e+08 +5.0% 4.005e+08 perf-stat.i.cache-references
1302391 +14.9% 1496345 perf-stat.i.context-switches
1.14 -1.8% 1.12 perf-stat.i.cpi
5322 ± 2% +6.9% 5687 ± 3% perf-stat.i.cpu-migrations
1397 ± 2% -5.9% 1315 perf-stat.i.cycles-between-cache-misses
0.02 ± 7% -0.0 0.02 ± 4% perf-stat.i.dTLB-load-miss-rate%
2.897e+10 +3.0% 2.985e+10 perf-stat.i.dTLB-loads
0.02 ± 3% -0.0 0.02 ± 2% perf-stat.i.dTLB-store-miss-rate%
912764 ± 3% -10.3% 818415 ± 2% perf-stat.i.dTLB-store-misses
4.641e+09 +4.8% 4.865e+09 perf-stat.i.dTLB-stores
56.57 -0.9 55.69 perf-stat.i.iTLB-load-miss-rate%
13897524 +7.1% 14877776 perf-stat.i.iTLB-load-misses
10457579 +10.9% 11594117 perf-stat.i.iTLB-loads
9.929e+10 +3.0% 1.023e+11 perf-stat.i.instructions
7270 -3.9% 6988 perf-stat.i.instructions-per-iTLB-miss
0.88 +2.0% 0.90 perf-stat.i.ipc
342.11 +3.1% 352.80 perf-stat.i.metric.M/sec
679156 -13.7% 585798 perf-stat.i.minor-faults
25722342 +8.4% 27890585 perf-stat.i.node-load-misses
12004199 +4.9% 12587968 perf-stat.i.node-store-misses
679160 -13.7% 585802 perf-stat.i.page-faults
3.84 +2.0% 3.92 perf-stat.overall.MPKI
0.13 +0.0 0.13 perf-stat.overall.branch-miss-rate%
1.13 -2.1% 1.10 perf-stat.overall.cpi
1398 ± 2% -5.9% 1315 perf-stat.overall.cycles-between-cache-misses
0.02 ± 2% -0.0 0.02 ± 4% perf-stat.overall.dTLB-load-miss-rate%
0.02 ± 3% -0.0 0.02 ± 2% perf-stat.overall.dTLB-store-miss-rate%
57.06 -0.9 56.20 perf-stat.overall.iTLB-load-miss-rate%
7143 -3.8% 6872 perf-stat.overall.instructions-per-iTLB-miss
0.89 +2.1% 0.91 perf-stat.overall.ipc
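Several perf-stat.overall rows can be cross-checked as ratios of the per-second counters reported in the perf-stat.ps rows below. The numbers are consistent with MPKI being cache-references per kilo-instruction and instructions-per-iTLB-miss being instructions divided by iTLB-load-misses, though which counters lkp actually divides is an assumption here:

```python
# Base-run per-second counters, copied from the perf-stat.ps rows.
instructions = 9.893e10   # perf-stat.ps.instructions
cache_refs   = 3.799e8    # perf-stat.ps.cache-references
itlb_misses  = 13848723   # perf-stat.ps.iTLB-load-misses

mpki = cache_refs / instructions * 1000
insn_per_itlb_miss = instructions / itlb_misses

print(round(mpki, 2))                  # 3.84, matching perf-stat.overall.MPKI
print(round(insn_per_itlb_miss, 1))    # 7143.6, vs. the reported 7143
```

The tiny discrepancy on the last ratio is expected: the .ps rows are themselves rounded averages.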
3.154e+10 +2.9% 3.246e+10 perf-stat.ps.branch-instructions
39429793 +6.2% 41875636 perf-stat.ps.branch-misses
79703603 ± 2% +7.2% 85431984 perf-stat.ps.cache-misses
3.799e+08 +5.1% 3.992e+08 perf-stat.ps.cache-references
1297689 +14.9% 1491245 perf-stat.ps.context-switches
5304 ± 2% +6.9% 5669 ± 3% perf-stat.ps.cpu-migrations
2.886e+10 +3.1% 2.975e+10 perf-stat.ps.dTLB-loads
909500 ± 3% -10.3% 815638 ± 2% perf-stat.ps.dTLB-store-misses
4.625e+09 +4.8% 4.849e+09 perf-stat.ps.dTLB-stores
13848723 +7.1% 14828605 perf-stat.ps.iTLB-load-misses
10420305 +10.9% 11555148 perf-stat.ps.iTLB-loads
9.893e+10 +3.0% 1.019e+11 perf-stat.ps.instructions
676707 -13.7% 583803 perf-stat.ps.minor-faults
25630907 +8.5% 27796964 perf-stat.ps.node-load-misses
11961063 +4.9% 12545303 perf-stat.ps.node-store-misses
676711 -13.7% 583807 perf-stat.ps.page-faults
3.04e+13 +3.0% 3.132e+13 perf-stat.total.instructions
568.50 ±116% +599.1% 3974 ± 86% interrupts.40:PCI-MSI.524291-edge.eth0-TxRx-2
1834011 +8.0% 1980233 interrupts.CAL:Function_call_interrupts
941.00 ± 3% +9.2% 1027 ± 4% interrupts.CPU108.RES:Rescheduling_interrupts
935.83 ± 3% +15.4% 1079 ± 5% interrupts.CPU112.RES:Rescheduling_interrupts
947.33 ± 3% +11.6% 1057 ± 5% interrupts.CPU116.RES:Rescheduling_interrupts
568.50 ±116% +599.1% 3974 ± 86% interrupts.CPU12.40:PCI-MSI.524291-edge.eth0-TxRx-2
9697 ± 2% +10.0% 10664 ± 3% interrupts.CPU120.CAL:Function_call_interrupts
9242 ± 3% +11.6% 10315 ± 4% interrupts.CPU129.CAL:Function_call_interrupts
909.00 ± 4% +18.4% 1076 ± 6% interrupts.CPU129.RES:Rescheduling_interrupts
941.00 ± 4% +14.6% 1078 ± 6% interrupts.CPU133.RES:Rescheduling_interrupts
9317 ± 2% +10.2% 10265 ± 5% interrupts.CPU134.CAL:Function_call_interrupts
954.50 ± 4% +11.8% 1067 ± 6% interrupts.CPU134.RES:Rescheduling_interrupts
9260 ± 2% +9.6% 10154 ± 2% interrupts.CPU152.CAL:Function_call_interrupts
9200 ± 2% +7.9% 9927 ± 2% interrupts.CPU160.CAL:Function_call_interrupts
880.17 ± 5% +15.2% 1014 ± 5% interrupts.CPU161.RES:Rescheduling_interrupts
9365 +11.1% 10409 ± 4% interrupts.CPU163.CAL:Function_call_interrupts
900.67 ± 6% +12.1% 1010 ± 5% interrupts.CPU165.RES:Rescheduling_interrupts
9731 +9.1% 10617 ± 2% interrupts.CPU171.CAL:Function_call_interrupts
959.67 ± 4% +12.5% 1080 ± 2% interrupts.CPU179.RES:Rescheduling_interrupts
9765 ± 3% +8.9% 10632 ± 3% interrupts.CPU180.CAL:Function_call_interrupts
9679 ± 2% +10.3% 10679 ± 2% interrupts.CPU183.CAL:Function_call_interrupts
979.83 ± 4% +11.2% 1089 ± 2% interrupts.CPU183.RES:Rescheduling_interrupts
9961 ± 3% +9.5% 10903 ± 3% interrupts.CPU186.CAL:Function_call_interrupts
9657 ± 2% +10.5% 10670 ± 3% interrupts.CPU187.CAL:Function_call_interrupts
9848 ± 3% +8.1% 10644 ± 3% interrupts.CPU188.CAL:Function_call_interrupts
1081 ± 7% +9.8% 1186 ± 4% interrupts.CPU24.RES:Rescheduling_interrupts
951.67 ± 4% +10.8% 1054 ± 6% interrupts.CPU30.RES:Rescheduling_interrupts
9201 ± 3% +9.6% 10081 ± 3% interrupts.CPU31.CAL:Function_call_interrupts
9479 ± 5% +8.9% 10320 ± 4% interrupts.CPU32.CAL:Function_call_interrupts
9199 ± 3% +9.9% 10111 ± 4% interrupts.CPU33.CAL:Function_call_interrupts
8993 ± 3% +11.0% 9983 ± 4% interrupts.CPU36.CAL:Function_call_interrupts
954.33 ± 4% +11.4% 1063 ± 4% interrupts.CPU36.RES:Rescheduling_interrupts
9225 ± 3% +12.0% 10335 ± 4% interrupts.CPU40.CAL:Function_call_interrupts
9020 +9.0% 9828 ± 3% interrupts.CPU42.CAL:Function_call_interrupts
948.17 ± 5% +11.3% 1055 ± 3% interrupts.CPU47.RES:Rescheduling_interrupts
983.00 ± 2% +8.8% 1069 ± 3% interrupts.CPU55.RES:Rescheduling_interrupts
9020 +9.2% 9850 ± 2% interrupts.CPU58.CAL:Function_call_interrupts
9165 ± 2% +10.0% 10078 ± 4% interrupts.CPU63.CAL:Function_call_interrupts
10242 ± 3% +11.7% 11440 ± 2% interrupts.CPU74.CAL:Function_call_interrupts
9899 +8.9% 10784 ± 2% interrupts.CPU75.CAL:Function_call_interrupts
9840 ± 3% +10.5% 10872 ± 2% interrupts.CPU79.CAL:Function_call_interrupts
9700 ± 2% +12.1% 10878 ± 2% interrupts.CPU84.CAL:Function_call_interrupts
9503 ± 4% +11.4% 10590 interrupts.CPU87.CAL:Function_call_interrupts
1012 ± 5% +10.3% 1116 ± 2% interrupts.CPU88.RES:Rescheduling_interrupts
9408 ± 3% +9.8% 10330 interrupts.CPU89.CAL:Function_call_interrupts
9726 ± 2% +11.7% 10861 ± 2% interrupts.CPU90.CAL:Function_call_interrupts
9478 +11.2% 10540 ± 2% interrupts.CPU91.CAL:Function_call_interrupts
9636 +10.4% 10638 ± 2% interrupts.CPU92.CAL:Function_call_interrupts
980.83 ± 5% +14.5% 1123 ± 4% interrupts.CPU92.RES:Rescheduling_interrupts
1002 ± 5% +10.2% 1105 ± 3% interrupts.CPU93.RES:Rescheduling_interrupts
9237 +8.2% 9999 ± 2% interrupts.CPU95.CAL:Function_call_interrupts
1933 +18.2% 2285 interrupts.IWI:IRQ_work_interrupts
12.90 -12.9 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
12.86 -12.9 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
12.69 -12.7 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_set_pte.finish_fault.do_fault
9.82 -9.8 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
9.81 -9.8 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.filemap_map_pages.do_fault.__handle_mm_fault
9.74 -9.7 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_set_pte.filemap_map_pages.do_fault
11.65 -2.6 9.06 perf-profile.calltrace.cycles-pp.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
32.10 -0.7 31.43 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
31.83 -0.6 31.20 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
38.68 -0.6 38.05 perf-profile.calltrace.cycles-pp.do_access
31.82 -0.6 31.19 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
31.23 -0.5 30.71 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
31.12 -0.5 30.60 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
30.27 -0.5 29.76 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
1.21 -0.1 1.12 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.__lock_page.find_lock_entry.shmem_getpage_gfp
1.43 -0.1 1.36 perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__lock_page.find_lock_entry.shmem_getpage_gfp.shmem_fault
0.55 ± 2% +0.0 0.58 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.56 ± 2% +0.0 0.59 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
0.71 ± 2% +0.0 0.76 perf-profile.calltrace.cycles-pp.sched_ttwu_pending.flush_smp_call_function_from_idle.do_idle.cpu_startup_entry.start_secondary
3.28 +0.1 3.34 perf-profile.calltrace.cycles-pp.__lock_page.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
3.77 +0.1 3.85 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
0.96 +0.1 1.05 perf-profile.calltrace.cycles-pp.flush_smp_call_function_from_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
4.13 +0.1 4.22 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
4.21 +0.1 4.34 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
4.20 +0.1 4.33 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.23 +0.1 1.38 ± 2% perf-profile.calltrace.cycles-pp.__schedule.schedule.io_schedule.__lock_page.find_lock_entry
1.26 +0.1 1.41 perf-profile.calltrace.cycles-pp.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp.shmem_fault
1.25 +0.2 1.40 ± 2% perf-profile.calltrace.cycles-pp.schedule.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp
0.90 +0.2 1.08 perf-profile.calltrace.cycles-pp.wake_page_function.__wake_up_common.wake_up_page_bit.do_fault.__handle_mm_fault
1.00 +0.2 1.20 perf-profile.calltrace.cycles-pp.__wake_up_common.wake_up_page_bit.do_fault.__handle_mm_fault.handle_mm_fault
0.73 +0.2 0.94 perf-profile.calltrace.cycles-pp.try_to_wake_up.wake_page_function.__wake_up_common.wake_up_page_bit.do_fault
17.99 +0.2 18.22 perf-profile.calltrace.cycles-pp.do_rw_once
1.21 +0.2 1.45 perf-profile.calltrace.cycles-pp.wake_up_page_bit.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
43.62 +0.3 43.88 perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
49.22 +0.5 49.74 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
49.23 +0.5 49.74 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
49.23 +0.5 49.75 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
0.00 +0.5 0.52 ± 2% perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
49.49 +0.5 50.02 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
0.00 +0.5 0.54 ± 2% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
0.00 +0.6 0.63 perf-profile.calltrace.cycles-pp.next_uptodate_page.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
12.91 +1.7 14.57 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
0.00 +7.3 7.26 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.filemap_map_pages.do_fault.__handle_mm_fault
0.00 +7.3 7.33 perf-profile.calltrace.cycles-pp._raw_spin_lock.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
0.00 +14.3 14.30 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.finish_fault.do_fault.__handle_mm_fault
0.00 +14.5 14.51 perf-profile.calltrace.cycles-pp._raw_spin_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
22.73 -22.7 0.00 perf-profile.children.cycles-pp.alloc_set_pte
11.65 -2.6 9.07 perf-profile.children.cycles-pp.filemap_map_pages
24.32 -0.9 23.42 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
23.32 -0.8 22.54 perf-profile.children.cycles-pp._raw_spin_lock
32.12 -0.7 31.46 perf-profile.children.cycles-pp.asm_exc_page_fault
31.84 -0.6 31.21 perf-profile.children.cycles-pp.exc_page_fault
31.83 -0.6 31.20 perf-profile.children.cycles-pp.do_user_addr_fault
41.54 -0.6 40.94 perf-profile.children.cycles-pp.do_access
31.24 -0.5 30.72 perf-profile.children.cycles-pp.handle_mm_fault
31.13 -0.5 30.61 perf-profile.children.cycles-pp.__handle_mm_fault
30.28 -0.5 29.76 perf-profile.children.cycles-pp.do_fault
0.23 ± 7% -0.1 0.15 ± 5% perf-profile.children.cycles-pp.up_read
1.50 -0.1 1.43 perf-profile.children.cycles-pp._raw_spin_lock_irq
0.18 ± 12% -0.0 0.14 ± 12% perf-profile.children.cycles-pp.down_read_trylock
0.07 ± 7% -0.0 0.03 ± 70% perf-profile.children.cycles-pp.__perf_sw_event
0.26 ± 4% -0.0 0.24 ± 2% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.20 ± 3% -0.0 0.18 ± 2% perf-profile.children.cycles-pp.native_irq_return_iret
0.09 -0.0 0.08 ± 6% perf-profile.children.cycles-pp.___perf_sw_event
0.11 +0.0 0.12 perf-profile.children.cycles-pp.set_next_entity
0.08 ± 4% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.__list_add_valid
0.16 ± 4% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.13 ± 2% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.nr_iowait_cpu
0.12 ± 5% +0.0 0.14 ± 3% perf-profile.children.cycles-pp.update_rq_clock
0.10 ± 4% +0.0 0.12 ± 5% perf-profile.children.cycles-pp.delayacct_end
0.13 ± 4% +0.0 0.15 ± 6% perf-profile.children.cycles-pp.llist_reverse_order
0.19 ± 2% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.pick_next_task_fair
0.19 ± 3% +0.0 0.21 ± 3% perf-profile.children.cycles-pp.flush_smp_call_function_queue
0.07 ± 5% +0.0 0.09 ± 7% perf-profile.children.cycles-pp.___might_sleep
0.39 +0.0 0.41 perf-profile.children.cycles-pp.enqueue_task_fair
0.26 ± 2% +0.0 0.28 ± 2% perf-profile.children.cycles-pp.update_load_avg
0.20 ± 3% +0.0 0.22 ± 4% perf-profile.children.cycles-pp.update_curr
0.16 ± 2% +0.0 0.19 ± 2% perf-profile.children.cycles-pp.__list_del_entry_valid
0.18 ± 4% +0.0 0.20 ± 3% perf-profile.children.cycles-pp.select_task_rq_fair
0.17 ± 3% +0.0 0.19 ± 2% perf-profile.children.cycles-pp.__smp_call_single_queue
0.17 ± 3% +0.0 0.19 ± 2% perf-profile.children.cycles-pp.llist_add_batch
0.14 ± 8% +0.0 0.18 ± 11% perf-profile.children.cycles-pp.xas_load
0.38 ± 3% +0.0 0.41 ± 2% perf-profile.children.cycles-pp.dequeue_task_fair
0.35 ± 3% +0.0 0.39 ± 2% perf-profile.children.cycles-pp.dequeue_entity
0.30 ± 2% +0.0 0.34 ± 3% perf-profile.children.cycles-pp.find_get_entry
0.39 ± 3% +0.0 0.43 ± 3% perf-profile.children.cycles-pp.finish_task_switch
0.37 ± 2% +0.0 0.42 ± 2% perf-profile.children.cycles-pp.ttwu_queue_wakelist
0.71 ± 2% +0.1 0.77 perf-profile.children.cycles-pp.sched_ttwu_pending
0.64 +0.1 0.69 perf-profile.children.cycles-pp.ttwu_do_activate
0.01 ±223% +0.1 0.06 ± 21% perf-profile.children.cycles-pp.xas_start
3.28 +0.1 3.34 perf-profile.children.cycles-pp.__lock_page
3.78 +0.1 3.86 perf-profile.children.cycles-pp.find_lock_entry
0.97 +0.1 1.06 perf-profile.children.cycles-pp.flush_smp_call_function_from_idle
4.13 +0.1 4.22 perf-profile.children.cycles-pp.shmem_getpage_gfp
0.46 ± 5% +0.1 0.55 ± 2% perf-profile.children.cycles-pp.schedule_idle
4.21 +0.1 4.34 perf-profile.children.cycles-pp.__do_fault
4.20 +0.1 4.33 perf-profile.children.cycles-pp.shmem_fault
1.26 +0.1 1.41 perf-profile.children.cycles-pp.io_schedule
0.61 +0.2 0.76 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.25 +0.2 1.40 ± 2% perf-profile.children.cycles-pp.schedule
15.37 +0.2 15.53 perf-profile.children.cycles-pp.do_rw_once
1.45 +0.2 1.62 perf-profile.children.cycles-pp.wake_page_function
1.62 +0.2 1.81 perf-profile.children.cycles-pp.__wake_up_common
1.94 +0.2 2.17 perf-profile.children.cycles-pp.wake_up_page_bit
1.17 +0.2 1.40 perf-profile.children.cycles-pp.try_to_wake_up
1.68 +0.2 1.92 perf-profile.children.cycles-pp.__schedule
43.85 +0.3 44.12 perf-profile.children.cycles-pp.intel_idle
49.23 +0.5 49.75 perf-profile.children.cycles-pp.start_secondary
49.49 +0.5 50.01 perf-profile.children.cycles-pp.do_idle
49.49 +0.5 50.02 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
49.49 +0.5 50.02 perf-profile.children.cycles-pp.cpu_startup_entry
0.00 +0.6 0.64 perf-profile.children.cycles-pp.next_uptodate_page
12.91 +1.7 14.58 perf-profile.children.cycles-pp.finish_fault
24.20 -0.9 23.29 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.90 -0.7 0.21 ± 5% perf-profile.self.cycles-pp.filemap_map_pages
0.23 ± 9% -0.1 0.15 ± 5% perf-profile.self.cycles-pp.up_read
0.29 ± 6% -0.0 0.24 ± 7% perf-profile.self.cycles-pp.__handle_mm_fault
0.18 ± 12% -0.0 0.14 ± 12% perf-profile.self.cycles-pp.down_read_trylock
0.19 ± 3% -0.0 0.16 ± 5% perf-profile.self.cycles-pp.find_lock_entry
0.50 ± 2% -0.0 0.48 perf-profile.self.cycles-pp.__lock_page
0.25 ± 4% -0.0 0.23 ± 4% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.20 ± 3% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.native_irq_return_iret
0.12 ± 3% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.sched_ttwu_pending
0.08 ± 6% +0.0 0.09 perf-profile.self.cycles-pp.enqueue_entity
0.13 ± 2% +0.0 0.15 ± 3% perf-profile.self.cycles-pp.nr_iowait_cpu
0.07 +0.0 0.09 ± 5% perf-profile.self.cycles-pp.update_curr
0.08 ± 6% +0.0 0.09 ± 4% perf-profile.self.cycles-pp.__list_add_valid
0.07 ± 5% +0.0 0.09 ± 4% perf-profile.self.cycles-pp.try_to_wake_up
0.09 ± 5% +0.0 0.11 ± 3% perf-profile.self.cycles-pp.select_task_rq_fair
0.07 ± 5% +0.0 0.09 ± 7% perf-profile.self.cycles-pp.___might_sleep
0.17 +0.0 0.19 ± 4% perf-profile.self.cycles-pp.__wake_up_common
0.25 +0.0 0.28 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.16 ± 2% +0.0 0.19 ± 3% perf-profile.self.cycles-pp.llist_add_batch
0.16 ± 2% +0.0 0.19 ± 2% perf-profile.self.cycles-pp.__list_del_entry_valid
0.07 ± 18% +0.0 0.11 ± 8% perf-profile.self.cycles-pp.shmem_fault
0.34 ± 3% +0.0 0.38 ± 3% perf-profile.self.cycles-pp._raw_spin_lock
0.34 ± 2% +0.0 0.38 perf-profile.self.cycles-pp.menu_select
0.37 ± 3% +0.0 0.42 ± 3% perf-profile.self.cycles-pp.finish_task_switch
0.41 +0.1 0.47 perf-profile.self.cycles-pp.__schedule
0.51 +0.1 0.65 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
43.85 +0.3 44.12 perf-profile.self.cycles-pp.intel_idle
0.00 +0.6 0.62 perf-profile.self.cycles-pp.next_uptodate_page
vm-scalability.time.maximum_resident_set_size
4.25e+07 +----------------------------------------------------------------+
| O O |
4.2e+07 |O+ |
| OOO |
4.15e+07 |O+ |
| O O O O O |
4.1e+07 |-+ O O OOOOOOOOOOOO O OO |
| OOO O O |
4.05e+07 |-+ |
| |
4e+07 |-+ + + + + + + + + |
| + + + ++ + +++++ ++++ + +++++ :: : :::+++++ ++ +++ + |
3.95e+07 |++ :+ +:+: ++ ++ +++ + + +++:++ : + ++ +++ +|
|++++ + + + + + + |
3.9e+07 +----------------------------------------------------------------+
vm-scalability.time.minor_page_faults
2.1e+08 +----------------------------------------------------------------+
| + + + + + + + + + + + + + |
2.05e+08 |+++:++++++++++++ ++++++ + ++++++ + ++++++ ++ +:+++ +++++++++++++|
| + + + + + + |
2e+08 |-+ |
| |
1.95e+08 |-+ |
| |
1.9e+08 |-+ |
| |
1.85e+08 |-+ |
|OOOOO |
1.8e+08 |O+ O |
| O OOOOOOOOOOOOOOOOOOOOOOO |
1.75e+08 +----------------------------------------------------------------+
vm-scalability.time.voluntary_context_switches
2.4e+08 +----------------------------------------------------------------+
|O OOO |
2.35e+08 |-+ |
2.3e+08 |-+ OOOOO O OOOOO O OOO OO |
| O O O O OOO O |
2.25e+08 |-+ |
2.2e+08 |-+ |
| |
2.15e+08 |-+ |
2.1e+08 |-+ |
| |
2.05e+08 |-+ |
2e+08 |-+ + + + + |
|++++++++++++++++ ++++++++ ++++++ ++++++++ ++++++++++++++++++++++|
1.95e+08 +----------------------------------------------------------------+
perf-sched.total_wait_and_delay.count.ms
6e+06 +-------------------------------------------------------------------+
| |
5e+06 |OOOOOOOOO OOOOOOOOOOOOO OOOOOO |
| OO OO |
|++++++++++++++++++++++++++++++ +++++++++++++++++++++++ +++++++++|
4e+06 |-+ : : :: ::: : |
| : : :: ::: : |
3e+06 |-+ : : :: ::: : |
| : : :: ::: : |
2e+06 |-+ : : :: : : : |
| : : : : : : |
| : : : : : : |
1e+06 |-+ : : : : : : |
| : : : : : : |
0 +-------------------------------------------------------------------+
vm-scalability.throughput
3.5e+07 +----------------------------------------------------------------+
3.48e+07 |-+ O |
|OOOO |
3.46e+07 |O+ O |
3.44e+07 |-+ |
3.42e+07 |-+ OOOO O O O O O |
3.4e+07 |-+ O OOOOOOOOOOOOO O O OOO |
| OO |
3.38e+07 |-+ |
3.36e+07 |-+ |
3.34e+07 |-+ + + + + + + + + + |
3.32e+07 |-+ ++ + ++ + + +++++ ++++ + +++++ +: :+ ::+++++ ++ ++ : ++ |
|++ ::++:+: ++ ++ +++ + + + +:+ : + ++ + ++: +|
3.3e+07 |++++ + + + + + + |
3.28e+07 +----------------------------------------------------------------+
vm-scalability.median
182000 +------------------------------------------------------------------+
| O O |
180000 |-OO O |
|O |
| OO OO |
178000 |-+ OOO OOOOOOOOOOOOO OO O |
| OOOO OO O |
176000 |-+ |
| |
174000 |-+ + + + + + + |
| + ++ + + +++++ ++++ + +++++ +: ++ : + ++++++ +++ ++ |
| + :+++++ ::+++++++ + + ++ + :+ + : + +++ + :++|
172000 |+++ + + + + + + + + |
| |
170000 +------------------------------------------------------------------+
vm-scalability.workload
1.05e+10 +----------------------------------------------------------------+
| O O |
1.04e+10 |O+OOO |
|O |
1.03e+10 |-+ O O O |
| O O OO O OOOOOOO O OO |
1.02e+10 |-+ O O OOO O O |
| |
1.01e+10 |-+ |
| + + + + + + + + |
1e+10 |++ + + + ++ + + +++++ ++++ + +++++ +: :+ ::+++++ ++ +++ ++ |
|+++:+ +:+: ++ ++ +++ + + + +:+ : + ++ ++ +|
9.9e+09 |-+ + + + + + + |
| |
9.8e+09 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
Attachments:
- config-5.11.0-rc4-00001-gf9ce0be71d1f (text/plain, 172414 bytes)
- job-script (text/plain, 7783 bytes)
- job.yaml (text/plain, 5364 bytes)
- reproduce (text/plain, 924 bytes)