Message-ID: <202311201629.b861c327-oliver.sang@intel.com>
Date: Mon, 20 Nov 2023 21:16:24 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Huang Ying <ying.huang@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Vlastimil Babka <vbabka@...e.cz>,
"David Hildenbrand" <david@...hat.com>,
Johannes Weiner <jweiner@...hat.com>,
"Dave Hansen" <dave.hansen@...ux.intel.com>,
Michal Hocko <mhocko@...e.com>,
"Pavel Tatashin" <pasha.tatashin@...een.com>,
Matthew Wilcox <willy@...radead.org>,
Christoph Lameter <cl@...ux.com>,
Arjan van de Ven <arjan@...ux.intel.com>,
Sudeep Holla <sudeep.holla@....com>, <linux-mm@...ck.org>,
<ying.huang@...el.com>, <feng.tang@...el.com>,
<fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [mm, page_alloc] c0a242394c:
will-it-scale.per_process_ops 12.6% improvement
Hello,

kernel test robot noticed a 12.6% improvement of will-it-scale.per_process_ops on:

commit: c0a242394cb980bd00e1f61dc8aacb453d2bbe6a ("mm, page_alloc: scale the number of pages that are batch allocated")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
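
For context, the patch under test scales how many pages are moved from the zone
free lists into the per-CPU page (PCP) lists in one refill, so allocation-heavy
workloads take the zone lock less often. The sketch below only illustrates that
general idea and is not the actual mm/page_alloc.c code; the names (pcp_sketch,
alloc_factor, max_factor_shift) are invented for this example.

/*
 * Illustrative sketch only -- not the real kernel implementation.
 * Idea: consecutive allocations grow the refill batch (up to a cap),
 * while freeing activity winds the scaling back down.
 */
struct pcp_sketch {
	int batch;		/* base refill batch size */
	int alloc_factor;	/* grows with consecutive allocations */
	int max_factor_shift;	/* upper bound on the scaling */
};

static int sketch_alloc_batch(struct pcp_sketch *pcp)
{
	int batch = pcp->batch << pcp->alloc_factor;

	/* Scale the next refill up, until the cap is reached. */
	if (pcp->alloc_factor < pcp->max_factor_shift)
		pcp->alloc_factor++;

	return batch;
}

static void sketch_note_free(struct pcp_sketch *pcp)
{
	/* Mixed alloc/free traffic shrinks the batch again. */
	if (pcp->alloc_factor > 0)
		pcp->alloc_factor--;
}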

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 50%
	mode: process
	test: page_fault2
	cpufreq_governor: performance
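
For readers unfamiliar with the workload: will-it-scale's page_fault2 test
repeatedly maps a shmem-backed file privately, writes to every page (each write
takes a copy-on-write fault, which is why do_cow_fault and shmem_fault dominate
the profiles below), and then unmaps the region; per_process_ops counts those
iterations per worker process. The following is a simplified sketch of one such
loop, not the benchmark's actual source; MAPSIZE and the memfd name are
arbitrary choices for illustration.

/* Simplified page_fault2-style loop; not the will-it-scale source. */
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAPSIZE (128UL * 1024 * 1024)	/* illustrative size */

int main(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	int fd = memfd_create("pf2-sketch", 0);	/* tmpfs-backed file */

	if (fd < 0 || ftruncate(fd, MAPSIZE))
		return 1;

	for (;;) {	/* one iteration ~ one "op" in per_process_ops */
		char *p = mmap(NULL, MAPSIZE, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE, fd, 0);
		if (p == MAP_FAILED)
			return 1;

		/* Each store takes a copy-on-write fault (do_cow_fault). */
		for (size_t off = 0; off < MAPSIZE; off += pagesz)
			p[off] = 1;

		/* munmap frees the CoW pages via the zap/free paths seen
		 * in the profile (zap_pte_range, free_unref_page_list). */
		munmap(p, MAPSIZE);
	}
}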
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231120/202311201629.b861c327-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault2/will-it-scale
commit:
52166607ec ("mm: restrict the pcp batch scale factor to avoid too long latency")
c0a242394c ("mm, page_alloc: scale the number of pages that are batch allocated")
52166607ecc98039 c0a242394cb980bd00e1f61dc8a
---------------- ---------------------------
%stddev %change %stddev
\ | \
4.90 +0.6 5.49 mpstat.cpu.all.usr%
1367 ± 6% +72.8% 2362 ± 4% perf-c2c.HITM.local
8592059 +12.6% 9677986 will-it-scale.52.processes
165231 +12.6% 186114 will-it-scale.per_process_ops
8592059 +12.6% 9677986 will-it-scale.workload
2592 ± 19% +587.0% 17809 ± 97% numa-meminfo.node0.Active(anon)
3494860 ± 2% -22.6% 2703947 numa-meminfo.node0.AnonPages.max
3538966 ± 2% -24.9% 2657708 ± 7% numa-meminfo.node1.AnonPages.max
9310 ± 3% +7.6% 10019 ± 5% numa-meminfo.node1.KernelStack
1.295e+09 +12.8% 1.46e+09 numa-numastat.node0.local_node
1.294e+09 +12.8% 1.46e+09 numa-numastat.node0.numa_hit
1.31e+09 +12.0% 1.467e+09 numa-numastat.node1.local_node
1.309e+09 +12.0% 1.466e+09 numa-numastat.node1.numa_hit
213394 ± 50% +373.5% 1010435 ± 33% sched_debug.cfs_rq:/.avg_vruntime.min
1932637 ± 4% -32.0% 1313931 ± 8% sched_debug.cfs_rq:/.avg_vruntime.stddev
213394 ± 50% +373.5% 1010435 ± 33% sched_debug.cfs_rq:/.min_vruntime.min
1932637 ± 4% -32.0% 1313931 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
0.08 +12.5% 0.09 turbostat.IPC
63.77 -45.2 18.60 ± 22% turbostat.PKG_%
353.10 +2.9% 363.42 turbostat.PkgWatt
68.28 +11.4% 76.03 turbostat.RAMWatt
833540 +5.6% 880188 proc-vmstat.nr_anon_pages
2.603e+09 +12.4% 2.925e+09 proc-vmstat.numa_hit
2.605e+09 +12.4% 2.927e+09 proc-vmstat.numa_local
2.599e+09 +12.4% 2.92e+09 proc-vmstat.pgalloc_normal
2.591e+09 +12.4% 2.911e+09 proc-vmstat.pgfault
2.599e+09 +12.4% 2.92e+09 proc-vmstat.pgfree
648.18 ± 19% +586.7% 4450 ± 97% numa-vmstat.node0.nr_active_anon
648.18 ± 19% +586.7% 4450 ± 97% numa-vmstat.node0.nr_zone_active_anon
1.294e+09 +12.8% 1.46e+09 numa-vmstat.node0.numa_hit
1.295e+09 +12.8% 1.46e+09 numa-vmstat.node0.numa_local
9310 ± 3% +7.6% 10021 ± 5% numa-vmstat.node1.nr_kernel_stack
1.309e+09 +12.0% 1.466e+09 numa-vmstat.node1.numa_hit
1.31e+09 +12.0% 1.467e+09 numa-vmstat.node1.numa_local
0.01 ± 80% -93.5% 0.00 ±223% perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
0.01 ± 9% -100.0% 0.00 perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.04 ± 9% -46.3% 0.02 ± 73% perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.03 ±107% -97.0% 0.00 ±223% perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
0.02 ± 27% -100.0% 0.00 perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
0.03 ± 7% -15.0% 0.02 ± 10% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
0.94 ± 16% -51.9% 0.45 ± 22% perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
98.83 ± 11% -40.5% 58.83 ± 11% perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
232.00 ± 10% +48.4% 344.33 ± 4% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
39.50 ± 54% -87.3% 5.03 perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
2.99 ± 15% -100.0% 0.00 perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
4.81 ± 7% -100.0% 0.00 perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
33.32 ± 69% -85.0% 5.01 perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
16.82 +1.6% 17.09 perf-stat.i.MPKI
8.6e+09 +11.3% 9.573e+09 perf-stat.i.branch-instructions
39148476 +5.6% 41324228 perf-stat.i.branch-misses
81.02 -3.1 77.94 perf-stat.i.cache-miss-rate%
7.134e+08 +13.5% 8.096e+08 perf-stat.i.cache-misses
8.802e+08 +17.7% 1.036e+09 perf-stat.i.cache-references
1813 +1.2% 1834 perf-stat.i.context-switches
3.43 -9.9% 3.09 perf-stat.i.cpi
204.33 -10.7% 182.42 ± 2% perf-stat.i.cycles-between-cache-misses
10135544 +11.8% 11330409 ± 2% perf-stat.i.dTLB-load-misses
1.06e+10 +11.5% 1.182e+10 perf-stat.i.dTLB-loads
70683663 +12.6% 79603765 perf-stat.i.dTLB-store-misses
6.001e+09 +12.8% 6.766e+09 perf-stat.i.dTLB-stores
9753929 +12.9% 11015762 perf-stat.i.iTLB-load-misses
4.24e+10 +11.5% 4.728e+10 perf-stat.i.instructions
4377 -1.5% 4312 perf-stat.i.instructions-per-iTLB-miss
0.29 +11.5% 0.33 perf-stat.i.ipc
0.34 ± 23% -48.0% 0.18 ± 11% perf-stat.i.major-faults
1343 +17.4% 1577 perf-stat.i.metric.K/sec
253.10 +11.9% 283.16 perf-stat.i.metric.M/sec
8585112 +12.0% 9619126 perf-stat.i.minor-faults
0.32 ± 27% +0.3 0.60 ± 53% perf-stat.i.node-load-miss-rate%
694018 +17.3% 813810 ± 3% perf-stat.i.node-load-misses
2.451e+08 +3.6% 2.539e+08 ± 2% perf-stat.i.node-loads
538019 +14.0% 613240 perf-stat.i.node-store-misses
49463410 +25.2% 61905404 perf-stat.i.node-stores
8585112 +12.0% 9619126 perf-stat.i.page-faults
16.83 +1.7% 17.12 perf-stat.overall.MPKI
0.46 -0.0 0.43 perf-stat.overall.branch-miss-rate%
81.06 -2.9 78.18 perf-stat.overall.cache-miss-rate%
3.42 -10.5% 3.07 perf-stat.overall.cpi
203.46 -12.0% 179.07 perf-stat.overall.cycles-between-cache-misses
4347 -1.3% 4291 perf-stat.overall.instructions-per-iTLB-miss
0.29 +11.7% 0.33 perf-stat.overall.ipc
0.28 +0.0 0.32 perf-stat.overall.node-load-miss-rate%
1.08 -0.1 0.98 ± 2% perf-stat.overall.node-store-miss-rate%
8.572e+09 +11.3% 9.542e+09 perf-stat.ps.branch-instructions
39013363 +5.6% 41189792 perf-stat.ps.branch-misses
7.111e+08 +13.5% 8.07e+08 perf-stat.ps.cache-misses
8.773e+08 +17.7% 1.032e+09 perf-stat.ps.cache-references
1805 +1.2% 1826 perf-stat.ps.context-switches
10101169 +11.8% 11293042 ± 2% perf-stat.ps.dTLB-load-misses
1.056e+10 +11.6% 1.179e+10 perf-stat.ps.dTLB-loads
70446051 +12.6% 79343784 perf-stat.ps.dTLB-store-misses
5.981e+09 +12.8% 6.744e+09 perf-stat.ps.dTLB-stores
9719620 +13.0% 10983217 perf-stat.ps.iTLB-load-misses
4.225e+10 +11.5% 4.713e+10 perf-stat.ps.instructions
0.34 ± 22% -48.1% 0.18 ± 11% perf-stat.ps.major-faults
8556237 +12.1% 9587784 perf-stat.ps.minor-faults
691779 +17.3% 811254 ± 3% perf-stat.ps.node-load-misses
2.442e+08 +3.6% 2.531e+08 ± 2% perf-stat.ps.node-loads
536237 +14.0% 611234 perf-stat.ps.node-store-misses
49302195 +25.2% 61706509 perf-stat.ps.node-stores
8556237 +12.1% 9587784 perf-stat.ps.page-faults
1.277e+13 +11.9% 1.43e+13 perf-stat.total.instructions
23.92 -10.1 13.79 perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.__munmap
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
23.92 -10.1 13.80 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
19.93 -9.8 10.12 perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
20.07 -9.7 10.33 perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
10.10 -9.3 0.84 ± 6% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages
10.10 -9.3 0.84 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
21.64 -9.2 12.46 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
21.66 -9.2 12.48 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
21.66 -9.2 12.48 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
21.66 -9.2 12.48 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
9.06 -7.1 1.99 ± 2% perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
9.58 -7.0 2.54 ± 2% perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
6.28 ± 2% -6.3 0.00 perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
6.67 -3.5 3.16 ± 4% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
6.90 -3.5 3.40 ± 3% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
7.28 -3.5 3.83 ± 3% perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
7.34 -3.4 3.90 ± 3% perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
7.81 -3.4 4.41 ± 3% perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
9.46 -2.9 6.54 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range
9.46 -2.9 6.54 ± 2% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
9.44 -2.1 7.34 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush
13.41 -2.0 11.42 perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
2.25 -1.0 1.28 ± 3% perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
2.26 -1.0 1.30 ± 2% perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
2.26 -1.0 1.30 ± 2% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
4.23 -0.8 3.47 perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
4.35 -0.7 3.60 perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
1.05 +0.0 1.09 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
0.67 +0.1 0.74 ± 2% perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
1.25 +0.1 1.33 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
1.34 +0.1 1.44 perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
0.61 ± 7% +0.1 0.72 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault
1.06 +0.1 1.18 perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
0.75 ± 6% +0.1 0.88 ± 2% perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
0.82 +0.2 1.00 ± 2% perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
2.72 +0.3 3.02 perf-profile.calltrace.cycles-pp.error_entry.testcase
2.52 ± 2% +0.3 2.83 perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
2.77 +0.3 3.10 perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
0.72 ± 2% +0.3 1.06 perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
0.75 +0.4 1.12 ± 2% perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.17 ±141% +0.4 0.58 perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
0.00 +0.5 0.54 perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
0.00 +0.8 0.82 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu
0.00 +0.8 0.82 ± 3% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region
0.00 +0.8 0.85 ± 3% perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
0.00 +1.6 1.61 ± 5% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
0.00 +1.6 1.62 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
0.00 +2.6 2.56 ± 4% perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
0.00 +2.9 2.90 ± 4% perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
32.25 +7.6 39.90 perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
32.34 +7.7 39.99 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
32.92 +7.7 40.65 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
33.76 +7.8 41.56 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
35.34 +8.0 43.29 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
35.47 +8.0 43.44 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
44.87 +9.1 53.97 perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
48.05 +9.4 57.46 perf-profile.calltrace.cycles-pp.testcase
8.27 ± 2% +12.8 21.10 ± 2% perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.35 ± 8% +13.0 14.35 ± 3% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
1.43 ± 7% +13.0 14.43 ± 3% perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
1.42 ± 7% +13.0 14.42 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
2.77 ± 5% +13.4 16.18 ± 3% perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
2.90 ± 5% +13.4 16.32 ± 3% perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
3.85 ± 4% +13.6 17.42 ± 3% perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
22.33 -10.8 11.57 perf-profile.children.cycles-pp.release_pages
22.33 -10.7 11.63 perf-profile.children.cycles-pp.tlb_batch_pages_flush
23.93 -10.1 13.80 perf-profile.children.cycles-pp.do_vmi_align_munmap
23.92 -10.1 13.80 perf-profile.children.cycles-pp.__munmap
23.92 -10.1 13.80 perf-profile.children.cycles-pp.__vm_munmap
23.92 -10.1 13.80 perf-profile.children.cycles-pp.__x64_sys_munmap
23.92 -10.1 13.79 perf-profile.children.cycles-pp.unmap_region
23.93 -10.1 13.80 perf-profile.children.cycles-pp.do_vmi_munmap
24.00 -10.1 13.87 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
23.99 -10.1 13.87 perf-profile.children.cycles-pp.do_syscall_64
21.66 -9.2 12.48 perf-profile.children.cycles-pp.unmap_vmas
21.66 -9.2 12.48 perf-profile.children.cycles-pp.unmap_page_range
21.66 -9.2 12.48 perf-profile.children.cycles-pp.zap_pmd_range
21.66 -9.2 12.48 perf-profile.children.cycles-pp.zap_pte_range
11.00 -8.8 2.24 ± 2% perf-profile.children.cycles-pp.free_pcppages_bulk
11.59 -8.7 2.88 ± 2% perf-profile.children.cycles-pp.free_unref_page_list
6.30 ± 2% -3.7 2.58 ± 4% perf-profile.children.cycles-pp.rmqueue_bulk
6.70 -3.5 3.19 ± 4% perf-profile.children.cycles-pp.rmqueue
6.94 -3.5 3.44 ± 3% perf-profile.children.cycles-pp.get_page_from_freelist
7.37 -3.4 3.92 ± 3% perf-profile.children.cycles-pp.__folio_alloc
7.36 -3.4 3.92 ± 3% perf-profile.children.cycles-pp.__alloc_pages
7.85 -3.4 4.45 ± 2% perf-profile.children.cycles-pp.vma_alloc_folio
13.43 -2.0 11.44 perf-profile.children.cycles-pp.copy_page
25.59 -1.2 24.37 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
25.50 -1.2 24.28 ± 2% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
2.26 -1.0 1.30 ± 2% perf-profile.children.cycles-pp.tlb_finish_mmu
4.26 -0.8 3.50 perf-profile.children.cycles-pp._raw_spin_lock
4.35 -0.7 3.61 perf-profile.children.cycles-pp.__pte_offset_map_lock
1.92 -0.5 1.43 ± 3% perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
0.09 +0.0 0.10 ± 3% perf-profile.children.cycles-pp.put_page
0.08 ± 5% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.pte_offset_map_nolock
0.10 ± 6% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.scheduler_tick
0.09 ± 4% +0.0 0.10 ± 3% perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.12 ± 3% +0.0 0.14 ± 3% perf-profile.children.cycles-pp._raw_spin_trylock
0.11 ± 4% +0.0 0.13 ± 6% perf-profile.children.cycles-pp.down_read_trylock
0.16 ± 2% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.free_unref_page_prepare
0.14 ± 4% +0.0 0.16 ± 3% perf-profile.children.cycles-pp.tick_sched_timer
0.13 ± 5% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.cgroup_rstat_updated
0.13 ± 5% +0.0 0.15 ± 5% perf-profile.children.cycles-pp.update_process_times
0.13 ± 3% +0.0 0.15 ± 3% perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
0.13 ± 5% +0.0 0.15 ± 4% perf-profile.children.cycles-pp.tick_sched_handle
0.18 ± 3% +0.0 0.21 ± 4% perf-profile.children.cycles-pp.blk_cgroup_congested
0.36 +0.0 0.39 ± 2% perf-profile.children.cycles-pp.mas_walk
0.14 ± 5% +0.0 0.18 ± 4% perf-profile.children.cycles-pp.handle_pte_fault
0.22 ± 2% +0.0 0.25 ± 4% perf-profile.children.cycles-pp.__folio_throttle_swaprate
1.06 +0.0 1.11 perf-profile.children.cycles-pp.shmem_get_folio_gfp
0.37 ± 2% +0.0 0.41 perf-profile.children.cycles-pp.__mod_node_page_state
0.46 +0.0 0.51 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_state
0.67 +0.1 0.74 ± 2% perf-profile.children.cycles-pp.lock_vma_under_rcu
0.13 ± 2% +0.1 0.20 ± 2% perf-profile.children.cycles-pp.free_pages_and_swap_cache
0.12 ± 4% +0.1 0.20 ± 3% perf-profile.children.cycles-pp.free_swap_cache
0.48 ± 4% +0.1 0.55 ± 3% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.25 +0.1 1.34 perf-profile.children.cycles-pp.shmem_fault
0.49 +0.1 0.58 perf-profile.children.cycles-pp.page_remove_rmap
1.35 +0.1 1.44 perf-profile.children.cycles-pp.__do_fault
0.73 ± 3% +0.1 0.82 perf-profile.children.cycles-pp.___perf_sw_event
0.93 ± 2% +0.1 1.04 perf-profile.children.cycles-pp.__perf_sw_event
1.10 +0.1 1.22 perf-profile.children.cycles-pp.sync_regs
0.22 ± 5% +0.1 0.35 ± 5% perf-profile.children.cycles-pp.__list_add_valid_or_report
0.75 ± 6% +0.1 0.88 ± 2% perf-profile.children.cycles-pp.folio_add_new_anon_rmap
0.85 ± 4% +0.2 1.00 ± 2% perf-profile.children.cycles-pp.__mod_lruvec_page_state
1.44 +0.2 1.61 perf-profile.children.cycles-pp.native_irq_return_iret
0.84 +0.2 1.02 ± 2% perf-profile.children.cycles-pp.lru_add_fn
2.74 +0.3 3.04 perf-profile.children.cycles-pp.error_entry
2.56 ± 2% +0.3 2.87 perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
2.77 +0.3 3.10 perf-profile.children.cycles-pp.__irqentry_text_end
0.79 +0.4 1.17 perf-profile.children.cycles-pp._compound_head
0.81 ± 2% +0.4 1.21 perf-profile.children.cycles-pp.__free_one_page
0.00 +2.9 2.92 ± 4% perf-profile.children.cycles-pp.__rmqueue_pcplist
32.30 +7.6 39.94 perf-profile.children.cycles-pp.do_cow_fault
32.34 +7.6 39.99 perf-profile.children.cycles-pp.do_fault
32.93 +7.7 40.67 perf-profile.children.cycles-pp.__handle_mm_fault
33.81 +7.8 41.61 perf-profile.children.cycles-pp.handle_mm_fault
35.36 +8.0 43.32 perf-profile.children.cycles-pp.do_user_addr_fault
35.49 +8.0 43.46 perf-profile.children.cycles-pp.exc_page_fault
42.09 +8.8 50.86 perf-profile.children.cycles-pp.asm_exc_page_fault
49.44 +9.6 59.04 perf-profile.children.cycles-pp.testcase
11.03 ± 2% +10.8 21.81 ± 2% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
8.29 ± 2% +12.8 21.12 ± 2% perf-profile.children.cycles-pp.finish_fault
2.78 ± 5% +13.4 16.21 ± 3% perf-profile.children.cycles-pp.folio_batch_move_lru
2.90 ± 5% +13.4 16.33 ± 3% perf-profile.children.cycles-pp.folio_add_lru_vma
3.86 ± 4% +13.6 17.43 ± 3% perf-profile.children.cycles-pp.set_pte_range
13.36 -2.0 11.38 perf-profile.self.cycles-pp.copy_page
25.49 -1.2 24.28 ± 2% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
4.24 -0.8 3.48 perf-profile.self.cycles-pp._raw_spin_lock
1.92 -0.5 1.42 ± 3% perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
0.22 ± 4% -0.1 0.13 ± 3% perf-profile.self.cycles-pp.rmqueue
0.10 ± 7% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.rmqueue_bulk
0.11 +0.0 0.12 ± 3% perf-profile.self.cycles-pp.uncharge_folio
0.09 +0.0 0.10 ± 3% perf-profile.self.cycles-pp.put_page
0.09 ± 4% +0.0 0.10 ± 4% perf-profile.self.cycles-pp.exc_page_fault
0.07 ± 5% +0.0 0.08 ± 4% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.13 +0.0 0.14 ± 3% perf-profile.self.cycles-pp.asm_exc_page_fault
0.12 ± 4% +0.0 0.14 ± 3% perf-profile.self.cycles-pp._raw_spin_trylock
0.08 ± 5% +0.0 0.10 ± 3% perf-profile.self.cycles-pp.get_pfnblock_flags_mask
0.11 ± 4% +0.0 0.13 ± 6% perf-profile.self.cycles-pp.down_read_trylock
0.22 ± 2% +0.0 0.24 ± 2% perf-profile.self.cycles-pp.get_page_from_freelist
0.14 ± 3% +0.0 0.15 ± 3% perf-profile.self.cycles-pp.folio_add_new_anon_rmap
0.14 ± 2% +0.0 0.16 ± 3% perf-profile.self.cycles-pp.blk_cgroup_congested
0.11 ± 6% +0.0 0.13 ± 2% perf-profile.self.cycles-pp.cgroup_rstat_updated
0.06 ± 11% +0.0 0.08 ± 5% perf-profile.self.cycles-pp.handle_pte_fault
0.12 ± 4% +0.0 0.14 ± 4% perf-profile.self.cycles-pp.mas_walk
0.18 ± 2% +0.0 0.20 ± 2% perf-profile.self.cycles-pp.free_unref_page_list
0.28 ± 3% +0.0 0.30 ± 2% perf-profile.self.cycles-pp.vma_alloc_folio
0.27 ± 2% +0.0 0.30 perf-profile.self.cycles-pp.page_remove_rmap
0.20 +0.0 0.24 ± 3% perf-profile.self.cycles-pp.shmem_fault
0.24 ± 3% +0.0 0.27 perf-profile.self.cycles-pp.shmem_get_folio_gfp
0.34 +0.0 0.38 ± 2% perf-profile.self.cycles-pp.__alloc_pages
0.35 ± 3% +0.0 0.40 perf-profile.self.cycles-pp.__mod_node_page_state
0.44 ± 2% +0.1 0.49 perf-profile.self.cycles-pp.__handle_mm_fault
0.38 +0.1 0.44 ± 3% perf-profile.self.cycles-pp.lru_add_fn
0.40 ± 4% +0.1 0.46 ± 4% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.00 +0.1 0.07 ± 7% perf-profile.self.cycles-pp.__rmqueue_pcplist
0.12 ± 3% +0.1 0.19 ± 3% perf-profile.self.cycles-pp.free_swap_cache
0.65 ± 2% +0.1 0.72 perf-profile.self.cycles-pp.___perf_sw_event
0.29 ± 12% +0.1 0.37 ± 6% perf-profile.self.cycles-pp.__mod_lruvec_page_state
0.29 +0.1 0.39 ± 2% perf-profile.self.cycles-pp.zap_pte_range
1.10 +0.1 1.22 perf-profile.self.cycles-pp.sync_regs
0.21 ± 4% +0.1 0.33 ± 6% perf-profile.self.cycles-pp.__list_add_valid_or_report
1.44 +0.2 1.60 perf-profile.self.cycles-pp.native_irq_return_iret
0.38 ± 7% +0.2 0.55 ± 2% perf-profile.self.cycles-pp.folio_batch_move_lru
2.73 +0.3 3.03 perf-profile.self.cycles-pp.error_entry
2.47 +0.3 2.78 perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
2.77 +0.3 3.10 perf-profile.self.cycles-pp.__irqentry_text_end
0.78 +0.4 1.15 perf-profile.self.cycles-pp._compound_head
3.17 +0.4 3.54 perf-profile.self.cycles-pp.testcase
0.75 ± 2% +0.4 1.15 perf-profile.self.cycles-pp.__free_one_page
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki