Message-ID: <202311201629.b861c327-oliver.sang@intel.com>
Date:   Mon, 20 Nov 2023 21:16:24 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Huang Ying <ying.huang@...el.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Vlastimil Babka <vbabka@...e.cz>,
        "David Hildenbrand" <david@...hat.com>,
        Johannes Weiner <jweiner@...hat.com>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        "Pavel Tatashin" <pasha.tatashin@...een.com>,
        Matthew Wilcox <willy@...radead.org>,
        Christoph Lameter <cl@...ux.com>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Sudeep Holla <sudeep.holla@....com>, <linux-mm@...ck.org>,
        <ying.huang@...el.com>, <feng.tang@...el.com>,
        <fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [mm, page_alloc]  c0a242394c:
 will-it-scale.per_process_ops 12.6% improvement



Hello,

kernel test robot noticed a 12.6% improvement of will-it-scale.per_process_ops on:


commit: c0a242394cb980bd00e1f61dc8aacb453d2bbe6a ("mm, page_alloc: scale the number of pages that are batch allocated")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
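For context, the batch auto-scaling idea behind the commit can be sketched as follows. This is a minimal model, not the kernel source (the real logic lives in mm/page_alloc.c); the class, method, and constant values here are illustrative. The gist: each order-0 refill of the per-CPU page (PCP) list that could have used a larger batch doubles the next batch, up to a cap, so a process that faults in many pages amortizes the zone-lock cost over progressively larger refills:

```python
PCP_BATCH_SCALE_MAX = 5  # illustrative cap on the batch left-shift


class PcpList:
    """Toy model of a per-CPU page list's allocation batch scaling."""

    def __init__(self, base_batch=63, high=512):
        self.base_batch = base_batch  # baseline refill size
        self.high = high              # high watermark for the PCP list
        self.count = 0                # pages currently on the list
        self.alloc_factor = 0         # exponent scaled up on repeated refills

    def refill_batch(self):
        """Pages to pull from the zone free list in one refill."""
        batch = self.base_batch << self.alloc_factor
        # Never refill past the headroom below the high watermark.
        max_nr_alloc = max(self.high - self.count - self.base_batch,
                           self.base_batch)
        # Scale up for the *next* refill while the current batch still fits.
        if self.alloc_factor < PCP_BATCH_SCALE_MAX and batch <= max_nr_alloc:
            self.alloc_factor += 1
        return min(batch, max_nr_alloc)


pcp = PcpList()
print([pcp.refill_batch() for _ in range(4)])  # [63, 126, 252, 449]
```

Larger refill batches mean fewer zone->lock acquisitions per faulted page, which is consistent with the reduced free_pcppages_bulk/rmqueue_bulk spin-lock time in the profile below.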

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 50%
	mode: process
	test: page_fault2
	cpufreq_governor: performance
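A quick sanity check on how these parameters map to the workload size (assuming nr_task is interpreted as a percentage of online CPUs, per lkp-tests convention):

```python
# 104-thread, 2-socket Skylake test machine; "nr_task: 50%" of the CPUs.
nr_cpus = 104
nr_task_pct = 0.50
processes = int(nr_cpus * nr_task_pct)
print(processes)  # 52 -- matches the will-it-scale.52.processes metric below
```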






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231120/202311201629.b861c327-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault2/will-it-scale

commit: 
  52166607ec ("mm: restrict the pcp batch scale factor to avoid too long latency")
  c0a242394c ("mm, page_alloc: scale the number of pages that are batch allocated")

52166607ecc98039 c0a242394cb980bd00e1f61dc8a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      4.90            +0.6        5.49        mpstat.cpu.all.usr%
      1367 ±  6%     +72.8%       2362 ±  4%  perf-c2c.HITM.local
   8592059           +12.6%    9677986        will-it-scale.52.processes
    165231           +12.6%     186114        will-it-scale.per_process_ops
   8592059           +12.6%    9677986        will-it-scale.workload
      2592 ± 19%    +587.0%      17809 ± 97%  numa-meminfo.node0.Active(anon)
   3494860 ±  2%     -22.6%    2703947        numa-meminfo.node0.AnonPages.max
   3538966 ±  2%     -24.9%    2657708 ±  7%  numa-meminfo.node1.AnonPages.max
      9310 ±  3%      +7.6%      10019 ±  5%  numa-meminfo.node1.KernelStack
 1.295e+09           +12.8%   1.46e+09        numa-numastat.node0.local_node
 1.294e+09           +12.8%   1.46e+09        numa-numastat.node0.numa_hit
  1.31e+09           +12.0%  1.467e+09        numa-numastat.node1.local_node
 1.309e+09           +12.0%  1.466e+09        numa-numastat.node1.numa_hit
    213394 ± 50%    +373.5%    1010435 ± 33%  sched_debug.cfs_rq:/.avg_vruntime.min
   1932637 ±  4%     -32.0%    1313931 ±  8%  sched_debug.cfs_rq:/.avg_vruntime.stddev
    213394 ± 50%    +373.5%    1010435 ± 33%  sched_debug.cfs_rq:/.min_vruntime.min
   1932637 ±  4%     -32.0%    1313931 ±  8%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.08           +12.5%       0.09        turbostat.IPC
     63.77           -45.2       18.60 ± 22%  turbostat.PKG_%
    353.10            +2.9%     363.42        turbostat.PkgWatt
     68.28           +11.4%      76.03        turbostat.RAMWatt
    833540            +5.6%     880188        proc-vmstat.nr_anon_pages
 2.603e+09           +12.4%  2.925e+09        proc-vmstat.numa_hit
 2.605e+09           +12.4%  2.927e+09        proc-vmstat.numa_local
 2.599e+09           +12.4%   2.92e+09        proc-vmstat.pgalloc_normal
 2.591e+09           +12.4%  2.911e+09        proc-vmstat.pgfault
 2.599e+09           +12.4%   2.92e+09        proc-vmstat.pgfree
    648.18 ± 19%    +586.7%       4450 ± 97%  numa-vmstat.node0.nr_active_anon
    648.18 ± 19%    +586.7%       4450 ± 97%  numa-vmstat.node0.nr_zone_active_anon
 1.294e+09           +12.8%   1.46e+09        numa-vmstat.node0.numa_hit
 1.295e+09           +12.8%   1.46e+09        numa-vmstat.node0.numa_local
      9310 ±  3%      +7.6%      10021 ±  5%  numa-vmstat.node1.nr_kernel_stack
 1.309e+09           +12.0%  1.466e+09        numa-vmstat.node1.numa_hit
  1.31e+09           +12.0%  1.467e+09        numa-vmstat.node1.numa_local
      0.01 ± 80%     -93.5%       0.00 ±223%  perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
      0.01 ±  9%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.04 ±  9%     -46.3%       0.02 ± 73%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.03 ±107%     -97.0%       0.00 ±223%  perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
      0.02 ± 27%    -100.0%       0.00        perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.03 ±  7%     -15.0%       0.02 ± 10%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.94 ± 16%     -51.9%       0.45 ± 22%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     98.83 ± 11%     -40.5%      58.83 ± 11%  perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
    232.00 ± 10%     +48.4%     344.33 ±  4%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
     39.50 ± 54%     -87.3%       5.03        perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2.99 ± 15%    -100.0%       0.00        perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      4.81 ±  7%    -100.0%       0.00        perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     33.32 ± 69%     -85.0%       5.01        perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     16.82            +1.6%      17.09        perf-stat.i.MPKI
   8.6e+09           +11.3%  9.573e+09        perf-stat.i.branch-instructions
  39148476            +5.6%   41324228        perf-stat.i.branch-misses
     81.02            -3.1       77.94        perf-stat.i.cache-miss-rate%
 7.134e+08           +13.5%  8.096e+08        perf-stat.i.cache-misses
 8.802e+08           +17.7%  1.036e+09        perf-stat.i.cache-references
      1813            +1.2%       1834        perf-stat.i.context-switches
      3.43            -9.9%       3.09        perf-stat.i.cpi
    204.33           -10.7%     182.42 ±  2%  perf-stat.i.cycles-between-cache-misses
  10135544           +11.8%   11330409 ±  2%  perf-stat.i.dTLB-load-misses
  1.06e+10           +11.5%  1.182e+10        perf-stat.i.dTLB-loads
  70683663           +12.6%   79603765        perf-stat.i.dTLB-store-misses
 6.001e+09           +12.8%  6.766e+09        perf-stat.i.dTLB-stores
   9753929           +12.9%   11015762        perf-stat.i.iTLB-load-misses
  4.24e+10           +11.5%  4.728e+10        perf-stat.i.instructions
      4377            -1.5%       4312        perf-stat.i.instructions-per-iTLB-miss
      0.29           +11.5%       0.33        perf-stat.i.ipc
      0.34 ± 23%     -48.0%       0.18 ± 11%  perf-stat.i.major-faults
      1343           +17.4%       1577        perf-stat.i.metric.K/sec
    253.10           +11.9%     283.16        perf-stat.i.metric.M/sec
   8585112           +12.0%    9619126        perf-stat.i.minor-faults
      0.32 ± 27%      +0.3        0.60 ± 53%  perf-stat.i.node-load-miss-rate%
    694018           +17.3%     813810 ±  3%  perf-stat.i.node-load-misses
 2.451e+08            +3.6%  2.539e+08 ±  2%  perf-stat.i.node-loads
    538019           +14.0%     613240        perf-stat.i.node-store-misses
  49463410           +25.2%   61905404        perf-stat.i.node-stores
   8585112           +12.0%    9619126        perf-stat.i.page-faults
     16.83            +1.7%      17.12        perf-stat.overall.MPKI
      0.46            -0.0        0.43        perf-stat.overall.branch-miss-rate%
     81.06            -2.9       78.18        perf-stat.overall.cache-miss-rate%
      3.42           -10.5%       3.07        perf-stat.overall.cpi
    203.46           -12.0%     179.07        perf-stat.overall.cycles-between-cache-misses
      4347            -1.3%       4291        perf-stat.overall.instructions-per-iTLB-miss
      0.29           +11.7%       0.33        perf-stat.overall.ipc
      0.28            +0.0        0.32        perf-stat.overall.node-load-miss-rate%
      1.08            -0.1        0.98 ±  2%  perf-stat.overall.node-store-miss-rate%
 8.572e+09           +11.3%  9.542e+09        perf-stat.ps.branch-instructions
  39013363            +5.6%   41189792        perf-stat.ps.branch-misses
 7.111e+08           +13.5%   8.07e+08        perf-stat.ps.cache-misses
 8.773e+08           +17.7%  1.032e+09        perf-stat.ps.cache-references
      1805            +1.2%       1826        perf-stat.ps.context-switches
  10101169           +11.8%   11293042 ±  2%  perf-stat.ps.dTLB-load-misses
 1.056e+10           +11.6%  1.179e+10        perf-stat.ps.dTLB-loads
  70446051           +12.6%   79343784        perf-stat.ps.dTLB-store-misses
 5.981e+09           +12.8%  6.744e+09        perf-stat.ps.dTLB-stores
   9719620           +13.0%   10983217        perf-stat.ps.iTLB-load-misses
 4.225e+10           +11.5%  4.713e+10        perf-stat.ps.instructions
      0.34 ± 22%     -48.1%       0.18 ± 11%  perf-stat.ps.major-faults
   8556237           +12.1%    9587784        perf-stat.ps.minor-faults
    691779           +17.3%     811254 ±  3%  perf-stat.ps.node-load-misses
 2.442e+08            +3.6%  2.531e+08 ±  2%  perf-stat.ps.node-loads
    536237           +14.0%     611234        perf-stat.ps.node-store-misses
  49302195           +25.2%   61706509        perf-stat.ps.node-stores
   8556237           +12.1%    9587784        perf-stat.ps.page-faults
 1.277e+13           +11.9%   1.43e+13        perf-stat.total.instructions
     23.92           -10.1       13.79        perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     19.93            -9.8       10.12        perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
     20.07            -9.7       10.33        perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     10.10            -9.3        0.84 ±  6%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages
     10.10            -9.3        0.84 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
     21.64            -9.2       12.46        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      9.06            -7.1        1.99 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
      9.58            -7.0        2.54 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      6.28 ±  2%      -6.3        0.00        perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
      6.67            -3.5        3.16 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
      6.90            -3.5        3.40 ±  3%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
      7.28            -3.5        3.83 ±  3%  perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
      7.34            -3.4        3.90 ±  3%  perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
      7.81            -3.4        4.41 ±  3%  perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      9.46            -2.9        6.54 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range
      9.46            -2.9        6.54 ±  2%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      9.44            -2.1        7.34 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush
     13.41            -2.0       11.42        perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      2.25            -1.0        1.28 ±  3%  perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
      2.26            -1.0        1.30 ±  2%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
      2.26            -1.0        1.30 ±  2%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      4.23            -0.8        3.47        perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
      4.35            -0.7        3.60        perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.05            +0.0        1.09        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
      0.67            +0.1        0.74 ±  2%  perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      1.25            +0.1        1.33        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.34            +0.1        1.44        perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.61 ±  7%      +0.1        0.72 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault
      1.06            +0.1        1.18        perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
      0.75 ±  6%      +0.1        0.88 ±  2%  perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
      0.82            +0.2        1.00 ±  2%  perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      2.72            +0.3        3.02        perf-profile.calltrace.cycles-pp.error_entry.testcase
      2.52 ±  2%      +0.3        2.83        perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
      2.77            +0.3        3.10        perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
      0.72 ±  2%      +0.3        1.06        perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
      0.75            +0.4        1.12 ±  2%  perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.17 ±141%      +0.4        0.58        perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.00            +0.5        0.54        perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu
      0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region
      0.00            +0.8        0.85 ±  3%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.00            +1.6        1.61 ±  5%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
      0.00            +1.6        1.62 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.00            +2.6        2.56 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
      0.00            +2.9        2.90 ±  4%  perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
     32.25            +7.6       39.90        perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     32.34            +7.7       39.99        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     32.92            +7.7       40.65        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     33.76            +7.8       41.56        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     35.34            +8.0       43.29        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     35.47            +8.0       43.44        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
     44.87            +9.1       53.97        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
     48.05            +9.4       57.46        perf-profile.calltrace.cycles-pp.testcase
      8.27 ±  2%     +12.8       21.10 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.35 ±  8%     +13.0       14.35 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
      1.43 ±  7%     +13.0       14.43 ±  3%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      1.42 ±  7%     +13.0       14.42 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
      2.77 ±  5%     +13.4       16.18 ±  3%  perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
      2.90 ±  5%     +13.4       16.32 ±  3%  perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
      3.85 ±  4%     +13.6       17.42 ±  3%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
     22.33           -10.8       11.57        perf-profile.children.cycles-pp.release_pages
     22.33           -10.7       11.63        perf-profile.children.cycles-pp.tlb_batch_pages_flush
     23.93           -10.1       13.80        perf-profile.children.cycles-pp.do_vmi_align_munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__vm_munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__x64_sys_munmap
     23.92           -10.1       13.79        perf-profile.children.cycles-pp.unmap_region
     23.93           -10.1       13.80        perf-profile.children.cycles-pp.do_vmi_munmap
     24.00           -10.1       13.87        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     23.99           -10.1       13.87        perf-profile.children.cycles-pp.do_syscall_64
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.unmap_vmas
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.unmap_page_range
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.zap_pmd_range
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.zap_pte_range
     11.00            -8.8        2.24 ±  2%  perf-profile.children.cycles-pp.free_pcppages_bulk
     11.59            -8.7        2.88 ±  2%  perf-profile.children.cycles-pp.free_unref_page_list
      6.30 ±  2%      -3.7        2.58 ±  4%  perf-profile.children.cycles-pp.rmqueue_bulk
      6.70            -3.5        3.19 ±  4%  perf-profile.children.cycles-pp.rmqueue
      6.94            -3.5        3.44 ±  3%  perf-profile.children.cycles-pp.get_page_from_freelist
      7.37            -3.4        3.92 ±  3%  perf-profile.children.cycles-pp.__folio_alloc
      7.36            -3.4        3.92 ±  3%  perf-profile.children.cycles-pp.__alloc_pages
      7.85            -3.4        4.45 ±  2%  perf-profile.children.cycles-pp.vma_alloc_folio
     13.43            -2.0       11.44        perf-profile.children.cycles-pp.copy_page
     25.59            -1.2       24.37 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     25.50            -1.2       24.28 ±  2%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      2.26            -1.0        1.30 ±  2%  perf-profile.children.cycles-pp.tlb_finish_mmu
      4.26            -0.8        3.50        perf-profile.children.cycles-pp._raw_spin_lock
      4.35            -0.7        3.61        perf-profile.children.cycles-pp.__pte_offset_map_lock
      1.92            -0.5        1.43 ±  3%  perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.put_page
      0.08 ±  5%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.pte_offset_map_nolock
      0.10 ±  6%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
      0.09 ±  4%      +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.get_pfnblock_flags_mask
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp._raw_spin_trylock
      0.11 ±  4%      +0.0        0.13 ±  6%  perf-profile.children.cycles-pp.down_read_trylock
      0.16 ±  2%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.free_unref_page_prepare
      0.14 ±  4%      +0.0        0.16 ±  3%  perf-profile.children.cycles-pp.tick_sched_timer
      0.13 ±  5%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.cgroup_rstat_updated
      0.13 ±  5%      +0.0        0.15 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.13 ±  3%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
      0.13 ±  5%      +0.0        0.15 ±  4%  perf-profile.children.cycles-pp.tick_sched_handle
      0.18 ±  3%      +0.0        0.21 ±  4%  perf-profile.children.cycles-pp.blk_cgroup_congested
      0.36            +0.0        0.39 ±  2%  perf-profile.children.cycles-pp.mas_walk
      0.14 ±  5%      +0.0        0.18 ±  4%  perf-profile.children.cycles-pp.handle_pte_fault
      0.22 ±  2%      +0.0        0.25 ±  4%  perf-profile.children.cycles-pp.__folio_throttle_swaprate
      1.06            +0.0        1.11        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      0.37 ±  2%      +0.0        0.41        perf-profile.children.cycles-pp.__mod_node_page_state
      0.46            +0.0        0.51 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_state
      0.67            +0.1        0.74 ±  2%  perf-profile.children.cycles-pp.lock_vma_under_rcu
      0.13 ±  2%      +0.1        0.20 ±  2%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
      0.12 ±  4%      +0.1        0.20 ±  3%  perf-profile.children.cycles-pp.free_swap_cache
      0.48 ±  4%      +0.1        0.55 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      1.25            +0.1        1.34        perf-profile.children.cycles-pp.shmem_fault
      0.49            +0.1        0.58        perf-profile.children.cycles-pp.page_remove_rmap
      1.35            +0.1        1.44        perf-profile.children.cycles-pp.__do_fault
      0.73 ±  3%      +0.1        0.82        perf-profile.children.cycles-pp.___perf_sw_event
      0.93 ±  2%      +0.1        1.04        perf-profile.children.cycles-pp.__perf_sw_event
      1.10            +0.1        1.22        perf-profile.children.cycles-pp.sync_regs
      0.22 ±  5%      +0.1        0.35 ±  5%  perf-profile.children.cycles-pp.__list_add_valid_or_report
      0.75 ±  6%      +0.1        0.88 ±  2%  perf-profile.children.cycles-pp.folio_add_new_anon_rmap
      0.85 ±  4%      +0.2        1.00 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      1.44            +0.2        1.61        perf-profile.children.cycles-pp.native_irq_return_iret
      0.84            +0.2        1.02 ±  2%  perf-profile.children.cycles-pp.lru_add_fn
      2.74            +0.3        3.04        perf-profile.children.cycles-pp.error_entry
      2.56 ±  2%      +0.3        2.87        perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      2.77            +0.3        3.10        perf-profile.children.cycles-pp.__irqentry_text_end
      0.79            +0.4        1.17        perf-profile.children.cycles-pp._compound_head
      0.81 ±  2%      +0.4        1.21        perf-profile.children.cycles-pp.__free_one_page
      0.00            +2.9        2.92 ±  4%  perf-profile.children.cycles-pp.__rmqueue_pcplist
     32.30            +7.6       39.94        perf-profile.children.cycles-pp.do_cow_fault
     32.34            +7.6       39.99        perf-profile.children.cycles-pp.do_fault
     32.93            +7.7       40.67        perf-profile.children.cycles-pp.__handle_mm_fault
     33.81            +7.8       41.61        perf-profile.children.cycles-pp.handle_mm_fault
     35.36            +8.0       43.32        perf-profile.children.cycles-pp.do_user_addr_fault
     35.49            +8.0       43.46        perf-profile.children.cycles-pp.exc_page_fault
     42.09            +8.8       50.86        perf-profile.children.cycles-pp.asm_exc_page_fault
     49.44            +9.6       59.04        perf-profile.children.cycles-pp.testcase
     11.03 ±  2%     +10.8       21.81 ±  2%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
      8.29 ±  2%     +12.8       21.12 ±  2%  perf-profile.children.cycles-pp.finish_fault
      2.78 ±  5%     +13.4       16.21 ±  3%  perf-profile.children.cycles-pp.folio_batch_move_lru
      2.90 ±  5%     +13.4       16.33 ±  3%  perf-profile.children.cycles-pp.folio_add_lru_vma
      3.86 ±  4%     +13.6       17.43 ±  3%  perf-profile.children.cycles-pp.set_pte_range
     13.36            -2.0       11.38        perf-profile.self.cycles-pp.copy_page
     25.49            -1.2       24.28 ±  2%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      4.24            -0.8        3.48        perf-profile.self.cycles-pp._raw_spin_lock
      1.92            -0.5        1.42 ±  3%  perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
      0.22 ±  4%      -0.1        0.13 ±  3%  perf-profile.self.cycles-pp.rmqueue
      0.10 ±  7%      -0.0        0.08 ±  4%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.11            +0.0        0.12 ±  3%  perf-profile.self.cycles-pp.uncharge_folio
      0.09            +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.put_page
      0.09 ±  4%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.exc_page_fault
      0.07 ±  5%      +0.0        0.08 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.13            +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.asm_exc_page_fault
      0.12 ±  4%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp._raw_spin_trylock
      0.08 ±  5%      +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.get_pfnblock_flags_mask
      0.11 ±  4%      +0.0        0.13 ±  6%  perf-profile.self.cycles-pp.down_read_trylock
      0.22 ±  2%      +0.0        0.24 ±  2%  perf-profile.self.cycles-pp.get_page_from_freelist
      0.14 ±  3%      +0.0        0.15 ±  3%  perf-profile.self.cycles-pp.folio_add_new_anon_rmap
      0.14 ±  2%      +0.0        0.16 ±  3%  perf-profile.self.cycles-pp.blk_cgroup_congested
      0.11 ±  6%      +0.0        0.13 ±  2%  perf-profile.self.cycles-pp.cgroup_rstat_updated
      0.06 ± 11%      +0.0        0.08 ±  5%  perf-profile.self.cycles-pp.handle_pte_fault
      0.12 ±  4%      +0.0        0.14 ±  4%  perf-profile.self.cycles-pp.mas_walk
      0.18 ±  2%      +0.0        0.20 ±  2%  perf-profile.self.cycles-pp.free_unref_page_list
      0.28 ±  3%      +0.0        0.30 ±  2%  perf-profile.self.cycles-pp.vma_alloc_folio
      0.27 ±  2%      +0.0        0.30        perf-profile.self.cycles-pp.page_remove_rmap
      0.20            +0.0        0.24 ±  3%  perf-profile.self.cycles-pp.shmem_fault
      0.24 ±  3%      +0.0        0.27        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.34            +0.0        0.38 ±  2%  perf-profile.self.cycles-pp.__alloc_pages
      0.35 ±  3%      +0.0        0.40        perf-profile.self.cycles-pp.__mod_node_page_state
      0.44 ±  2%      +0.1        0.49        perf-profile.self.cycles-pp.__handle_mm_fault
      0.38            +0.1        0.44 ±  3%  perf-profile.self.cycles-pp.lru_add_fn
      0.40 ±  4%      +0.1        0.46 ±  4%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.__rmqueue_pcplist
      0.12 ±  3%      +0.1        0.19 ±  3%  perf-profile.self.cycles-pp.free_swap_cache
      0.65 ±  2%      +0.1        0.72        perf-profile.self.cycles-pp.___perf_sw_event
      0.29 ± 12%      +0.1        0.37 ±  6%  perf-profile.self.cycles-pp.__mod_lruvec_page_state
      0.29            +0.1        0.39 ±  2%  perf-profile.self.cycles-pp.zap_pte_range
      1.10            +0.1        1.22        perf-profile.self.cycles-pp.sync_regs
      0.21 ±  4%      +0.1        0.33 ±  6%  perf-profile.self.cycles-pp.__list_add_valid_or_report
      1.44            +0.2        1.60        perf-profile.self.cycles-pp.native_irq_return_iret
      0.38 ±  7%      +0.2        0.55 ±  2%  perf-profile.self.cycles-pp.folio_batch_move_lru
      2.73            +0.3        3.03        perf-profile.self.cycles-pp.error_entry
      2.47            +0.3        2.78        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      2.77            +0.3        3.10        perf-profile.self.cycles-pp.__irqentry_text_end
      0.78            +0.4        1.15        perf-profile.self.cycles-pp._compound_head
      3.17            +0.4        3.54        perf-profile.self.cycles-pp.testcase
      0.75 ±  2%      +0.4        1.15        perf-profile.self.cycles-pp.__free_one_page




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
