Message-ID: <20201024120659.GI31092@shao2-debian>
Date:   Sat, 24 Oct 2020 20:06:59 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     David Rientjes <rientjes@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Jeremy Cline <jcline@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Michal Hocko <mhocko@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [mm, shmem] dcdf11ee14: will-it-scale.per_process_ops -17.9%
 regression

Greetings,

FYI, we noticed a -17.9% regression in will-it-scale.per_process_ops due to commit:


commit: dcdf11ee144133328664d90836e712d840d047d9 ("mm, shmem: add vmstat for hugepage fallback")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: page_fault3
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test in order to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
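
As a rough cross-check of the figures reported further down, will-it-scale.workload divided by will-it-scale.per_process_ops should come out close to the number of parallel copies. The sketch below assumes that nr_task=100% means one process per hardware thread (192 on this machine) and that workload is the operation count summed over all processes. Both are assumptions read off the numbers in this report, not taken from the lkp-tests source, and the names NR_TASKS, samples and implied_tasks are made up for illustration.

```python
# Values copied from the comparison table in this report.
NR_TASKS = 192  # assumption: nr_task=100% -> one process per hardware thread

samples = {
    "6aeff241fe": {"per_process_ops": 491236, "workload": 94317579},
    "dcdf11ee14": {"per_process_ops": 403513, "workload": 77474716},
}


def implied_tasks(per_process_ops, workload):
    """Return workload / per_process_ops, the task count the two figures imply."""
    return workload / per_process_ops


ratios = {commit: implied_tasks(**vals) for commit, vals in samples.items()}

for commit, ratio in ratios.items():
    print(f"{commit}: implied tasks = {ratio:.2f} (expected about {NR_TASKS})")
```

Both commits give a ratio within rounding of 192, which is consistent with the two metrics describing the same run at different granularity.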

In addition, the commit also has a significant impact on the following tests:

+------------------+--------------------------------------------------+
| testcase: change | vm-scalability: boot-time.dhcp -1.0% improvement |
| test machine     | 104 threads Skylake with 192G memory             |
| test parameters  | cpufreq_governor=performance                     |
|                  | runtime=300s                                     |
|                  | size=1T                                          |
|                  | test=lru-shm                                     |
|                  | ucode=0x2006906                                  |
+------------------+--------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap4/page_fault3/will-it-scale/0x5002f01

commit: 
  6aeff241fe ("mm/migrate.c: migrate PG_readahead flag")
  dcdf11ee14 ("mm, shmem: add vmstat for hugepage fallback")

6aeff241fe6c4561 dcdf11ee144133328664d90836e 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
         74:16         -79%          61:16    perf-profile.calltrace.cycles-pp.error_entry.testcase
         65:16         -71%          54:16    perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
         78:16         -84%          64:16    perf-profile.children.cycles-pp.error_entry
          1:16          -2%           0:16    perf-profile.children.cycles-pp.error_exit
          8:16         -10%           7:16    perf-profile.self.cycles-pp.error_entry
          0:16          -2%           0:16    perf-profile.self.cycles-pp.error_exit
         %stddev     %change         %stddev
             \          |                \  
    491236           -17.9%     403513 ±  2%  will-it-scale.per_process_ops
  94317579           -17.9%   77474716 ±  2%  will-it-scale.workload
      1.22            -0.1        1.08 ±  2%  mpstat.cpu.all.irq%
     11.72            +4.4       16.12 ±  3%  mpstat.cpu.all.usr%
     86.94            -5.2%      82.38        vmstat.cpu.sy
     11.00           +40.9%      15.50 ±  4%  vmstat.cpu.us
    941.60           -10.4%     843.27        sched_debug.cpu.sched_count.min
    394.51           -12.6%     344.65        sched_debug.cpu.ttwu_count.min
    388.73           -12.6%     339.66        sched_debug.cpu.ttwu_local.min
  63421776           -15.7%   53463068        proc-vmstat.numa_hit
  63328346           -15.7%   53369647        proc-vmstat.numa_local
  63583145           -15.7%   53592409        proc-vmstat.pgalloc_normal
 2.833e+10           -17.8%  2.329e+10 ±  2%  proc-vmstat.pgfault
  60231689 ±  5%     -14.1%   51752243 ±  5%  proc-vmstat.pgfree
   8881812           -10.2%    7972540 ±  3%  numa-vmstat.node0.numa_hit
   8813894           -10.2%    7915641 ±  3%  numa-vmstat.node0.numa_local
   9101400            -9.7%    8219165        numa-vmstat.node1.numa_local
   9135335 ±  2%      -8.2%    8388489 ±  2%  numa-vmstat.node2.numa_hit
   9027011 ±  2%      -8.3%    8281210 ±  2%  numa-vmstat.node2.numa_local
   9130741           -26.6%    6705906 ±  2%  numa-vmstat.node3.numa_hit
   9014304           -26.9%    6588488 ±  2%  numa-vmstat.node3.numa_local
  15463298           -12.0%   13607940 ±  4%  numa-numastat.node0.local_node
  15487319           -12.0%   13634289 ±  4%  numa-numastat.node0.numa_hit
  15940614           -10.8%   14211661 ±  2%  numa-numastat.node1.local_node
  15961220           -10.8%   14230733 ±  2%  numa-numastat.node1.numa_hit
  15994648           -10.0%   14397886        numa-numastat.node2.local_node
  16017192           -10.0%   14418935        numa-numastat.node2.numa_hit
  15995400           -30.2%   11161473 ±  2%  numa-numastat.node3.local_node
  16021436           -30.2%   11188323 ±  2%  numa-numastat.node3.numa_hit
     13258 ±  2%      +9.2%      14485 ±  3%  softirqs.CPU108.RCU
     12789 ±  3%     +10.0%      14069 ±  2%  softirqs.CPU112.RCU
     12501 ±  7%     +12.5%      14057        softirqs.CPU114.RCU
     12783 ±  3%      +9.3%      13968 ±  4%  softirqs.CPU125.RCU
     13816 ±  3%      +9.3%      15102 ±  3%  softirqs.CPU14.RCU
     12998 ±  3%     +11.1%      14440 ±  2%  softirqs.CPU16.RCU
     13004 ±  7%     +12.6%      14637 ±  2%  softirqs.CPU17.RCU
     13081 ±  3%     +10.1%      14396 ±  2%  softirqs.CPU19.RCU
     12848 ±  3%     +12.3%      14425 ±  4%  softirqs.CPU21.RCU
     12853 ±  3%     +11.6%      14347 ±  3%  softirqs.CPU22.RCU
     12908 ±  3%     +10.9%      14321 ±  2%  softirqs.CPU23.RCU
     13142 ±  3%     +15.8%      15223 ±  9%  softirqs.CPU27.RCU
      2.98           +12.2%       3.35        perf-stat.i.MPKI
 4.712e+10           -17.7%  3.876e+10 ±  2%  perf-stat.i.branch-instructions
      0.29            +0.0        0.30        perf-stat.i.branch-miss-rate%
 1.311e+08           -15.1%  1.113e+08        perf-stat.i.branch-misses
     57.06            -2.2       54.86        perf-stat.i.cache-miss-rate%
   3.9e+08           -10.9%  3.474e+08        perf-stat.i.cache-misses
 6.817e+08            -7.3%  6.317e+08        perf-stat.i.cache-references
      2.56           +21.4%       3.10 ±  2%  perf-stat.i.cpi
      1511           +12.3%       1697        perf-stat.i.cycles-between-cache-misses
 6.549e+10           -17.8%  5.385e+10 ±  2%  perf-stat.i.dTLB-loads
 1.571e+09           -17.9%   1.29e+09 ±  2%  perf-stat.i.dTLB-store-misses
 3.375e+10           -17.8%  2.775e+10 ±  2%  perf-stat.i.dTLB-stores
  83231523           -12.9%   72514211        perf-stat.i.iTLB-load-misses
    244449 ±  4%     -42.7%     140075 ±  4%  perf-stat.i.iTLB-loads
 2.309e+11           -17.7%  1.899e+11 ±  2%  perf-stat.i.instructions
      2776            -5.6%       2622        perf-stat.i.instructions-per-iTLB-miss
      0.39           -17.7%       0.32 ±  2%  perf-stat.i.ipc
      0.02 ±118%   +2130.1%       0.54 ±  2%  perf-stat.i.metric.K/sec
    775.66           -17.7%     638.38 ±  2%  perf-stat.i.metric.M/sec
  93608966           -17.9%   76881713 ±  2%  perf-stat.i.minor-faults
   4012755           +41.4%    5674418 ±  2%  perf-stat.i.node-load-misses
  22945074 ± 17%     +42.0%   32578532 ±  2%  perf-stat.i.node-loads
     22.22            +0.5       22.70        perf-stat.i.node-store-miss-rate%
  26917782           -15.5%   22742426        perf-stat.i.node-store-misses
  95169694           -17.8%   78245919 ±  2%  perf-stat.i.node-stores
  93608966           -17.9%   76881713 ±  2%  perf-stat.i.page-faults
      2.95           +12.7%       3.33        perf-stat.overall.MPKI
      0.28            +0.0        0.29        perf-stat.overall.branch-miss-rate%
     57.23            -2.2       55.01        perf-stat.overall.cache-miss-rate%
      2.55           +21.7%       3.10 ±  2%  perf-stat.overall.cpi
      1506           +12.4%       1693        perf-stat.overall.cycles-between-cache-misses
      2773            -5.6%       2618        perf-stat.overall.instructions-per-iTLB-miss
      0.39           -17.8%       0.32 ±  2%  perf-stat.overall.ipc
     22.05            +0.5       22.52        perf-stat.overall.node-store-miss-rate%
 4.697e+10           -17.7%  3.865e+10 ±  2%  perf-stat.ps.branch-instructions
 1.307e+08           -15.1%  1.109e+08        perf-stat.ps.branch-misses
 3.889e+08           -10.9%  3.464e+08        perf-stat.ps.cache-misses
 6.795e+08            -7.3%  6.298e+08        perf-stat.ps.cache-references
 6.528e+10           -17.7%   5.37e+10 ±  2%  perf-stat.ps.dTLB-loads
 1.566e+09           -17.9%  1.287e+09 ±  2%  perf-stat.ps.dTLB-store-misses
 3.364e+10           -17.7%  2.767e+10 ±  2%  perf-stat.ps.dTLB-stores
  82972636           -12.9%   72308513        perf-stat.ps.iTLB-load-misses
    242363 ±  4%     -42.8%     138589 ±  4%  perf-stat.ps.iTLB-loads
 2.301e+11           -17.7%  1.894e+11 ±  2%  perf-stat.ps.instructions
  93314496           -17.8%   76668000 ±  2%  perf-stat.ps.minor-faults
   3997637           +41.4%    5652936 ±  2%  perf-stat.ps.node-load-misses
  22900278 ± 17%     +42.0%   32514566 ±  2%  perf-stat.ps.node-loads
  26832423           -15.5%   22676967        perf-stat.ps.node-store-misses
  94867767           -17.8%   78025979 ±  2%  perf-stat.ps.node-stores
  93314496           -17.8%   76668001 ±  2%  perf-stat.ps.page-faults
 6.967e+13           -17.6%  5.738e+13 ±  2%  perf-stat.total.instructions
     87.22            -6.8       80.39        perf-profile.calltrace.cycles-pp.page_fault.testcase
     10.71 ±  2%      -2.3        8.37 ±  2%  perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     10.27 ±  2%      -2.2        8.02 ±  2%  perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
      7.33 ±  2%      -2.2        5.09 ±  3%  perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas
      9.52 ±  2%      -2.1        7.43 ±  3%  perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
      6.29 ±  3%      -2.0        4.25 ±  4%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_remove_rmap.zap_pte_range.unmap_page_range
      8.77 ±  3%      -1.9        6.85 ±  2%  perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
      7.51 ±  2%      -1.7        5.83 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault
      6.43 ±  3%      -1.4        5.00 ±  2%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
      7.33 ±  2%      -1.1        6.23 ±  3%  perf-profile.calltrace.cycles-pp.__count_memcg_events.handle_mm_fault.do_user_addr_fault.page_fault.testcase
      3.22            -0.8        2.39 ±  2%  perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      1.89 ±  3%      -0.8        1.09 ±  5%  perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
      0.98 ±  2%      -0.4        0.61 ±  4%  perf-profile.calltrace.cycles-pp.unlock_page.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
      1.32 ±  2%      -0.4        0.96 ±  3%  perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.page_fault.testcase
      3.30 ±  2%      -0.3        2.99 ±  2%  perf-profile.calltrace.cycles-pp.lock_page_memcg.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault
      0.96 ±  2%      -0.3        0.67 ±  3%  perf-profile.calltrace.cycles-pp.down_read_trylock.do_user_addr_fault.page_fault.testcase
      1.20            -0.3        0.94 ±  3%  perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.page_fault.testcase
      1.22            -0.2        1.01 ±  2%  perf-profile.calltrace.cycles-pp.__perf_sw_event.page_fault.testcase
      1.03            -0.2        0.83 ±  2%  perf-profile.calltrace.cycles-pp.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
      0.69            -0.2        0.53 ±  2%  perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
      0.91            -0.2        0.76 ±  2%  perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode.testcase
      0.72 ±  3%      -0.1        0.58 ±  3%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.page_fault.testcase
      0.75 ±  2%      -0.1        0.61 ±  2%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.page_fault.testcase
      0.99 ±  2%      -0.0        0.94 ±  2%  perf-profile.calltrace.cycles-pp.lock_page_memcg.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas
      1.32 ±  3%      +0.3        1.62 ±  2%  perf-profile.calltrace.cycles-pp.xas_load.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault
      9.98 ±  2%      +0.9       10.88 ±  2%  perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
     42.89            +1.6       44.49        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.page_fault.testcase
     33.71            +3.0       36.76        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.page_fault.testcase
     31.29            +3.5       34.83        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.page_fault
     16.85            +6.8       23.66 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     16.62            +6.9       23.49 ±  2%  perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
     13.38            +8.0       21.37 ±  2%  perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
     76.90            +8.0       84.93        perf-profile.calltrace.cycles-pp.testcase
     14.88 ±  2%      -3.9       10.95 ±  3%  perf-profile.children.cycles-pp.__mod_lruvec_state
     69.38            -3.6       65.83        perf-profile.children.cycles-pp.page_fault
     12.73 ±  3%      -3.5        9.26 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_state
     10.72 ±  2%      -2.3        8.38 ±  2%  perf-profile.children.cycles-pp.__do_fault
     10.29 ±  2%      -2.3        8.04 ±  2%  perf-profile.children.cycles-pp.shmem_fault
      9.55 ±  2%      -2.1        7.45 ±  3%  perf-profile.children.cycles-pp.shmem_getpage_gfp
      8.83 ±  3%      -1.9        6.90 ±  2%  perf-profile.children.cycles-pp.find_lock_entry
      7.33 ±  2%      -1.1        6.24 ±  3%  perf-profile.children.cycles-pp.__count_memcg_events
      3.29            -0.8        2.44 ±  2%  perf-profile.children.cycles-pp.fault_dirty_shared_page
      1.93 ±  3%      -0.8        1.12 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock
      4.25            -0.7        3.52        perf-profile.children.cycles-pp.sync_regs
      2.42            -0.5        1.95 ±  2%  perf-profile.children.cycles-pp.__perf_sw_event
      4.30 ±  2%      -0.4        3.93 ±  2%  perf-profile.children.cycles-pp.lock_page_memcg
      0.98 ±  2%      -0.4        0.61 ±  4%  perf-profile.children.cycles-pp.unlock_page
      1.32 ±  2%      -0.4        0.96 ±  3%  perf-profile.children.cycles-pp.up_read
      1.55 ±  2%      -0.3        1.25 ±  2%  perf-profile.children.cycles-pp.___perf_sw_event
      0.96 ±  2%      -0.3        0.67 ±  3%  perf-profile.children.cycles-pp.down_read_trylock
      1.04            -0.3        0.79 ±  2%  perf-profile.children.cycles-pp.set_page_dirty
      1.09            -0.2        0.87 ±  2%  perf-profile.children.cycles-pp.page_mapping
      1.05            -0.2        0.86 ±  2%  perf-profile.children.cycles-pp.file_update_time
      0.71            -0.2        0.54 ±  2%  perf-profile.children.cycles-pp.tlb_flush_mmu
      0.54 ±  2%      -0.2        0.39 ±  2%  perf-profile.children.cycles-pp.release_pages
      0.91            -0.2        0.76 ±  2%  perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
      0.59            -0.1        0.45 ±  3%  perf-profile.children.cycles-pp.find_vma
      0.75            -0.1        0.62 ±  2%  perf-profile.children.cycles-pp.__mod_node_page_state
      0.66 ±  2%      -0.1        0.53 ±  4%  perf-profile.children.cycles-pp.current_time
      0.50 ±  2%      -0.1        0.38 ±  4%  perf-profile.children.cycles-pp.vmacache_find
      0.70 ±  6%      -0.1        0.60 ±  6%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      0.65 ±  6%      -0.1        0.55 ±  6%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.75 ±  5%      -0.1        0.66 ±  5%  perf-profile.children.cycles-pp.apic_timer_interrupt
      0.50 ±  2%      -0.1        0.41 ±  2%  perf-profile.children.cycles-pp.do_page_fault
      0.52            -0.1        0.43 ±  2%  perf-profile.children.cycles-pp.___might_sleep
      0.35            -0.1        0.27 ±  3%  perf-profile.children.cycles-pp.__set_page_dirty_no_writeback
      0.37 ±  5%      -0.1        0.30 ±  8%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.32 ±  6%      -0.1        0.25 ±  9%  perf-profile.children.cycles-pp.tick_sched_timer
      0.41 ±  2%      -0.1        0.35 ±  2%  perf-profile.children.cycles-pp.prepare_exit_to_usermode
      0.32            -0.1        0.26 ±  2%  perf-profile.children.cycles-pp.mark_page_accessed
      0.29 ±  3%      -0.1        0.23 ±  4%  perf-profile.children.cycles-pp.__might_sleep
      0.49 ±  3%      -0.0        0.45 ±  3%  perf-profile.children.cycles-pp.__unlock_page_memcg
      0.16 ±  2%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.page_rmapping
      0.20 ±  3%      -0.0        0.16 ±  5%  perf-profile.children.cycles-pp.perf_exclude_event
      0.20 ±  3%      -0.0        0.17 ±  3%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
      0.20 ±  2%      -0.0        0.17 ±  3%  perf-profile.children.cycles-pp.pte_alloc_one
      0.16 ±  2%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.PageHuge
      0.23 ±  2%      -0.0        0.19 ±  2%  perf-profile.children.cycles-pp._cond_resched
      0.13 ±  3%      -0.0        0.10 ±  4%  perf-profile.children.cycles-pp.__memcg_kmem_charge_page
      0.15 ±  4%      -0.0        0.12 ±  5%  perf-profile.children.cycles-pp.perf_swevent_event
      0.14 ±  3%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.vm_normal_page
      0.08 ±  4%      -0.0        0.06 ±  6%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.12 ±  3%      -0.0        0.10 ±  4%  perf-profile.children.cycles-pp.rcu_all_qs
      0.09 ±  2%      -0.0        0.07 ±  5%  perf-profile.children.cycles-pp.__memcg_kmem_charge
      1.34 ±  3%      +0.3        1.64 ±  2%  perf-profile.children.cycles-pp.xas_load
     10.05 ±  2%      +0.9       10.94 ±  2%  perf-profile.children.cycles-pp.page_remove_rmap
     42.95            +1.6       44.54        perf-profile.children.cycles-pp.handle_mm_fault
     33.75            +3.0       36.79        perf-profile.children.cycles-pp.__handle_mm_fault
     31.37            +3.5       34.89        perf-profile.children.cycles-pp.do_fault
      9.24 ±  2%      +4.7       13.94 ±  3%  perf-profile.children.cycles-pp.native_irq_return_iret
     16.89            +6.8       23.68 ±  2%  perf-profile.children.cycles-pp.finish_fault
     16.67            +6.9       23.52 ±  2%  perf-profile.children.cycles-pp.alloc_set_pte
     13.42            +8.0       21.41 ±  2%  perf-profile.children.cycles-pp.page_add_file_rmap
     19.39            -3.5       15.91 ±  2%  perf-profile.self.cycles-pp.testcase
     12.63 ±  3%      -3.4        9.19 ±  3%  perf-profile.self.cycles-pp.__mod_memcg_state
      7.32 ±  2%      -1.1        6.22 ±  3%  perf-profile.self.cycles-pp.__count_memcg_events
      2.50 ±  2%      -1.1        1.40 ±  6%  perf-profile.self.cycles-pp.find_lock_entry
      1.90 ±  3%      -0.8        1.10 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock
      4.24            -0.7        3.51        perf-profile.self.cycles-pp.sync_regs
      2.00 ±  2%      -0.5        1.51 ±  2%  perf-profile.self.cycles-pp.zap_pte_range
      2.34            -0.5        1.87 ±  2%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.95 ±  2%      -0.4        0.59 ±  4%  perf-profile.self.cycles-pp.unlock_page
      4.24 ±  2%      -0.4        3.88 ±  2%  perf-profile.self.cycles-pp.lock_page_memcg
      1.30 ±  2%      -0.3        0.95 ±  3%  perf-profile.self.cycles-pp.up_read
      1.37 ±  3%      -0.3        1.06 ±  5%  perf-profile.self.cycles-pp.__mod_lruvec_state
      1.06 ±  2%      -0.3        0.77 ±  2%  perf-profile.self.cycles-pp.alloc_set_pte
      0.95 ±  2%      -0.3        0.66 ±  3%  perf-profile.self.cycles-pp.down_read_trylock
      1.60            -0.3        1.33 ±  2%  perf-profile.self.cycles-pp.handle_mm_fault
      1.15            -0.3        0.90 ±  2%  perf-profile.self.cycles-pp.do_user_addr_fault
      1.26 ±  2%      -0.2        1.01 ±  3%  perf-profile.self.cycles-pp.___perf_sw_event
      1.05            -0.2        0.84 ±  2%  perf-profile.self.cycles-pp.page_mapping
      0.86            -0.2        0.70 ±  2%  perf-profile.self.cycles-pp.__perf_sw_event
      0.70 ±  5%      -0.2        0.54 ±  5%  perf-profile.self.cycles-pp.shmem_getpage_gfp
      0.74 ±  2%      -0.2        0.58        perf-profile.self.cycles-pp.shmem_fault
      0.52 ±  2%      -0.1        0.38 ±  2%  perf-profile.self.cycles-pp.release_pages
      0.73            -0.1        0.60 ±  2%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.75            -0.1        0.63 ±  2%  perf-profile.self.cycles-pp.page_fault
      0.47 ±  2%      -0.1        0.36 ±  3%  perf-profile.self.cycles-pp.vmacache_find
      0.33            -0.1        0.23 ±  2%  perf-profile.self.cycles-pp.set_page_dirty
      0.49 ±  2%      -0.1        0.41 ±  2%  perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode
      0.49            -0.1        0.41 ±  2%  perf-profile.self.cycles-pp.___might_sleep
      0.48            -0.1        0.40 ±  2%  perf-profile.self.cycles-pp.do_page_fault
      0.40 ±  3%      -0.1        0.32 ±  3%  perf-profile.self.cycles-pp.file_update_time
      0.32            -0.1        0.25 ±  4%  perf-profile.self.cycles-pp.fault_dirty_shared_page
      0.31            -0.1        0.25 ±  3%  perf-profile.self.cycles-pp.__set_page_dirty_no_writeback
      0.23 ±  4%      -0.1        0.16 ±  3%  perf-profile.self.cycles-pp.finish_fault
      0.37 ±  3%      -0.1        0.30 ±  2%  perf-profile.self.cycles-pp.prepare_exit_to_usermode
      0.31            -0.1        0.25 ±  2%  perf-profile.self.cycles-pp.mark_page_accessed
      0.26 ±  2%      -0.1        0.21 ±  3%  perf-profile.self.cycles-pp.__might_sleep
      0.21 ±  3%      -0.1        0.16 ±  2%  perf-profile.self.cycles-pp.__do_fault
      0.23 ±  4%      -0.0        0.19 ±  5%  perf-profile.self.cycles-pp.current_time
      0.13 ±  3%      -0.0        0.10 ±  4%  perf-profile.self.cycles-pp.page_rmapping
      0.17 ±  4%      -0.0        0.13 ±  5%  perf-profile.self.cycles-pp.perf_exclude_event
      0.10 ±  5%      -0.0        0.08 ±  4%  perf-profile.self.cycles-pp.find_vma
      0.12 ±  5%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.perf_swevent_event
      0.11 ±  2%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.PageHuge
      0.12 ±  3%      -0.0        0.10 ±  4%  perf-profile.self.cycles-pp.vm_normal_page
      0.11 ±  3%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp._cond_resched
      0.07 ±  5%      -0.0        0.06 ±  8%  perf-profile.self.cycles-pp.page_counter_try_charge
      0.08 ±  4%      -0.0        0.07        perf-profile.self.cycles-pp.rcu_all_qs
      1.04 ±  2%      +0.4        1.40 ±  2%  perf-profile.self.cycles-pp.xas_load
      1.66 ±  4%      +3.2        4.84 ±  5%  perf-profile.self.cycles-pp.page_remove_rmap
      9.22 ±  2%      +4.7       13.93 ±  3%  perf-profile.self.cycles-pp.native_irq_return_iret
      2.58 ±  2%      +9.9       12.51 ±  4%  perf-profile.self.cycles-pp.page_add_file_rmap
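
The %change column in the table above mixes two conventions: entries with a trailing % sign are relative changes, while entries without one (rate-like metrics such as mpstat.cpu.all.usr% and the perf-profile rows) look like plain differences in percentage points. The following is a minimal sketch of both calculations using values copied from the table. The helper names pct_change and point_change are hypothetical, and the split between the two conventions is inferred from the numbers, not from the lkp-tests source.

```python
def pct_change(base, new):
    """Relative change in percent, as used for count-like metrics."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Plain difference, as the table appears to use for metrics already in percent."""
    return new - base


# Count-like metrics: reported with a trailing % sign.
ops = pct_change(491236, 403513)          # will-it-scale.per_process_ops
faults = pct_change(2.833e10, 2.329e10)   # proc-vmstat.pgfault

# Rate-like metrics: reported as percentage-point differences.
usr = point_change(11.72, 16.12)          # mpstat.cpu.all.usr%
rmap = point_change(2.58, 12.51)          # perf-profile.self.cycles-pp.page_add_file_rmap

print(f"per_process_ops     {ops:+.1f}%")
print(f"pgfault             {faults:+.1f}%")
print(f"usr%                {usr:+.1f}")
print(f"page_add_file_rmap  {rmap:+.1f}")
```

With the table's inputs this reproduces the -17.9%, -17.8%, +4.4 and +9.9 entries, which is why the +9.9 for page_add_file_rmap should be read as roughly ten additional percentage points of self cycles and not as a 9.9% relative increase.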


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  520000 +------------------------------------------------------------------+   
  500000 |-+           +       +                   +              +         |   
         |+++  ++   +++++++++++++++++++ +++++ ++++++ ++ +++++++++++         |   
  480000 |++ +++: ++               +   ++ +  ++     + ++   +  +             |   
  460000 |-+ +   ::                    +     +                              |   
         |       +                                                          |   
  440000 |-+                                                                |   
  420000 |-+                                               O           O    |   
  400000 |-+        O  O OOOO  OOOOOOOOOO OOOOO          OOO       O OO OOOO|   
         |   O  OOOO OOOO O       OO   OOO O OOOOOOOOOOOO O OOO OOO O      O|   
  380000 |OOOO O  O        O                                   OO O       O |   
  360000 |OO  O O            OO                                             |   
         |                   O                                              |   
  340000 |-+                   O                                            |   
  320000 +------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
    1e+08 +-----------------------------------------------------------------+   
          |            +        +                  +              +         |   
  9.5e+07 |+++ + +   +++++++++++++++ ++  + +++ ++++++++ +++ +++++++         |   
    9e+07 |++ +++:+++               +  ++ +  ++     +  +   +   +            |   
          |       :                                                         |   
  8.5e+07 |-+     +                                                         |   
          |                                                                 |   
    8e+07 |-+               OO                              O       O  O    |   
          |          O O  O    OOOOOOOOOOOOOOOOO O  OOOOOOOOO  O   OOOOOOOOO|   
  7.5e+07 |-+ O  OOOO OOOO O                    OOOOO O  O   OOOOOO       OO|   
    7e+07 |OOO OO           OO                                    O         |   
          |OO  O O            OO                                            |   
  6.5e+07 |-+                   O                                           |   
          |                                                                 |   
    6e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample
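
Several perf-stat.overall rows in the will-it-scale comparison above can be re-derived from the raw perf-stat.ps counters, which is a quick way to sanity-check the table. The sketch below uses formulas inferred from the numbers in this report (ipc as the reciprocal of cpi, cache-miss-rate% as cache-misses over cache-references, node-store-miss-rate% as misses over misses plus stores). They are assumptions, not definitions taken from lkp-tests, and the function names are hypothetical.

```python
def ipc_from_cpi(cpi):
    """Instructions per cycle as the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def cache_miss_rate(misses, references):
    """Cache misses as a percentage of cache references."""
    return misses / references * 100.0


def node_store_miss_rate(store_misses, stores):
    """Node store misses as a percentage of all node stores (hits plus misses)."""
    return store_misses / (store_misses + stores) * 100.0


# Inputs copied from the table: (value on 6aeff241fe, value on dcdf11ee14).
ipc = (ipc_from_cpi(2.55), ipc_from_cpi(3.10))
cmr = (cache_miss_rate(3.889e8, 6.795e8), cache_miss_rate(3.464e8, 6.298e8))
nsmr = (node_store_miss_rate(26832423, 94867767),
        node_store_miss_rate(22676967, 78025979))

print(f"ipc                   {ipc[0]:.2f} -> {ipc[1]:.2f}")
print(f"cache-miss-rate%      {cmr[0]:.2f} -> {cmr[1]:.2f}")
print(f"node-store-miss-rate% {nsmr[0]:.2f} -> {nsmr[1]:.2f}")
```

The results agree with the reported 0.39 -> 0.32, 57.23 -> 55.01 and 22.05 -> 22.52 to within the rounding of the inputs.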

***************************************************************************************************
lkp-skl-fpga01: 104 threads Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-skl-fpga01/lru-shm/vm-scalability/0x2006906

commit: 
  6aeff241fe ("mm/migrate.c: migrate PG_readahead flag")
  dcdf11ee14 ("mm, shmem: add vmstat for hugepage fallback")

6aeff241fe6c4561 dcdf11ee144133328664d90836e 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :2           50%           1:20    dmesg.WARNING:at#for_ip_interrupt_entry/0x
           :2           50%           1:20    last_state.booting
           :2           50%           1:20    last_state.is_incomplete_run
          1:2          408%          10:20    perf-profile.calltrace.cycles-pp.sync_regs.error_entry
           :2          420%           8:20    perf-profile.calltrace.cycles-pp.sync_regs.error_entry.do_access
          3:2          875%          21:20    perf-profile.calltrace.cycles-pp.error_entry
          0:2          914%          19:20    perf-profile.calltrace.cycles-pp.error_entry.do_access
          5:2         1972%          44:20    perf-profile.children.cycles-pp.error_entry
          2:2          956%          21:20    perf-profile.self.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
      0.54 ± 13%      -0.1        0.45 ± 20%  perf-profile.children.cycles-pp.find_get_entries
      0.60 ± 17%      -0.1        0.51 ± 20%  perf-profile.children.cycles-pp.clockevents_program_event
      0.29            -0.1        0.21 ± 17%  perf-profile.children.cycles-pp.find_vma
      0.28 ±  5%      -0.1        0.19 ± 17%  perf-profile.children.cycles-pp.vmacache_find
      0.18 ±  8%      -0.1        0.13 ± 24%  perf-profile.children.cycles-pp.truncate_cleanup_page
      0.12 ± 28%      -0.0        0.09 ± 32%  perf-profile.children.cycles-pp.tick_irq_enter
      0.08 ±  5%      -0.0        0.07 ± 12%  perf-profile.children.cycles-pp.perf_exclude_event
      0.76 ±  7%      -0.1        0.64 ± 20%  perf-profile.self.cycles-pp.release_pages
      0.74 ±  8%      -0.1        0.62 ± 13%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.43 ± 12%      -0.1        0.32 ± 41%  perf-profile.self.cycles-pp.zap_pte_range
      0.70 ± 12%      -0.1        0.61 ± 13%  perf-profile.self.cycles-pp.__pagevec_lru_add_fn
      0.48 ± 14%      -0.1        0.39 ± 20%  perf-profile.self.cycles-pp.find_get_entries
      0.27 ±  3%      -0.1        0.19 ± 17%  perf-profile.self.cycles-pp.vmacache_find
      0.43 ±  8%      -0.1        0.37 ± 11%  perf-profile.self.cycles-pp.rmqueue
      0.17 ± 11%      -0.1        0.12 ± 24%  perf-profile.self.cycles-pp.truncate_cleanup_page
      0.26 ±  5%      -0.0        0.22 ± 12%  perf-profile.self.cycles-pp.do_user_addr_fault
      0.21 ± 11%      -0.0        0.18 ± 14%  perf-profile.self.cycles-pp.do_fault
      0.18 ± 11%      -0.0        0.15 ± 15%  perf-profile.self.cycles-pp.___might_sleep
      0.08 ±  6%      -0.0        0.06 ± 12%  perf-profile.self.cycles-pp.perf_exclude_event
     19.96            -1.0%      19.75        boot-time.dhcp
      8812 ±  4%      -5.0%       8370 ±  5%  numa-meminfo.node1.KernelStack
      8800 ±  4%      -4.8%       8373 ±  5%  numa-vmstat.node1.nr_kernel_stack
     53.50 ±  5%     +67.3%      89.51 ± 18%  sched_debug.cfs_rq:/.nr_spread_over.max
    828.33 ±  9%     -15.6%     699.50 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.max
   1361213            -6.5%    1273045 ±  4%  perf-stat.i.node-loads
     59.25            +2.9       62.17 ±  3%  perf-stat.overall.node-load-miss-rate%
   1359101            -6.4%    1271647 ±  3%  perf-stat.ps.node-loads
      6903 ±  5%      -9.2%       6270 ± 11%  slabinfo.eventpoll_pwq.active_objs
      6903 ±  5%      -9.2%       6270 ± 11%  slabinfo.eventpoll_pwq.num_objs
      1726 ±  3%      -7.3%       1600 ±  6%  slabinfo.pool_workqueue.num_objs
    105593 ±  5%      -9.0%      96123 ±  7%  softirqs.CPU30.TIMER
     27563 ±  5%     -10.4%      24703 ±  6%  softirqs.CPU35.SCHED
    102935 ±  3%      -7.5%      95173 ±  5%  softirqs.CPU35.TIMER
     26724 ±  7%     -11.1%      23757 ±  5%  softirqs.CPU38.SCHED
    101664 ±  3%      -6.5%      95094 ±  6%  softirqs.CPU48.TIMER
     25948 ±  3%      -6.0%      24399 ±  5%  softirqs.CPU49.SCHED
     24817 ±  6%      -7.7%      22911 ±  3%  softirqs.CPU53.SCHED
    106024 ± 13%     -20.5%      84307 ± 15%  softirqs.CPU62.TIMER
     28303 ± 11%     -13.7%      24432 ±  6%  softirqs.CPU80.SCHED
     15005 ± 16%     -19.9%      12017 ±  6%  softirqs.CPU82.RCU
     26967 ±  8%     -11.5%      23853 ±  6%  softirqs.CPU92.SCHED
     25547 ±  4%      -6.1%      23979 ±  2%  softirqs.CPU97.SCHED
     25487 ±  3%      -7.1%      23687 ±  4%  softirqs.CPU98.SCHED
      6613 ± 40%     -77.3%       1500 ±144%  interrupts.CPU35.RES:Rescheduling_interrupts
      3834 ± 91%     -92.0%     307.00 ± 83%  interrupts.CPU38.RES:Rescheduling_interrupts
      2074 ± 28%     -23.3%       1590 ± 22%  interrupts.CPU4.NMI:Non-maskable_interrupts
      2074 ± 28%     -23.3%       1590 ± 22%  interrupts.CPU4.PMI:Performance_monitoring_interrupts
      4380 ± 68%     -69.9%       1320 ± 18%  interrupts.CPU46.NMI:Non-maskable_interrupts
      4380 ± 68%     -69.9%       1320 ± 18%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
      4634 ± 95%     -84.8%     704.00 ±147%  interrupts.CPU53.RES:Rescheduling_interrupts
      2037 ± 26%     -32.4%       1376 ± 20%  interrupts.CPU64.NMI:Non-maskable_interrupts
      2037 ± 26%     -32.4%       1376 ± 20%  interrupts.CPU64.PMI:Performance_monitoring_interrupts
      3381 ± 94%     -94.6%     184.05 ± 50%  interrupts.CPU72.RES:Rescheduling_interrupts
      5218 ± 95%     -86.6%     700.84 ±174%  interrupts.CPU88.RES:Rescheduling_interrupts
      5814 ± 80%     -75.4%       1429 ±170%  interrupts.CPU92.RES:Rescheduling_interrupts
      1957 ± 67%     -87.6%     242.00 ± 83%  interrupts.CPU97.RES:Rescheduling_interrupts
      1496 ± 53%     -82.2%     266.58 ± 80%  interrupts.CPU98.RES:Rescheduling_interrupts
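
Note on reading the comparison rows above: the middle column appears to be a relative change (with a "%" suffix) for most metrics and an absolute difference (no suffix) for metrics that are already percentages, such as perf-profile.* and perf-stat.overall.node-load-miss-rate%. The following is a minimal sketch for re-deriving that column from the two printed values. It is not part of the lkp tooling; the row layout encoded in the regular expression and the helper names parse_row and recompute_change are assumptions made for illustration. Because the printed values are rounded, a recomputed change can differ from the reported one in the last digit.

```python
import re

# Assumed row layout: base [± N%]  change  new [± N%]  metric-name
ROW = re.compile(
    r"^\s*(?P<base>[\d.]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[+-][\d.]+%?)"
    r"\s+(?P<new>[\d.]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return (base, reported_change, new, metric), or None if no match."""
    m = ROW.match(line)
    if m is None:
        return None
    return float(m["base"]), m["change"], float(m["new"]), m["metric"]


def recompute_change(base, new, relative):
    """Relative change in percent, or the absolute difference."""
    if relative:
        return f"{(new - base) / base * 100:+.1f}%"
    return f"{new - base:+.1f}"


row = "     53.50 ±  5%     +67.3%      89.51 ± 18%  sched_debug.cfs_rq:/.nr_spread_over.max"
base, reported, new, metric = parse_row(row)
# A trailing "%" on the reported change marks it as relative.
print(metric, reported, recompute_change(base, new, reported.endswith("%")))
```

For the perf-stat.overall.node-load-miss-rate% row, recompute_change(59.25, 62.17, relative=False) gives the absolute difference in percentage points.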





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.6.0-11465-gdcdf11ee14413" of type "text/plain" (156771 bytes)

View attachment "job-script" of type "text/plain" (7632 bytes)

View attachment "job.yaml" of type "text/plain" (5297 bytes)

View attachment "reproduce" of type "text/plain" (344 bytes)
