Message-ID: <20201024120659.GI31092@shao2-debian>
Date: Sat, 24 Oct 2020 20:06:59 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Yang Shi <yang.shi@...ux.alibaba.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Jeremy Cline <jcline@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Michal Hocko <mhocko@...nel.org>,
Vlastimil Babka <vbabka@...e.cz>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [mm, shmem] dcdf11ee14: will-it-scale.per_process_ops -17.9% regression
Greetings,
FYI, we noticed a -17.9% regression of will-it-scale.per_process_ops due to commit:
commit: dcdf11ee144133328664d90836e712d840d047d9 ("mm, shmem: add vmstat for hugepage fallback")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:
nr_task: 100%
mode: process
test: page_fault3
cpufreq_governor: performance
ucode: 0x5002f01
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a threads-based variant of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
In addition to that, the commit also has significant impact on the following tests:
+------------------+--------------------------------------------------+
| testcase: change | vm-scalability: boot-time.dhcp -1.0% improvement |
| test machine | 104 threads Skylake with 192G memory |
| test parameters | cpufreq_governor=performance |
| | runtime=300s |
| | size=1T |
| | test=lru-shm |
| | ucode=0x2006906 |
+------------------+--------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap4/page_fault3/will-it-scale/0x5002f01
commit:
6aeff241fe ("mm/migrate.c: migrate PG_readahead flag")
dcdf11ee14 ("mm, shmem: add vmstat for hugepage fallback")
6aeff241fe6c4561 dcdf11ee144133328664d90836e
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
74:16 -79% 61:16 perf-profile.calltrace.cycles-pp.error_entry.testcase
65:16 -71% 54:16 perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
78:16 -84% 64:16 perf-profile.children.cycles-pp.error_entry
1:16 -2% 0:16 perf-profile.children.cycles-pp.error_exit
8:16 -10% 7:16 perf-profile.self.cycles-pp.error_entry
0:16 -2% 0:16 perf-profile.self.cycles-pp.error_exit
%stddev %change %stddev
\ | \
491236 -17.9% 403513 ± 2% will-it-scale.per_process_ops
94317579 -17.9% 77474716 ± 2% will-it-scale.workload
1.22 -0.1 1.08 ± 2% mpstat.cpu.all.irq%
11.72 +4.4 16.12 ± 3% mpstat.cpu.all.usr%
86.94 -5.2% 82.38 vmstat.cpu.sy
11.00 +40.9% 15.50 ± 4% vmstat.cpu.us
941.60 -10.4% 843.27 sched_debug.cpu.sched_count.min
394.51 -12.6% 344.65 sched_debug.cpu.ttwu_count.min
388.73 -12.6% 339.66 sched_debug.cpu.ttwu_local.min
63421776 -15.7% 53463068 proc-vmstat.numa_hit
63328346 -15.7% 53369647 proc-vmstat.numa_local
63583145 -15.7% 53592409 proc-vmstat.pgalloc_normal
2.833e+10 -17.8% 2.329e+10 ± 2% proc-vmstat.pgfault
60231689 ± 5% -14.1% 51752243 ± 5% proc-vmstat.pgfree
8881812 -10.2% 7972540 ± 3% numa-vmstat.node0.numa_hit
8813894 -10.2% 7915641 ± 3% numa-vmstat.node0.numa_local
9101400 -9.7% 8219165 numa-vmstat.node1.numa_local
9135335 ± 2% -8.2% 8388489 ± 2% numa-vmstat.node2.numa_hit
9027011 ± 2% -8.3% 8281210 ± 2% numa-vmstat.node2.numa_local
9130741 -26.6% 6705906 ± 2% numa-vmstat.node3.numa_hit
9014304 -26.9% 6588488 ± 2% numa-vmstat.node3.numa_local
15463298 -12.0% 13607940 ± 4% numa-numastat.node0.local_node
15487319 -12.0% 13634289 ± 4% numa-numastat.node0.numa_hit
15940614 -10.8% 14211661 ± 2% numa-numastat.node1.local_node
15961220 -10.8% 14230733 ± 2% numa-numastat.node1.numa_hit
15994648 -10.0% 14397886 numa-numastat.node2.local_node
16017192 -10.0% 14418935 numa-numastat.node2.numa_hit
15995400 -30.2% 11161473 ± 2% numa-numastat.node3.local_node
16021436 -30.2% 11188323 ± 2% numa-numastat.node3.numa_hit
13258 ± 2% +9.2% 14485 ± 3% softirqs.CPU108.RCU
12789 ± 3% +10.0% 14069 ± 2% softirqs.CPU112.RCU
12501 ± 7% +12.5% 14057 softirqs.CPU114.RCU
12783 ± 3% +9.3% 13968 ± 4% softirqs.CPU125.RCU
13816 ± 3% +9.3% 15102 ± 3% softirqs.CPU14.RCU
12998 ± 3% +11.1% 14440 ± 2% softirqs.CPU16.RCU
13004 ± 7% +12.6% 14637 ± 2% softirqs.CPU17.RCU
13081 ± 3% +10.1% 14396 ± 2% softirqs.CPU19.RCU
12848 ± 3% +12.3% 14425 ± 4% softirqs.CPU21.RCU
12853 ± 3% +11.6% 14347 ± 3% softirqs.CPU22.RCU
12908 ± 3% +10.9% 14321 ± 2% softirqs.CPU23.RCU
13142 ± 3% +15.8% 15223 ± 9% softirqs.CPU27.RCU
2.98 +12.2% 3.35 perf-stat.i.MPKI
4.712e+10 -17.7% 3.876e+10 ± 2% perf-stat.i.branch-instructions
0.29 +0.0 0.30 perf-stat.i.branch-miss-rate%
1.311e+08 -15.1% 1.113e+08 perf-stat.i.branch-misses
57.06 -2.2 54.86 perf-stat.i.cache-miss-rate%
3.9e+08 -10.9% 3.474e+08 perf-stat.i.cache-misses
6.817e+08 -7.3% 6.317e+08 perf-stat.i.cache-references
2.56 +21.4% 3.10 ± 2% perf-stat.i.cpi
1511 +12.3% 1697 perf-stat.i.cycles-between-cache-misses
6.549e+10 -17.8% 5.385e+10 ± 2% perf-stat.i.dTLB-loads
1.571e+09 -17.9% 1.29e+09 ± 2% perf-stat.i.dTLB-store-misses
3.375e+10 -17.8% 2.775e+10 ± 2% perf-stat.i.dTLB-stores
83231523 -12.9% 72514211 perf-stat.i.iTLB-load-misses
244449 ± 4% -42.7% 140075 ± 4% perf-stat.i.iTLB-loads
2.309e+11 -17.7% 1.899e+11 ± 2% perf-stat.i.instructions
2776 -5.6% 2622 perf-stat.i.instructions-per-iTLB-miss
0.39 -17.7% 0.32 ± 2% perf-stat.i.ipc
0.02 ±118% +2130.1% 0.54 ± 2% perf-stat.i.metric.K/sec
775.66 -17.7% 638.38 ± 2% perf-stat.i.metric.M/sec
93608966 -17.9% 76881713 ± 2% perf-stat.i.minor-faults
4012755 +41.4% 5674418 ± 2% perf-stat.i.node-load-misses
22945074 ± 17% +42.0% 32578532 ± 2% perf-stat.i.node-loads
22.22 +0.5 22.70 perf-stat.i.node-store-miss-rate%
26917782 -15.5% 22742426 perf-stat.i.node-store-misses
95169694 -17.8% 78245919 ± 2% perf-stat.i.node-stores
93608966 -17.9% 76881713 ± 2% perf-stat.i.page-faults
2.95 +12.7% 3.33 perf-stat.overall.MPKI
0.28 +0.0 0.29 perf-stat.overall.branch-miss-rate%
57.23 -2.2 55.01 perf-stat.overall.cache-miss-rate%
2.55 +21.7% 3.10 ± 2% perf-stat.overall.cpi
1506 +12.4% 1693 perf-stat.overall.cycles-between-cache-misses
2773 -5.6% 2618 perf-stat.overall.instructions-per-iTLB-miss
0.39 -17.8% 0.32 ± 2% perf-stat.overall.ipc
22.05 +0.5 22.52 perf-stat.overall.node-store-miss-rate%
4.697e+10 -17.7% 3.865e+10 ± 2% perf-stat.ps.branch-instructions
1.307e+08 -15.1% 1.109e+08 perf-stat.ps.branch-misses
3.889e+08 -10.9% 3.464e+08 perf-stat.ps.cache-misses
6.795e+08 -7.3% 6.298e+08 perf-stat.ps.cache-references
6.528e+10 -17.7% 5.37e+10 ± 2% perf-stat.ps.dTLB-loads
1.566e+09 -17.9% 1.287e+09 ± 2% perf-stat.ps.dTLB-store-misses
3.364e+10 -17.7% 2.767e+10 ± 2% perf-stat.ps.dTLB-stores
82972636 -12.9% 72308513 perf-stat.ps.iTLB-load-misses
242363 ± 4% -42.8% 138589 ± 4% perf-stat.ps.iTLB-loads
2.301e+11 -17.7% 1.894e+11 ± 2% perf-stat.ps.instructions
93314496 -17.8% 76668000 ± 2% perf-stat.ps.minor-faults
3997637 +41.4% 5652936 ± 2% perf-stat.ps.node-load-misses
22900278 ± 17% +42.0% 32514566 ± 2% perf-stat.ps.node-loads
26832423 -15.5% 22676967 perf-stat.ps.node-store-misses
94867767 -17.8% 78025979 ± 2% perf-stat.ps.node-stores
93314496 -17.8% 76668001 ± 2% perf-stat.ps.page-faults
6.967e+13 -17.6% 5.738e+13 ± 2% perf-stat.total.instructions
87.22 -6.8 80.39 perf-profile.calltrace.cycles-pp.page_fault.testcase
10.71 ± 2% -2.3 8.37 ± 2% perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
10.27 ± 2% -2.2 8.02 ± 2% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
7.33 ± 2% -2.2 5.09 ± 3% perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas
9.52 ± 2% -2.1 7.43 ± 3% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
6.29 ± 3% -2.0 4.25 ± 4% perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_remove_rmap.zap_pte_range.unmap_page_range
8.77 ± 3% -1.9 6.85 ± 2% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
7.51 ± 2% -1.7 5.83 ± 2% perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault
6.43 ± 3% -1.4 5.00 ± 2% perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
7.33 ± 2% -1.1 6.23 ± 3% perf-profile.calltrace.cycles-pp.__count_memcg_events.handle_mm_fault.do_user_addr_fault.page_fault.testcase
3.22 -0.8 2.39 ± 2% perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
1.89 ± 3% -0.8 1.09 ± 5% perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
0.98 ± 2% -0.4 0.61 ± 4% perf-profile.calltrace.cycles-pp.unlock_page.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
1.32 ± 2% -0.4 0.96 ± 3% perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.page_fault.testcase
3.30 ± 2% -0.3 2.99 ± 2% perf-profile.calltrace.cycles-pp.lock_page_memcg.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault
0.96 ± 2% -0.3 0.67 ± 3% perf-profile.calltrace.cycles-pp.down_read_trylock.do_user_addr_fault.page_fault.testcase
1.20 -0.3 0.94 ± 3% perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.page_fault.testcase
1.22 -0.2 1.01 ± 2% perf-profile.calltrace.cycles-pp.__perf_sw_event.page_fault.testcase
1.03 -0.2 0.83 ± 2% perf-profile.calltrace.cycles-pp.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
0.69 -0.2 0.53 ± 2% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
0.91 -0.2 0.76 ± 2% perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode.testcase
0.72 ± 3% -0.1 0.58 ± 3% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.page_fault.testcase
0.75 ± 2% -0.1 0.61 ± 2% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.page_fault.testcase
0.99 ± 2% -0.0 0.94 ± 2% perf-profile.calltrace.cycles-pp.lock_page_memcg.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas
1.32 ± 3% +0.3 1.62 ± 2% perf-profile.calltrace.cycles-pp.xas_load.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault
9.98 ± 2% +0.9 10.88 ± 2% perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
42.89 +1.6 44.49 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.page_fault.testcase
33.71 +3.0 36.76 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.page_fault.testcase
31.29 +3.5 34.83 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.page_fault
16.85 +6.8 23.66 ± 2% perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
16.62 +6.9 23.49 ± 2% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
13.38 +8.0 21.37 ± 2% perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
76.90 +8.0 84.93 perf-profile.calltrace.cycles-pp.testcase
14.88 ± 2% -3.9 10.95 ± 3% perf-profile.children.cycles-pp.__mod_lruvec_state
69.38 -3.6 65.83 perf-profile.children.cycles-pp.page_fault
12.73 ± 3% -3.5 9.26 ± 3% perf-profile.children.cycles-pp.__mod_memcg_state
10.72 ± 2% -2.3 8.38 ± 2% perf-profile.children.cycles-pp.__do_fault
10.29 ± 2% -2.3 8.04 ± 2% perf-profile.children.cycles-pp.shmem_fault
9.55 ± 2% -2.1 7.45 ± 3% perf-profile.children.cycles-pp.shmem_getpage_gfp
8.83 ± 3% -1.9 6.90 ± 2% perf-profile.children.cycles-pp.find_lock_entry
7.33 ± 2% -1.1 6.24 ± 3% perf-profile.children.cycles-pp.__count_memcg_events
3.29 -0.8 2.44 ± 2% perf-profile.children.cycles-pp.fault_dirty_shared_page
1.93 ± 3% -0.8 1.12 ± 5% perf-profile.children.cycles-pp._raw_spin_lock
4.25 -0.7 3.52 perf-profile.children.cycles-pp.sync_regs
2.42 -0.5 1.95 ± 2% perf-profile.children.cycles-pp.__perf_sw_event
4.30 ± 2% -0.4 3.93 ± 2% perf-profile.children.cycles-pp.lock_page_memcg
0.98 ± 2% -0.4 0.61 ± 4% perf-profile.children.cycles-pp.unlock_page
1.32 ± 2% -0.4 0.96 ± 3% perf-profile.children.cycles-pp.up_read
1.55 ± 2% -0.3 1.25 ± 2% perf-profile.children.cycles-pp.___perf_sw_event
0.96 ± 2% -0.3 0.67 ± 3% perf-profile.children.cycles-pp.down_read_trylock
1.04 -0.3 0.79 ± 2% perf-profile.children.cycles-pp.set_page_dirty
1.09 -0.2 0.87 ± 2% perf-profile.children.cycles-pp.page_mapping
1.05 -0.2 0.86 ± 2% perf-profile.children.cycles-pp.file_update_time
0.71 -0.2 0.54 ± 2% perf-profile.children.cycles-pp.tlb_flush_mmu
0.54 ± 2% -0.2 0.39 ± 2% perf-profile.children.cycles-pp.release_pages
0.91 -0.2 0.76 ± 2% perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.59 -0.1 0.45 ± 3% perf-profile.children.cycles-pp.find_vma
0.75 -0.1 0.62 ± 2% perf-profile.children.cycles-pp.__mod_node_page_state
0.66 ± 2% -0.1 0.53 ± 4% perf-profile.children.cycles-pp.current_time
0.50 ± 2% -0.1 0.38 ± 4% perf-profile.children.cycles-pp.vmacache_find
0.70 ± 6% -0.1 0.60 ± 6% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
0.65 ± 6% -0.1 0.55 ± 6% perf-profile.children.cycles-pp.hrtimer_interrupt
0.75 ± 5% -0.1 0.66 ± 5% perf-profile.children.cycles-pp.apic_timer_interrupt
0.50 ± 2% -0.1 0.41 ± 2% perf-profile.children.cycles-pp.do_page_fault
0.52 -0.1 0.43 ± 2% perf-profile.children.cycles-pp.___might_sleep
0.35 -0.1 0.27 ± 3% perf-profile.children.cycles-pp.__set_page_dirty_no_writeback
0.37 ± 5% -0.1 0.30 ± 8% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.32 ± 6% -0.1 0.25 ± 9% perf-profile.children.cycles-pp.tick_sched_timer
0.41 ± 2% -0.1 0.35 ± 2% perf-profile.children.cycles-pp.prepare_exit_to_usermode
0.32 -0.1 0.26 ± 2% perf-profile.children.cycles-pp.mark_page_accessed
0.29 ± 3% -0.1 0.23 ± 4% perf-profile.children.cycles-pp.__might_sleep
0.49 ± 3% -0.0 0.45 ± 3% perf-profile.children.cycles-pp.__unlock_page_memcg
0.16 ± 2% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.page_rmapping
0.20 ± 3% -0.0 0.16 ± 5% perf-profile.children.cycles-pp.perf_exclude_event
0.20 ± 3% -0.0 0.17 ± 3% perf-profile.children.cycles-pp.__alloc_pages_nodemask
0.20 ± 2% -0.0 0.17 ± 3% perf-profile.children.cycles-pp.pte_alloc_one
0.16 ± 2% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.PageHuge
0.23 ± 2% -0.0 0.19 ± 2% perf-profile.children.cycles-pp._cond_resched
0.13 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.__memcg_kmem_charge_page
0.15 ± 4% -0.0 0.12 ± 5% perf-profile.children.cycles-pp.perf_swevent_event
0.14 ± 3% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.vm_normal_page
0.08 ± 4% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.page_counter_try_charge
0.12 ± 3% -0.0 0.10 ± 4% perf-profile.children.cycles-pp.rcu_all_qs
0.09 ± 2% -0.0 0.07 ± 5% perf-profile.children.cycles-pp.__memcg_kmem_charge
1.34 ± 3% +0.3 1.64 ± 2% perf-profile.children.cycles-pp.xas_load
10.05 ± 2% +0.9 10.94 ± 2% perf-profile.children.cycles-pp.page_remove_rmap
42.95 +1.6 44.54 perf-profile.children.cycles-pp.handle_mm_fault
33.75 +3.0 36.79 perf-profile.children.cycles-pp.__handle_mm_fault
31.37 +3.5 34.89 perf-profile.children.cycles-pp.do_fault
9.24 ± 2% +4.7 13.94 ± 3% perf-profile.children.cycles-pp.native_irq_return_iret
16.89 +6.8 23.68 ± 2% perf-profile.children.cycles-pp.finish_fault
16.67 +6.9 23.52 ± 2% perf-profile.children.cycles-pp.alloc_set_pte
13.42 +8.0 21.41 ± 2% perf-profile.children.cycles-pp.page_add_file_rmap
19.39 -3.5 15.91 ± 2% perf-profile.self.cycles-pp.testcase
12.63 ± 3% -3.4 9.19 ± 3% perf-profile.self.cycles-pp.__mod_memcg_state
7.32 ± 2% -1.1 6.22 ± 3% perf-profile.self.cycles-pp.__count_memcg_events
2.50 ± 2% -1.1 1.40 ± 6% perf-profile.self.cycles-pp.find_lock_entry
1.90 ± 3% -0.8 1.10 ± 5% perf-profile.self.cycles-pp._raw_spin_lock
4.24 -0.7 3.51 perf-profile.self.cycles-pp.sync_regs
2.00 ± 2% -0.5 1.51 ± 2% perf-profile.self.cycles-pp.zap_pte_range
2.34 -0.5 1.87 ± 2% perf-profile.self.cycles-pp.__handle_mm_fault
0.95 ± 2% -0.4 0.59 ± 4% perf-profile.self.cycles-pp.unlock_page
4.24 ± 2% -0.4 3.88 ± 2% perf-profile.self.cycles-pp.lock_page_memcg
1.30 ± 2% -0.3 0.95 ± 3% perf-profile.self.cycles-pp.up_read
1.37 ± 3% -0.3 1.06 ± 5% perf-profile.self.cycles-pp.__mod_lruvec_state
1.06 ± 2% -0.3 0.77 ± 2% perf-profile.self.cycles-pp.alloc_set_pte
0.95 ± 2% -0.3 0.66 ± 3% perf-profile.self.cycles-pp.down_read_trylock
1.60 -0.3 1.33 ± 2% perf-profile.self.cycles-pp.handle_mm_fault
1.15 -0.3 0.90 ± 2% perf-profile.self.cycles-pp.do_user_addr_fault
1.26 ± 2% -0.2 1.01 ± 3% perf-profile.self.cycles-pp.___perf_sw_event
1.05 -0.2 0.84 ± 2% perf-profile.self.cycles-pp.page_mapping
0.86 -0.2 0.70 ± 2% perf-profile.self.cycles-pp.__perf_sw_event
0.70 ± 5% -0.2 0.54 ± 5% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.74 ± 2% -0.2 0.58 perf-profile.self.cycles-pp.shmem_fault
0.52 ± 2% -0.1 0.38 ± 2% perf-profile.self.cycles-pp.release_pages
0.73 -0.1 0.60 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state
0.75 -0.1 0.63 ± 2% perf-profile.self.cycles-pp.page_fault
0.47 ± 2% -0.1 0.36 ± 3% perf-profile.self.cycles-pp.vmacache_find
0.33 -0.1 0.23 ± 2% perf-profile.self.cycles-pp.set_page_dirty
0.49 ± 2% -0.1 0.41 ± 2% perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode
0.49 -0.1 0.41 ± 2% perf-profile.self.cycles-pp.___might_sleep
0.48 -0.1 0.40 ± 2% perf-profile.self.cycles-pp.do_page_fault
0.40 ± 3% -0.1 0.32 ± 3% perf-profile.self.cycles-pp.file_update_time
0.32 -0.1 0.25 ± 4% perf-profile.self.cycles-pp.fault_dirty_shared_page
0.31 -0.1 0.25 ± 3% perf-profile.self.cycles-pp.__set_page_dirty_no_writeback
0.23 ± 4% -0.1 0.16 ± 3% perf-profile.self.cycles-pp.finish_fault
0.37 ± 3% -0.1 0.30 ± 2% perf-profile.self.cycles-pp.prepare_exit_to_usermode
0.31 -0.1 0.25 ± 2% perf-profile.self.cycles-pp.mark_page_accessed
0.26 ± 2% -0.1 0.21 ± 3% perf-profile.self.cycles-pp.__might_sleep
0.21 ± 3% -0.1 0.16 ± 2% perf-profile.self.cycles-pp.__do_fault
0.23 ± 4% -0.0 0.19 ± 5% perf-profile.self.cycles-pp.current_time
0.13 ± 3% -0.0 0.10 ± 4% perf-profile.self.cycles-pp.page_rmapping
0.17 ± 4% -0.0 0.13 ± 5% perf-profile.self.cycles-pp.perf_exclude_event
0.10 ± 5% -0.0 0.08 ± 4% perf-profile.self.cycles-pp.find_vma
0.12 ± 5% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.perf_swevent_event
0.11 ± 2% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.PageHuge
0.12 ± 3% -0.0 0.10 ± 4% perf-profile.self.cycles-pp.vm_normal_page
0.11 ± 3% -0.0 0.09 ± 4% perf-profile.self.cycles-pp._cond_resched
0.07 ± 5% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.page_counter_try_charge
0.08 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.rcu_all_qs
1.04 ± 2% +0.4 1.40 ± 2% perf-profile.self.cycles-pp.xas_load
1.66 ± 4% +3.2 4.84 ± 5% perf-profile.self.cycles-pp.page_remove_rmap
9.22 ± 2% +4.7 13.93 ± 3% perf-profile.self.cycles-pp.native_irq_return_iret
2.58 ± 2% +9.9 12.51 ± 4% perf-profile.self.cycles-pp.page_add_file_rmap
will-it-scale.per_process_ops
520000 +------------------------------------------------------------------+
500000 |-+ + + + + |
|+++ ++ +++++++++++++++++++ +++++ ++++++ ++ +++++++++++ |
480000 |++ +++: ++ + ++ + ++ + ++ + + |
460000 |-+ + :: + + |
| + |
440000 |-+ |
420000 |-+ O O |
400000 |-+ O O OOOO OOOOOOOOOO OOOOO OOO O OO OOOO|
| O OOOO OOOO O OO OOO O OOOOOOOOOOOO O OOO OOO O O|
380000 |OOOO O O O OO O O |
360000 |OO O O OO |
| O |
340000 |-+ O |
320000 +------------------------------------------------------------------+
will-it-scale.workload
1e+08 +-----------------------------------------------------------------+
| + + + + |
9.5e+07 |+++ + + +++++++++++++++ ++ + +++ ++++++++ +++ +++++++ |
9e+07 |++ +++:+++ + ++ + ++ + + + + |
| : |
8.5e+07 |-+ + |
| |
8e+07 |-+ OO O O O |
| O O O OOOOOOOOOOOOOOOOO O OOOOOOOOO O OOOOOOOOO|
7.5e+07 |-+ O OOOO OOOO O OOOOO O O OOOOOO OO|
7e+07 |OOO OO OO O |
|OO O O OO |
6.5e+07 |-+ O |
| |
6e+07 +-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-skl-fpga01: 104 threads Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-skl-fpga01/lru-shm/vm-scalability/0x2006906
commit:
6aeff241fe ("mm/migrate.c: migrate PG_readahead flag")
dcdf11ee14 ("mm, shmem: add vmstat for hugepage fallback")
6aeff241fe6c4561 dcdf11ee144133328664d90836e
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:2 50% 1:20 dmesg.WARNING:at#for_ip_interrupt_entry/0x
:2 50% 1:20 last_state.booting
:2 50% 1:20 last_state.is_incomplete_run
1:2 408% 10:20 perf-profile.calltrace.cycles-pp.sync_regs.error_entry
:2 420% 8:20 perf-profile.calltrace.cycles-pp.sync_regs.error_entry.do_access
3:2 875% 21:20 perf-profile.calltrace.cycles-pp.error_entry
0:2 914% 19:20 perf-profile.calltrace.cycles-pp.error_entry.do_access
5:2 1972% 44:20 perf-profile.children.cycles-pp.error_entry
2:2 956% 21:20 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
0.54 ± 13% -0.1 0.45 ± 20% perf-profile.children.cycles-pp.find_get_entries
0.60 ± 17% -0.1 0.51 ± 20% perf-profile.children.cycles-pp.clockevents_program_event
0.29 -0.1 0.21 ± 17% perf-profile.children.cycles-pp.find_vma
0.28 ± 5% -0.1 0.19 ± 17% perf-profile.children.cycles-pp.vmacache_find
0.18 ± 8% -0.1 0.13 ± 24% perf-profile.children.cycles-pp.truncate_cleanup_page
0.12 ± 28% -0.0 0.09 ± 32% perf-profile.children.cycles-pp.tick_irq_enter
0.08 ± 5% -0.0 0.07 ± 12% perf-profile.children.cycles-pp.perf_exclude_event
0.76 ± 7% -0.1 0.64 ± 20% perf-profile.self.cycles-pp.release_pages
0.74 ± 8% -0.1 0.62 ± 13% perf-profile.self.cycles-pp.__handle_mm_fault
0.43 ± 12% -0.1 0.32 ± 41% perf-profile.self.cycles-pp.zap_pte_range
0.70 ± 12% -0.1 0.61 ± 13% perf-profile.self.cycles-pp.__pagevec_lru_add_fn
0.48 ± 14% -0.1 0.39 ± 20% perf-profile.self.cycles-pp.find_get_entries
0.27 ± 3% -0.1 0.19 ± 17% perf-profile.self.cycles-pp.vmacache_find
0.43 ± 8% -0.1 0.37 ± 11% perf-profile.self.cycles-pp.rmqueue
0.17 ± 11% -0.1 0.12 ± 24% perf-profile.self.cycles-pp.truncate_cleanup_page
0.26 ± 5% -0.0 0.22 ± 12% perf-profile.self.cycles-pp.do_user_addr_fault
0.21 ± 11% -0.0 0.18 ± 14% perf-profile.self.cycles-pp.do_fault
0.18 ± 11% -0.0 0.15 ± 15% perf-profile.self.cycles-pp.___might_sleep
0.08 ± 6% -0.0 0.06 ± 12% perf-profile.self.cycles-pp.perf_exclude_event
19.96 -1.0% 19.75 boot-time.dhcp
8812 ± 4% -5.0% 8370 ± 5% numa-meminfo.node1.KernelStack
8800 ± 4% -4.8% 8373 ± 5% numa-vmstat.node1.nr_kernel_stack
53.50 ± 5% +67.3% 89.51 ± 18% sched_debug.cfs_rq:/.nr_spread_over.max
828.33 ± 9% -15.6% 699.50 ± 14% sched_debug.cfs_rq:/.util_est_enqueued.max
1361213 -6.5% 1273045 ± 4% perf-stat.i.node-loads
59.25 +2.9 62.17 ± 3% perf-stat.overall.node-load-miss-rate%
1359101 -6.4% 1271647 ± 3% perf-stat.ps.node-loads
6903 ± 5% -9.2% 6270 ± 11% slabinfo.eventpoll_pwq.active_objs
6903 ± 5% -9.2% 6270 ± 11% slabinfo.eventpoll_pwq.num_objs
1726 ± 3% -7.3% 1600 ± 6% slabinfo.pool_workqueue.num_objs
105593 ± 5% -9.0% 96123 ± 7% softirqs.CPU30.TIMER
27563 ± 5% -10.4% 24703 ± 6% softirqs.CPU35.SCHED
102935 ± 3% -7.5% 95173 ± 5% softirqs.CPU35.TIMER
26724 ± 7% -11.1% 23757 ± 5% softirqs.CPU38.SCHED
101664 ± 3% -6.5% 95094 ± 6% softirqs.CPU48.TIMER
25948 ± 3% -6.0% 24399 ± 5% softirqs.CPU49.SCHED
24817 ± 6% -7.7% 22911 ± 3% softirqs.CPU53.SCHED
106024 ± 13% -20.5% 84307 ± 15% softirqs.CPU62.TIMER
28303 ± 11% -13.7% 24432 ± 6% softirqs.CPU80.SCHED
15005 ± 16% -19.9% 12017 ± 6% softirqs.CPU82.RCU
26967 ± 8% -11.5% 23853 ± 6% softirqs.CPU92.SCHED
25547 ± 4% -6.1% 23979 ± 2% softirqs.CPU97.SCHED
25487 ± 3% -7.1% 23687 ± 4% softirqs.CPU98.SCHED
6613 ± 40% -77.3% 1500 ±144% interrupts.CPU35.RES:Rescheduling_interrupts
3834 ± 91% -92.0% 307.00 ± 83% interrupts.CPU38.RES:Rescheduling_interrupts
2074 ± 28% -23.3% 1590 ± 22% interrupts.CPU4.NMI:Non-maskable_interrupts
2074 ± 28% -23.3% 1590 ± 22% interrupts.CPU4.PMI:Performance_monitoring_interrupts
4380 ± 68% -69.9% 1320 ± 18% interrupts.CPU46.NMI:Non-maskable_interrupts
4380 ± 68% -69.9% 1320 ± 18% interrupts.CPU46.PMI:Performance_monitoring_interrupts
4634 ± 95% -84.8% 704.00 ±147% interrupts.CPU53.RES:Rescheduling_interrupts
2037 ± 26% -32.4% 1376 ± 20% interrupts.CPU64.NMI:Non-maskable_interrupts
2037 ± 26% -32.4% 1376 ± 20% interrupts.CPU64.PMI:Performance_monitoring_interrupts
3381 ± 94% -94.6% 184.05 ± 50% interrupts.CPU72.RES:Rescheduling_interrupts
5218 ± 95% -86.6% 700.84 ±174% interrupts.CPU88.RES:Rescheduling_interrupts
5814 ± 80% -75.4% 1429 ±170% interrupts.CPU92.RES:Rescheduling_interrupts
1957 ± 67% -87.6% 242.00 ± 83% interrupts.CPU97.RES:Rescheduling_interrupts
1496 ± 53% -82.2% 266.58 ± 80% interrupts.CPU98.RES:Rescheduling_interrupts
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.6.0-11465-gdcdf11ee14413" of type "text/plain" (156771 bytes)
View attachment "job-script" of type "text/plain" (7632 bytes)
View attachment "job.yaml" of type "text/plain" (5297 bytes)
View attachment "reproduce" of type "text/plain" (344 bytes)