Message-ID: <20180102031743.GQ3172@yexl-desktop>
Date: Tue, 2 Jan 2018 11:17:44 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Vlastimil Babka <vbabka@...e.cz>,
Jérôme Glisse <jglisse@...hat.com>,
Huang Ying <ying.huang@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Michal Hocko <mhocko@...nel.org>,
Greg Thelen <gthelen@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Balbir Singh <bsingharora@...il.com>,
Minchan Kim <minchan@...nel.org>, Shaohua Li <shli@...com>,
Jan Kara <jack@...e.cz>, Nicholas Piggin <npiggin@...il.com>,
Dan Williams <dan.j.williams@...el.com>,
Mel Gorman <mgorman@...e.de>, Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [mm, mlock, vmscan] 7f2ca91b49: reaim.jobs_per_min -7.4% regression
Greetings,
FYI, we noticed a -7.4% regression of reaim.jobs_per_min due to commit:
commit: 7f2ca91b498654e7e3405f1f76ac5a80c76d336e ("mm, mlock, vmscan: no more skipping pagevecs")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: reaim
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
runtime: 300s
nr_task: 1000
test: mem_rtns_1
cpufreq_governor: performance
test-description: REAIM is an updated and improved version of the AIM 7 benchmark.
test-url: https://sourceforge.net/projects/re-aim-7/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/1000/debian-x86_64-2016-08-31.cgz/300s/lkp-hsw-ep5/mem_rtns_1/reaim
commit:
df0ed1a935 ("mm: use sc->priority for slab shrink targets")
7f2ca91b49 ("mm, mlock, vmscan: no more skipping pagevecs")
df0ed1a935d2fa55 7f2ca91b498654e7e3405f1f76
---------------- --------------------------
%stddev %change %stddev
\ | \
181225 -7.4% 167863 reaim.jobs_per_min
181.22 -7.4% 167.86 reaim.jobs_per_min_child
85.06 -2.0% 83.31 reaim.jti
187067 -8.1% 171983 reaim.max_jobs_per_min
33.11 +8.0% 35.75 reaim.parent_time
14.36 +12.3% 16.12 ± 3% reaim.std_dev_percent
4.30 +20.0% 5.17 ± 2% reaim.std_dev_time
317.29 -4.5% 303.15 reaim.time.elapsed_time
317.29 -4.5% 303.15 reaim.time.elapsed_time.max
4244477 -3.1% 4112587 reaim.time.involuntary_context_switches
1.42e+09 -11.3% 1.26e+09 reaim.time.minor_page_faults
4512 -12.7% 3938 reaim.time.user_time
0.05 ± 22% +0.0 0.07 ± 23% mpstat.cpu.iowait%
1182842 ± 2% -10.6% 1057587 turbostat.C6
1.143e+09 ± 2% -10.7% 1.02e+09 cpuidle.C6.time
1183943 ± 2% -10.6% 1058650 cpuidle.C6.usage
20578 ± 2% +4.9% 21586 vmstat.system.cs
59905 -1.7% 58858 vmstat.system.in
7.263e+08 -11.4% 6.436e+08 numa-numastat.node0.local_node
7.263e+08 -11.4% 6.436e+08 numa-numastat.node0.numa_hit
6.96e+08 -11.0% 6.193e+08 numa-numastat.node1.local_node
6.96e+08 -11.0% 6.193e+08 numa-numastat.node1.numa_hit
360758 -15.3% 305643 meminfo.Active
43539 -100.0% 4.00 meminfo.Active(file)
1143319 -99.1% 10321 meminfo.Inactive
1132616 -100.0% 0.00 meminfo.Inactive(file)
35919 ± 14% -37.0% 22614 ± 3% meminfo.Shmem
184115 ± 4% -24.9% 138319 ± 21% numa-meminfo.node0.Active
21699 ± 4% -100.0% 2.00 ±100% numa-meminfo.node0.Active(file)
575096 ± 2% -99.4% 3258 ±109% numa-meminfo.node0.Inactive
569787 ± 2% -100.0% 0.00 numa-meminfo.node0.Inactive(file)
21838 ± 5% -100.0% 2.00 ±100% numa-meminfo.node1.Active(file)
568212 ± 2% -98.8% 7034 ± 51% numa-meminfo.node1.Inactive
562829 ± 2% -100.0% 0.00 numa-meminfo.node1.Inactive(file)
96.43 ± 3% -18.5% 78.61 ± 4% sched_debug.cfs_rq:/.exec_clock.stddev
77.67 ± 19% -33.9% 51.33 ± 61% sched_debug.cfs_rq:/.removed.util_avg.max
13.81 ± 2% -9.2% 12.54 sched_debug.cpu.nr_running.avg
23.25 ± 10% -15.6% 19.62 ± 2% sched_debug.cpu.nr_running.max
259.54 ± 18% +36.8% 355.08 ± 11% sched_debug.cpu.sched_goidle.min
11994 ± 6% +10.5% 13254 ± 6% sched_debug.cpu.ttwu_count.avg
1400 ± 5% +50.4% 2106 ± 20% sched_debug.cpu.ttwu_count.min
10884 -100.0% 1.00 proc-vmstat.nr_active_file
283153 -100.0% 0.00 proc-vmstat.nr_inactive_file
8982 ± 14% -37.1% 5645 ± 3% proc-vmstat.nr_shmem
10884 -100.0% 1.00 proc-vmstat.nr_zone_active_file
283153 -100.0% 0.00 proc-vmstat.nr_zone_inactive_file
1.422e+09 -11.2% 1.263e+09 proc-vmstat.numa_hit
1.422e+09 -11.2% 1.263e+09 proc-vmstat.numa_local
13178 ± 19% -57.6% 5583 ± 5% proc-vmstat.pgactivate
1.422e+09 -11.2% 1.263e+09 proc-vmstat.pgalloc_normal
1.42e+09 -11.2% 1.261e+09 proc-vmstat.pgfault
1.422e+09 -11.2% 1.263e+09 proc-vmstat.pgfree
5424 ± 4% -100.0% 0.50 ±100% numa-vmstat.node0.nr_active_file
142446 ± 2% -100.0% 0.00 numa-vmstat.node0.nr_inactive_file
5424 ± 4% -100.0% 0.50 ±100% numa-vmstat.node0.nr_zone_active_file
142446 ± 2% -100.0% 0.00 numa-vmstat.node0.nr_zone_inactive_file
3.635e+08 -11.7% 3.209e+08 numa-vmstat.node0.numa_hit
3.635e+08 -11.7% 3.209e+08 numa-vmstat.node0.numa_local
5459 ± 5% -100.0% 0.50 ±100% numa-vmstat.node1.nr_active_file
140707 ± 2% -100.0% 0.00 numa-vmstat.node1.nr_inactive_file
5459 ± 5% -100.0% 0.50 ±100% numa-vmstat.node1.nr_zone_active_file
140707 ± 2% -100.0% 0.00 numa-vmstat.node1.nr_zone_inactive_file
3.495e+08 -11.4% 3.096e+08 numa-vmstat.node1.numa_hit
3.493e+08 -11.4% 3.095e+08 numa-vmstat.node1.numa_local
5.422e+12 -7.6% 5.011e+12 perf-stat.branch-instructions
0.35 -0.0 0.34 perf-stat.branch-miss-rate%
1.876e+10 -10.4% 1.682e+10 perf-stat.branch-misses
3.13 +0.2 3.31 perf-stat.cache-miss-rate%
4.441e+09 -5.1% 4.214e+09 ± 2% perf-stat.cache-misses
1.419e+11 -10.3% 1.273e+11 perf-stat.cache-references
1.72 +4.8% 1.81 perf-stat.cpi
4.338e+13 -3.5% 4.185e+13 perf-stat.cpu-cycles
6.251e+12 -7.9% 5.754e+12 perf-stat.dTLB-loads
5.035e+09 -11.3% 4.465e+09 perf-stat.dTLB-store-misses
3.99e+12 -11.0% 3.549e+12 perf-stat.dTLB-stores
7.094e+09 -10.9% 6.321e+09 perf-stat.iTLB-loads
2.518e+13 -8.0% 2.317e+13 perf-stat.instructions
0.58 -4.6% 0.55 perf-stat.ipc
1.42e+09 -11.2% 1.261e+09 perf-stat.minor-faults
2.306e+09 ± 5% -9.0% 2.099e+09 ± 3% perf-stat.node-stores
1.42e+09 -11.2% 1.261e+09 perf-stat.page-faults
81.49 -66.7 14.75 ± 22% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
81.36 -66.7 14.69 ± 22% perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath
80.05 -65.5 14.50 ± 22% perf-profile.calltrace.cycles-pp.do_munmap.sys_brk.entry_SYSCALL_64_fastpath
74.39 -60.7 13.65 ± 22% perf-profile.calltrace.cycles-pp.unmap_region.do_munmap.sys_brk.entry_SYSCALL_64_fastpath
65.45 -3.8 61.61 perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_munmap.sys_brk.entry_SYSCALL_64_fastpath
65.30 -3.8 61.48 perf-profile.calltrace.cycles-pp.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region.do_munmap.sys_brk
63.63 -3.6 60.03 perf-profile.calltrace.cycles-pp.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region.do_munmap
63.39 -3.6 59.81 perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region
59.52 -3.1 56.46 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu
58.70 -2.9 55.79 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu
11.82 +0.0 11.82 perf-profile.calltrace.cycles-pp.page_fault
11.71 +0.0 11.72 perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
11.72 +0.0 11.73 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
9.99 +0.2 10.14 perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
9.68 +0.2 9.88 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
5.26 +0.9 6.20 perf-profile.calltrace.cycles-pp.lru_add_drain.unmap_region.do_munmap.sys_brk.entry_SYSCALL_64_fastpath
4.62 +0.9 5.56 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain.unmap_region
5.22 +1.0 6.18 perf-profile.calltrace.cycles-pp.lru_add_drain_cpu.lru_add_drain.unmap_region.do_munmap.sys_brk
4.55 +1.0 5.51 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain
5.08 +1.0 6.05 perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain.unmap_region.do_munmap
0.00 +57.3 57.28 ± 4% perf-profile.calltrace.cycles-pp.unmap_region.do_munmap.sys_brk.entry_SYSCALL_64_fastpath.brk
0.00 +61.3 61.34 ± 4% perf-profile.calltrace.cycles-pp.do_munmap.sys_brk.entry_SYSCALL_64_fastpath.brk
0.00 +62.3 62.30 ± 4% perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath.brk
0.00 +62.4 62.36 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath.brk
0.00 +65.0 64.95 ± 4% perf-profile.calltrace.cycles-pp.brk
81.45 -4.4 77.05 perf-profile.children.cycles-pp.sys_brk
81.53 -4.3 77.20 perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
80.18 -4.2 75.98 perf-profile.children.cycles-pp.do_munmap
65.46 -3.8 61.63 perf-profile.children.cycles-pp.tlb_finish_mmu
65.34 -3.8 61.52 perf-profile.children.cycles-pp.arch_tlb_finish_mmu
63.69 -3.6 60.08 perf-profile.children.cycles-pp.tlb_flush_mmu_free
63.71 -3.6 60.10 perf-profile.children.cycles-pp.release_pages
74.47 -3.5 71.01 perf-profile.children.cycles-pp.unmap_region
67.37 -1.6 65.74 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
66.42 -1.4 64.99 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
11.84 +0.0 11.84 perf-profile.children.cycles-pp.page_fault
11.78 +0.0 11.79 perf-profile.children.cycles-pp.do_page_fault
11.80 +0.0 11.81 perf-profile.children.cycles-pp.__do_page_fault
10.04 +0.2 10.20 perf-profile.children.cycles-pp.handle_mm_fault
9.75 +0.2 9.94 perf-profile.children.cycles-pp.__handle_mm_fault
5.37 +0.9 6.30 perf-profile.children.cycles-pp.lru_add_drain
5.32 +1.0 6.27 perf-profile.children.cycles-pp.lru_add_drain_cpu
8.98 +1.6 10.62 perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.00 +65.0 64.95 ± 4% perf-profile.children.cycles-pp.brk
66.42 -1.4 64.99 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
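The comparison rows above can be sanity-checked with a few lines of Python. This is only a sketch and not part of lkp-tests: parse_row() and recompute_change() are hypothetical helpers, and the row layout they assume (base value, optional "± N%" stddev, change, new value, optional "± N%" stddev, metric name) is inferred from the rows in this report. Rows whose change column carries no "%" sign (mpstat.cpu.iowait%, the perf-profile entries) appear to report an absolute difference in percentage points; that reading is also an assumption.

```python
import re

# Assumed row layout (inferred from this report, not from lkp-tests itself):
#   base [± sd%] change[%] new [± sd%] metric
ROW = re.compile(
    r"^\s*(?P<base>\S+)"
    r"(?:\s+±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-]\d+(?:\.\d+)?)(?P<pct>%?)"
    r"\s+(?P<new>\S+)"
    r"(?:\s+±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Split one comparison row into its numeric fields (hypothetical helper)."""
    m = ROW.match(line)
    if m is None:
        raise ValueError("unrecognised row: %r" % line)
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        # True when the change column ends in '%', i.e. a relative change.
        "relative": m["pct"] == "%",
    }


def recompute_change(row):
    """Recompute the change column from the base and new values."""
    delta = row["new"] - row["base"]
    if row["relative"]:
        return delta / row["base"] * 100.0
    # Assumption: rows without '%' report an absolute difference.
    return delta


if __name__ == "__main__":
    rows = (
        "181225 -7.4% 167863 reaim.jobs_per_min",
        "14.36 +12.3% 16.12 ± 3% reaim.std_dev_percent",
        "0.05 ± 22% +0.0 0.07 ± 23% mpstat.cpu.iowait%",
    )
    for line in rows:
        row = parse_row(line)
        print(row["metric"], row["change"], round(recompute_change(row), 1))

    # perf-stat.ipc should equal instructions / cpu-cycles for each kernel.
    print(round(2.518e13 / 4.338e13, 2), round(2.317e13 / 4.185e13, 2))
```

Because the table prints rounded means, a recomputed value can differ from the reported change in the last digit, so a small mismatch does not by itself indicate a problem.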
[ASCII plots omitted. One plot per metric, y-axis range in parentheses:]
reaim.parent_time (32.5 to 36.5)
reaim.child_systime (1320 to 1540)
reaim.jobs_per_min (164000 to 184000)
reaim.jobs_per_min_child (164 to 184)
reaim.std_dev_time (4.2 to 5.4)
reaim.max_jobs_per_min (168000 to 190000)
perf-stat.instructions (2.3e+13 to 2.55e+13)
perf-stat.cache-references (1.25e+11 to 1.5e+11)
perf-stat.branch-instructions (4.95e+12 to 5.45e+12)
perf-stat.branch-misses (1.65e+10 to 1.95e+10)
perf-stat.dTLB-loads (5.7e+12 to 6.4e+12)
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.15.0-rc3-00107-g7f2ca91" of type "text/plain" (163758 bytes)
View attachment "job-script" of type "text/plain" (6797 bytes)
View attachment "job.yaml" of type "text/plain" (4450 bytes)
View attachment "reproduce" of type "text/plain" (1324 bytes)