Message-ID: <20171204031231.GK25368@yexl-desktop>
Date: Mon, 4 Dec 2017 11:12:31 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Jan Kara <jack@...e.cz>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
Dave Chinner <david@...morbit.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [mm] 4af77b68c2: vm-scalability.throughput -9.0% regression
Greetings,
FYI, we noticed a -9.0% regression of vm-scalability.throughput due to commit:
commit: 4af77b68c2c6280230daf53fe8f13db706858187 ("mm: readahead: increase maximum readahead window")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: vm-scalability
on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 8G memory
with the following parameters:
runtime: 300s
test: lru-file-readonce
cpufreq_governor: performance
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/300s/lkp-ivb-d02/lru-file-readonce/vm-scalability
commit:
214c9de472 ("mm/madvise: enable soft offline of HugeTLB pages at PUD level")
4af77b68c2 ("mm: readahead: increase maximum readahead window")
214c9de472b94aa8  4af77b68c2c6280230daf53fe8
----------------  --------------------------
base value ± %stddev   %change   new value ± %stddev   metric
5587932 -9.0% 5085253 vm-scalability.throughput
1398617 -8.9% 1273916 vm-scalability.median
307.25 +2.0% 313.50 vm-scalability.time.percent_of_cpu_this_job_got
888.71 +2.6% 911.83 vm-scalability.time.system_time
42.20 -11.3% 37.42 vm-scalability.time.user_time
1.676e+09 -9.0% 1.526e+09 vm-scalability.workload
4415 ± 6% +46.1% 6451 ± 16% slabinfo.kmalloc-32.active_objs
4415 ± 6% +46.4% 6463 ± 15% slabinfo.kmalloc-32.num_objs
416118 -9.9% 374847 softirqs.RCU
586589 ± 3% -6.2% 550233 ± 4% softirqs.TIMER
4099 ± 2% -50.9% 2013 ± 28% proc-vmstat.allocstall_movable
4556 +33.4% 6079 ± 5% proc-vmstat.kswapd_low_wmark_hit_quickly
4558 +33.4% 6080 ± 5% proc-vmstat.pageoutrun
1819820 ± 2% -50.8% 896227 ± 24% proc-vmstat.pgscan_direct
1819820 ± 2% -50.8% 896227 ± 24% proc-vmstat.pgsteal_direct
49555 ± 41% +91.1% 94716 ± 18% sched_debug.cfs_rq:/.MIN_vruntime.avg
198221 ± 41% +91.1% 378866 ± 18% sched_debug.cfs_rq:/.MIN_vruntime.max
85832 ± 41% +91.1% 164054 ± 18% sched_debug.cfs_rq:/.MIN_vruntime.stddev
241173 ± 11% -22.8% 186090 ± 10% sched_debug.cfs_rq:/.load.min
392.96 ± 16% -28.3% 281.92 ± 18% sched_debug.cfs_rq:/.load_avg.min
49555 ± 41% +91.1% 94716 ± 18% sched_debug.cfs_rq:/.max_vruntime.avg
198222 ± 41% +91.1% 378867 ± 18% sched_debug.cfs_rq:/.max_vruntime.max
85833 ± 41% +91.1% 164054 ± 18% sched_debug.cfs_rq:/.max_vruntime.stddev
13913 ± 16% +58.1% 22002 ± 25% sched_debug.cfs_rq:/.min_vruntime.stddev
13913 ± 16% +58.1% 22001 ± 25% sched_debug.cfs_rq:/.spread0.stddev
241173 ± 11% -22.8% 186108 ± 10% sched_debug.cpu.load.min
2.07 ± 4% -16.1% 1.74 ± 10% sched_debug.cpu.nr_running.avg
5.284e+11 -8.7% 4.823e+11 perf-stat.branch-instructions
0.35 -0.0 0.31 ± 4% perf-stat.branch-miss-rate%
1.864e+09 -19.8% 1.494e+09 ± 3% perf-stat.branch-misses
62.27 +5.4 67.71 perf-stat.cache-miss-rate%
2.965e+10 +11.8% 3.314e+10 perf-stat.cache-misses
4.762e+10 +2.8% 4.894e+10 perf-stat.cache-references
9160936 +1.2% 9268581 perf-stat.context-switches
1.41 +9.9% 1.55 perf-stat.cpi
19670 -3.1% 19055 perf-stat.cpu-migrations
8.242e+11 -8.9% 7.512e+11 perf-stat.dTLB-loads
0.14 ± 4% -0.0 0.11 ± 10% perf-stat.dTLB-store-miss-rate%
8.942e+08 ± 4% -25.3% 6.682e+08 ± 9% perf-stat.dTLB-store-misses
6.421e+11 -8.9% 5.851e+11 perf-stat.dTLB-stores
93.94 -1.4 92.56 perf-stat.iTLB-load-miss-rate%
4.418e+08 ± 3% -17.8% 3.631e+08 ± 10% perf-stat.iTLB-load-misses
2.77e+12 -8.8% 2.526e+12 perf-stat.instructions
0.71 -9.0% 0.64 perf-stat.ipc
39.34 ± 2% -4.5 34.84 perf-profile.calltrace.cycles-pp.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter
39.34 ± 2% -4.5 34.85 perf-profile.calltrace.cycles-pp.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read
31.06 ± 2% -3.3 27.80 perf-profile.calltrace.cycles-pp.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read
17.78 ± 2% -2.1 15.64 perf-profile.calltrace.cycles-pp.do_mpage_readpage.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
12.68 -1.3 11.36 perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
7.54 -0.9 6.68 perf-profile.calltrace.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.mpage_readpages.__do_page_cache_readahead.ondemand_readahead
5.66 ± 3% -0.6 5.05 ± 3% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read
6.79 ± 6% -0.5 6.33 perf-profile.calltrace.cycles-pp.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node
17.75 ± 6% -0.1 17.60 perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.kswapd.kthread.ret_from_fork
18.09 ± 6% -0.1 17.94 perf-profile.calltrace.cycles-pp.kswapd.kthread.ret_from_fork
18.07 ± 6% -0.1 17.93 perf-profile.calltrace.cycles-pp.shrink_node.kswapd.kthread.ret_from_fork
18.22 ± 6% -0.1 18.09 perf-profile.calltrace.cycles-pp.ret_from_fork
18.22 ± 6% -0.1 18.09 perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
17.71 ± 6% -0.1 17.57 perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd.kthread
14.98 ± 6% -0.0 14.93 perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd
69.64 +1.7 71.34 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
65.92 ± 2% +2.2 68.13 perf-profile.calltrace.cycles-pp.sys_read.entry_SYSCALL_64_fastpath
65.47 ± 2% +2.3 67.78 perf-profile.calltrace.cycles-pp.vfs_read.sys_read.entry_SYSCALL_64_fastpath
61.42 ± 2% +2.5 63.92 perf-profile.calltrace.cycles-pp.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read.sys_read
63.02 ± 2% +2.6 65.59 perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
60.15 ± 2% +2.6 62.74 perf-profile.calltrace.cycles-pp.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read.vfs_read
62.43 ± 2% +2.6 65.05 perf-profile.calltrace.cycles-pp.xfs_file_read_iter.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
13.23 +7.3 20.52 perf-profile.calltrace.cycles-pp.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read
11.92 +7.5 19.46 perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter
11.79 ± 2% +7.6 19.38 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read
39.36 ± 2% -4.5 34.85 perf-profile.children.cycles-pp.ondemand_readahead
39.34 ± 2% -4.5 34.84 perf-profile.children.cycles-pp.__do_page_cache_readahead
31.12 ± 2% -3.2 27.88 perf-profile.children.cycles-pp.mpage_readpages
17.93 ± 2% -2.1 15.80 perf-profile.children.cycles-pp.do_mpage_readpage
12.73 -1.3 11.40 perf-profile.children.cycles-pp.add_to_page_cache_lru
7.78 ± 2% -0.9 6.88 perf-profile.children.cycles-pp.__add_to_page_cache_locked
6.09 ± 2% -0.8 5.25 ± 3% perf-profile.children.cycles-pp.__radix_tree_lookup
5.85 ± 2% -0.6 5.24 ± 3% perf-profile.children.cycles-pp.__alloc_pages_nodemask
6.92 ± 6% -0.5 6.45 perf-profile.children.cycles-pp.__remove_mapping
5.02 ± 2% -0.5 4.56 ± 4% perf-profile.children.cycles-pp.get_page_from_freelist
18.11 ± 6% -0.2 17.93 perf-profile.children.cycles-pp.shrink_node
17.78 ± 6% -0.2 17.60 perf-profile.children.cycles-pp.shrink_node_memcg
17.75 ± 6% -0.2 17.58 perf-profile.children.cycles-pp.shrink_inactive_list
18.09 ± 6% -0.1 17.94 perf-profile.children.cycles-pp.kswapd
18.23 ± 6% -0.1 18.09 perf-profile.children.cycles-pp.ret_from_fork
18.22 ± 6% -0.1 18.09 perf-profile.children.cycles-pp.kthread
15.09 ± 6% -0.1 15.03 perf-profile.children.cycles-pp.shrink_page_list
69.83 +1.7 71.55 perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
66.16 ± 2% +2.2 68.33 perf-profile.children.cycles-pp.sys_read
65.64 ± 2% +2.3 67.94 perf-profile.children.cycles-pp.vfs_read
61.57 ± 2% +2.4 64.02 perf-profile.children.cycles-pp.xfs_file_buffered_aio_read
60.36 ± 2% +2.6 62.93 perf-profile.children.cycles-pp.generic_file_read_iter
63.18 ± 2% +2.6 65.76 perf-profile.children.cycles-pp.__vfs_read
62.46 ± 2% +2.6 65.08 perf-profile.children.cycles-pp.xfs_file_read_iter
13.38 +7.3 20.63 perf-profile.children.cycles-pp.copy_page_to_iter
11.96 +7.5 19.48 perf-profile.children.cycles-pp.copyout
11.81 ± 2% +7.6 19.41 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
14.34 ± 2% -1.7 12.68 perf-profile.self.cycles-pp.do_mpage_readpage
6.04 ± 2% -0.8 5.21 ± 3% perf-profile.self.cycles-pp.__radix_tree_lookup
11.69 +7.5 19.23 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
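
The relationships between several rows above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that %change is the relative difference between the two commits' values, that cache-miss-rate% is cache-misses divided by cache-references, and that ipc is the reciprocal of cpi. The function name pct_change is hypothetical, and the inputs are the rounded figures printed in the table, so results can differ from the reported values in the last digit.

```python
def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# Values copied from the table: (base commit 214c9de472, tested commit 4af77b68c2).
throughput = (5587932, 5085253)      # vm-scalability.throughput
cache_misses = (2.965e10, 3.314e10)  # perf-stat.cache-misses
cache_refs = (4.762e10, 4.894e10)    # perf-stat.cache-references
cpi = (1.41, 1.55)                   # perf-stat.cpi

# Assumed definition of the %change column.
throughput_change = pct_change(*throughput)

# Assumed definition of cache-miss-rate%: misses / references, per commit.
miss_rate_base = cache_misses[0] / cache_refs[0] * 100.0
miss_rate_new = cache_misses[1] / cache_refs[1] * 100.0

# Rate rows appear to report a difference in percentage points, not a relative change.
miss_rate_delta = miss_rate_new - miss_rate_base

# Instructions per cycle is the reciprocal of cycles per instruction.
ipc_base = 1.0 / cpi[0]
ipc_new = 1.0 / cpi[1]

if __name__ == "__main__":
    print(f"throughput change: {throughput_change:+.1f}%")
    print(f"cache miss rate:   {miss_rate_base:.2f}% -> {miss_rate_new:.2f}% "
          f"({miss_rate_delta:+.1f} points)")
    print(f"ipc:               {ipc_base:.2f} -> {ipc_new:.2f}")
```

Read this way, the tested commit retires roughly 9% fewer instructions in the same runtime while taking about 12% more cache misses, which is consistent with the higher cpi and the larger share of cycles in copy_user_enhanced_fast_string.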
[ASCII trend plots, one per metric, each plotting bisect-good and bisect-bad samples:
vm-scalability.throughput, vm-scalability.median, vm-scalability.workload,
perf-stat.cache-misses, perf-stat.branch-instructions, perf-stat.dTLB-loads,
perf-stat.dTLB-stores, perf-stat.cache-miss-rate%, perf-stat.ipc, perf-stat.cpi,
vm-scalability.time.percent_of_cpu_this_job_got]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.15.0-rc1-00105-g4af77b6" of type "text/plain" (164439 bytes)
View attachment "job.yaml" of type "text/plain" (4768 bytes)
View attachment "reproduce" of type "text/plain" (1100 bytes)