Message-ID: <20171031103957.GA16011@yexl-desktop>
Date: Tue, 31 Oct 2017 18:39:57 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Ross Zwisler <ross.zwisler@...ux.intel.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Jan Kara <jack@...e.cz>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
Theodore Ts'o <tytso@....edu>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Christoph Hellwig <hch@....de>,
Dan Williams <dan.j.williams@...el.com>,
Dave Chinner <david@...morbit.com>,
Ingo Molnar <mingo@...hat.com>,
Jonathan Corbet <corbet@....net>,
Matthew Wilcox <mawilcox@...rosoft.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [dax] 91d25ba8a6: fio.read_clat_90%_us +120% regression
Greetings,
FYI, we noticed a 120% regression of fio.read_clat_90%_us due to commit:
commit: 91d25ba8a6b0d810dc844cebeedc53029118ce3e ("dax: use common 4k zero page for dax mmap reads")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
        disk: 2pmem
        fs: xfs
        mount_option: dax
        runtime: 200s
        nr_task: 50%
        time_based: tb
        rw: rw
        bs: 4k
        ioengine: mmap
        test_size: 200G
        cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
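For orientation: the path_params string in the results header appears to be the parameter values listed above joined with hyphens, and nr_task: 50% on the 56-thread test machine would correspond to 28 tasks. The Python sketch below illustrates both. The helper names are hypothetical, and reading the percentage as a share of hardware threads is an assumption that this report does not state.

```python
# Parameter values copied from the report, in the order they are listed.
PARAMS = {
    "disk": "2pmem",
    "fs": "xfs",
    "mount_option": "dax",
    "runtime": "200s",
    "nr_task": "50%",
    "time_based": "tb",
    "rw": "rw",
    "bs": "4k",
    "ioengine": "mmap",
    "test_size": "200G",
    "cpufreq_governor": "performance",
}


def path_params(params):
    """Join the parameter values with hyphens (dicts keep insertion order)."""
    return "-".join(params.values())


def task_count(nr_task, hw_threads):
    """Resolve nr_task to a task count.

    Assumption: a value ending in '%' is a percentage of hardware threads.
    """
    if nr_task.endswith("%"):
        return hw_threads * int(nr_task[:-1]) // 100
    return int(nr_task)


if __name__ == "__main__":
    print(path_params(PARAMS))
    print(task_count(PARAMS["nr_task"], 56))
```

The joined string can be compared against the path_params field of the results header to confirm that the table belongs to this parameter set.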
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run job.yaml
testcase/path_params/tbox_group/run: fio-basic/2pmem-xfs-dax-200s-50%-tb-rw-4k-mmap-200G-performance/lkp-hsw-ep6
e30331ff05f689f8  91d25ba8a6b0d810dc844cebee
----------------  --------------------------
     value stdev  change       value stdev  metric
     22.66          159%       58.69        fio.latency_20us%
         5          120%          11        fio.read_clat_90%_us
      2.26           87%        4.23        fio.read_clat_mean_us
      3.04 ± 12%     86%        5.65 ±  3%  fio.read_clat_stddev
         6 ±  6%     78%          12        fio.read_clat_95%_us
        10 ±  4%     59%          16 ±  5%  fio.read_clat_99%_us
     26.43           19%       31.48        fio.latency_2us%
      4796                      4876        fio.time.system_time
     28491 ±  3%     -5%       27060        fio.time.voluntary_context_switches
       819          -10%         739        fio.time.user_time
     14.38          -15%       12.25        fio.write_clat_mean_us
        30          -25%          23        fio.write_clat_99%_us
      7.66          -31%        5.29 ± 10%  fio.write_clat_stddev
        24          -35%          16        fio.write_clat_95%_us
        22          -36%          14 ±  3%  fio.write_clat_90%_us
 6.417e+08          -31%   4.446e+08        fio.time.minor_page_faults
     25.31 ±  3%    -67%        8.23 ±  7%  fio.latency_10us%
     12.87 ± 10%    -88%        1.53 ±  9%  fio.latency_50us%
     12.72 ±  3%   -100%        0.06 ± 20%  fio.latency_4us%
      1785 ±  3%    -10%        1615        vmstat.system.cs
       551           10%         604        turbostat.Avg_MHz
       123            8%         132        turbostat.PkgWatt
     25.47                     24.88        turbostat.RAMWatt
      3300 ±129%  -3e+03           0        latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
   4518312 ±155%  -5e+06           0        latency_stats.avg.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      3300 ±129%  -3e+03           0        latency_stats.max.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
   7064619 ±155%  -7e+06           0        latency_stats.max.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      3300 ±129%  -3e+03           0        latency_stats.sum.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
      9533 ± 11%  -8e+03        1854 ± 26%  latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.[xfs].__xfs_trans_commit.[xfs].xfs_trans_commit.[xfs].xfs_vn_update_time.[xfs].file_update_time.xfs_filemap_pfn_mkwrite.[xfs].do_wp_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
  13554938 ±155%  -1e+07           0        latency_stats.sum.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
     27.72 ±  4%     61%       44.61 ± 28%  perf-stat.iTLB-load-miss-rate%
  1.47e+12 ±  3%     21%   1.785e+12        perf-stat.branch-instructions
      0.15           15%        0.18        perf-stat.dTLB-store-miss-rate%
   7.1e+12 ±  3%     14%   8.124e+12        perf-stat.instructions
 1.823e+12 ±  3%     13%   2.053e+12 ±  3%  perf-stat.dTLB-loads
 1.839e+13 ±  3%     11%   2.045e+13        perf-stat.cpu-cycles
      2.59                      2.52        perf-stat.cpi
     72.31            5%       75.81        perf-stat.node-store-miss-rate%
      0.39                      0.40        perf-stat.ipc
      7016           -5%        6687        perf-stat.cpu-migrations
 8.242e+09          -17%   6.882e+09 ±  4%  perf-stat.branch-misses
     34.55           -7%       32.26        perf-stat.cache-miss-rate%
 7.286e+10          -16%   6.094e+10 ±  3%  perf-stat.cache-references
    943574 ±  3%    -10%      847854        perf-stat.context-switches
 3.135e+09 ±  5%    -11%   2.805e+09 ±  7%  perf-stat.dTLB-load-misses
 9.382e+08 ±  8%    -14%   8.111e+08 ± 10%  perf-stat.iTLB-loads
 9.986e+11          -15%   8.526e+11        perf-stat.dTLB-stores
      0.17 ±  4%    -21%        0.14 ±  4%  perf-stat.dTLB-load-miss-rate%
 2.517e+10          -22%   1.967e+10 ±  4%  perf-stat.cache-misses
 1.032e+10 ±  6%    -22%   8.015e+09        perf-stat.node-load-misses
 4.359e+09 ±  3%    -29%   3.095e+09        perf-stat.node-store-misses
      0.56          -31%        0.39        perf-stat.branch-miss-rate%
  6.43e+08          -31%   4.459e+08        perf-stat.minor-faults
  6.43e+08          -31%   4.459e+08        perf-stat.page-faults
 1.668e+09          -41%   9.882e+08 ±  4%  perf-stat.node-stores
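The change column for the fio and perf-stat rows is consistent with a plain relative difference against the base commit, and perf-stat.cpi and perf-stat.ipc agree with ratios of the cpu-cycles and instructions rows. The Python sketch below re-derives a few of those figures from the table values. The function name is illustrative, and rounding to whole percent is an assumption about how the report formats its output.

```python
def pct_change(base, head):
    """Relative difference of head against base, in percent."""
    return (head - base) / base * 100.0


# (base, head) pairs copied from the comparison table.
ROWS = {
    "fio.read_clat_90%_us": (5, 11),
    "fio.read_clat_mean_us": (2.26, 4.23),
    "fio.time.minor_page_faults": (6.417e08, 4.446e08),
}

# perf-stat.instructions and perf-stat.cpu-cycles, (base, head).
INSTRUCTIONS = (7.1e12, 8.124e12)
CYCLES = (1.839e13, 2.045e13)

if __name__ == "__main__":
    for name, (base, head) in ROWS.items():
        print(f"{name}: {pct_change(base, head):+.0f}%")
    for label, ins, cyc in zip(("base", "head"), INSTRUCTIONS, CYCLES):
        # cycles per instruction and its inverse
        print(f"{label}: cpi={cyc / ins:.2f} ipc={ins / cyc:.2f}")
```

The latency_stats rows are a different case: their change column holds values such as -3e+03, which look like absolute deltas rather than percentages, so this formula is not expected to match them.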
[ASCII trend plots not reproduced here. Panels: fio.read_clat_mean_us, fio.read_clat_90%_us, fio.read_clat_95%_us, fio.read_clat_99%_us, fio.write_clat_90%_us, fio.write_clat_95%_us, fio.latency_2us%, fio.latency_4us%, fio.latency_20us%, perf-stat.page-faults. Each panel plots the metric per sample across the bisect.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.13.0-04261-g91d25ba" of type "text/plain" (162451 bytes)
View attachment "job-script" of type "text/plain" (7388 bytes)
View attachment "job.yaml" of type "text/plain" (4995 bytes)
View attachment "reproduce" of type "text/plain" (860 bytes)