Message-ID: <20170203015259.GQ17561@yexl-desktop>
Date: Fri, 3 Feb 2017 09:52:59 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Dan Williams <dan.j.williams@...el.com>
Cc: linux-nvdimm@...ts.01.org, Jan Kara <jack@...e.cz>,
Matthew Wilcox <mawilcox@...rosoft.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Christoph Hellwig <hch@....de>,
Jeff Moyer <jmoyer@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
"H. Peter Anvin" <hpa@...or.com>, linux-fsdevel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
Ross Zwisler <ross.zwisler@...ux.intel.com>, lkp@...org
Subject: [lkp-robot] [x86, dax, pmem] 2e12109d1c: fio.write_bw_MBps -75%
regression
Greetings,
FYI, we noticed a -75% regression of fio.write_bw_MBps due to commit:
commit: 2e12109d1c32c810088820478d21b5b7cd87a805 ("x86, dax, pmem: introduce 'copy_from_iter' dax operation")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/dax-pmem-move-cpu-cache-maintenance-to-libnvdimm/20170121-031649
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: xfs
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: randwrite
bs: 2M
ioengine: sync
test_size: 200G
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
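The parameter values above, taken in the order listed, appear to make up the path_params component of the "testcase/path_params/tbox_group/run" line in the results. A minimal Python sketch of that correspondence follows. The helper name path_params() is hypothetical, and joining the values with "-" is an assumption inferred from this report, not a description of how lkp-tests builds the string internally.

```python
# Job parameters exactly as listed in this report, in the same order.
PARAMS = [
    ("disk", "2pmem"),
    ("fs", "xfs"),
    ("mount_option", "dax"),
    ("runtime", "200s"),
    ("nr_task", "50%"),
    ("time_based", "tb"),
    ("rw", "randwrite"),
    ("bs", "2M"),
    ("ioengine", "sync"),
    ("test_size", "200G"),
    ("cpufreq_governor", "performance"),
]


def path_params(params):
    """Join the parameter values with '-' (hypothetical helper; the
    separator is an assumption based on the string seen in this report)."""
    return "-".join(value for _name, value in params)


if __name__ == "__main__":
    print(path_params(PARAMS))
```

Reading the string this way makes it easy to tell at a glance which job configuration a given result line belongs to.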
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached to this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: fio-basic/2pmem-xfs-dax-200s-50%-tb-randwrite-2M-sync-200G-performance/lkp-hsw-ep6
c42a4508649e40af         2e12109d1c32c810088820478d
----------------         --------------------------
         %stddev  change          %stddev
             \       |                \
     68769 ±  3%    -75%      17370        fio.write_bw_MBps
     34384 ±  3%    -75%       8685        fio.write_iops
      0.70 ± 20%  14149%      99.39        fio.latency_4ms%
       745 ±  4%    327%       3182        fio.write_clat_mean_us
      1580 ± 14%    137%       3752 ±  5%  fio.write_clat_99%_us
      1405 ± 17%    136%       3320 ±  3%  fio.write_clat_90%_us
      1456 ± 16%    133%       3392        fio.write_clat_95%_us
       435 ± 27%    -60%        175 ± 21%  fio.write_clat_stddev
      0.01         -100%       0.00        fio.latency_250us%
     21.31 ± 36%   -100%       0.00        fio.latency_2ms%
      5122            8%       5530        fio.time.system_time
    487.59 ± 12%    -84%      79.88 ± 10%  fio.time.user_time
     59604                    60810        vmstat.system.in
      2351          -25%       1758        vmstat.system.cs
      1191           19%       1413        turbostat.Avg_MHz
       199            4%        206        turbostat.PkgWatt
     51.36                    50.78        turbostat.%Busy
       205          -42%        118        turbostat.RAMWatt
      0.00 ± 23%    224%       0.01 ±  4%  perf-stat.dTLB-load-miss-rate%
      0.09           66%       0.15        perf-stat.branch-miss-rate%
 1.431e+13 ±  3%     13%  1.617e+13        perf-stat.cpu-cycles
 2.893e+08          -19%  2.356e+08        perf-stat.branch-misses
    468564          -26%     346074        perf-stat.context-switches
 2.444e+08 ±  6%    -39%  1.503e+08        perf-stat.iTLB-loads
  1.26e+08 ± 27%    -42%   72844209 ± 17%  perf-stat.node-load-misses
 3.148e+11          -51%  1.546e+11        perf-stat.branch-instructions
  40098660 ± 22%    -56%   17714563 ± 10%  perf-stat.node-store-misses
    326873 ± 34%    -62%     122748        perf-stat.instructions-per-iTLB-miss
 1.741e+12          -67%    5.7e+11 ±  7%  perf-stat.dTLB-stores
 1.776e+12 ± 11%    -68%  5.699e+11 ± 12%  perf-stat.dTLB-loads
 5.629e+12          -69%  1.725e+12        perf-stat.instructions
      0.39 ±  3%    -73%       0.11        perf-stat.ipc
 1.385e+11 ±  3%    -75%  3.438e+10        perf-stat.cache-references
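As a sanity check on the table, the headline percentages and the ipc row can be recomputed from the other reported values. The Python sketch below is illustrative only: the numbers are copied from the table, while the helper names pct_change() and ipc() are hypothetical and the rounding to whole percent is an assumption about how the report presents its figures.

```python
# Values copied from the comparison table: base commit (c42a4508649e40af)
# versus the commit under test (2e12109d1c32c810088820478d).
BASE = {"write_bw_MBps": 68769, "write_iops": 34384,
        "instructions": 5.629e12, "cpu_cycles": 1.431e13}
NEW = {"write_bw_MBps": 17370, "write_iops": 8685,
       "instructions": 1.725e12, "cpu_cycles": 1.617e13}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def ipc(sample):
    """Instructions per cycle derived from the two perf-stat counters."""
    return sample["instructions"] / sample["cpu_cycles"]


if __name__ == "__main__":
    for key in ("write_bw_MBps", "write_iops"):
        print(key, round(pct_change(BASE[key], NEW[key])), "%")
    print("ipc base:", round(ipc(BASE), 2), "new:", round(ipc(NEW), 2))
    # Bandwidth divided by IOPS gives MB per I/O, which should be close to
    # the configured block size (bs: 2M).
    print("MB per I/O:", round(BASE["write_bw_MBps"] / BASE["write_iops"], 2))
```

The drop in ipc alongside a slight rise in cpu-cycles suggests the CPUs stayed busy while retiring far fewer instructions per cycle, which matches the throughput loss reported above.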
[ASCII plots of per-sample values not reproduced here. Plotted metrics:]
perf-stat.instructions, perf-stat.cache-references, perf-stat.branch-instructions,
perf-stat.dTLB-loads, perf-stat.dTLB-stores, perf-stat.context-switches, perf-stat.ipc,
fio.write_bw_MBps, fio.write_iops, fio.write_clat_mean_us, fio.write_clat_90%_us,
fio.write_clat_95%_us, fio.latency_4ms%, fio.latency_250us%,
fio.time.user_time, fio.time.system_time, turbostat.Avg_MHz, turbostat.RAMWatt
Plot legend:
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.10.0-rc4-00111-g2e12109" of type "text/plain" (155570 bytes)
View attachment "job-script" of type "text/plain" (7086 bytes)
View attachment "job.yaml" of type "text/plain" (4695 bytes)
View attachment "reproduce" of type "text/plain" (684 bytes)