Message-ID: <20170420021417.GB3523@yexl-desktop>
Date: Thu, 20 Apr 2017 10:14:17 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Christoph Hellwig <hch@....de>
Cc: Jens Axboe <axboe@...com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Hannes Reinecke <hare@...e.com>,
LKML <linux-kernel@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>, lkp@...org
Subject: [lkp-robot] [block] 71027e97d7: fio.write_bw_MBps +104% improvement
Greetings,
FYI, we noticed a +104% improvement of fio.write_bw_MBps due to commit:
commit: 71027e97d796d1e9b210a2f64bf2cc25e225a4c0 ("block: stop using discards for zeroing")
https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-4.12/test
in testcase: fio-basic
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:
	runtime: 300s
	disk: 1SSD
	fs: ext4
	nr_task: 64
	rw: randwrite
	bs: 4k
	ioengine: sync
	test_size: 400g
	cpufreq_governor: performance

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
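
For readers more used to fio job files than to lkp parameters, the settings listed above can be approximated as a plain fio job. This is only a sketch: the helper names, the job section name, the target directory, and the even split of test_size across nr_task are assumptions, not what lkp-tests necessarily generates. The attached job.yaml remains the authoritative description.

```python
# Sketch: render the report's lkp parameters as a fio job file.
# Assumptions (not from the report): test_size is split evenly across
# nr_task jobs; the section name and target directory are hypothetical.

PARAMS = {
    "runtime": "300s",
    "nr_task": 64,
    "rw": "randwrite",
    "bs": "4k",
    "ioengine": "sync",
    "test_size": "400g",
}


def per_job_size_mib(test_size, nr_task):
    """Split a total size such as '400g' evenly across tasks, in MiB."""
    units = {"m": 1, "g": 1024, "t": 1024 * 1024}
    value, unit = int(test_size[:-1]), test_size[-1].lower()
    return value * units[unit] // nr_task


def render_fio_job(params, directory):
    """Return fio job-file text using standard fio job options."""
    size_mib = per_job_size_mib(params["test_size"], params["nr_task"])
    lines = [
        f"[{params['rw']}-{params['bs']}]",
        f"directory={directory}",
        f"rw={params['rw']}",
        f"bs={params['bs']}",
        f"ioengine={params['ioengine']}",
        f"numjobs={params['nr_task']}",
        # fio's runtime defaults to seconds, so drop the trailing "s".
        f"runtime={params['runtime'].rstrip('s')}",
        f"size={size_mib}m",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "/mnt/ext4-test" is a placeholder mount point for the ext4 SSD.
    print(render_fio_job(PARAMS, "/mnt/ext4-test"), end="")
```

The output could be saved to a file and passed to fio directly. Whether lkp also sets options such as time_based is not stated in this report, so none are added here.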
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/01org/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
testcase/path_params/tbox_group/run: fio-basic/300s-1SSD-ext4-64-randwrite-4k-sync-400g-performance/lkp-bdw-de1
5d1429fead5beacc           71027e97d796d1e9b210a2f64b
----------------           --------------------------
           %stddev change             %stddev
     41.36           104%       84.24          fio.write_bw_MBps
     10589           104%       21566          fio.write_iops
      0.01          1200%        0.13          fio.latency_250us%
      0.01           500%        0.06          fio.latency_100us%
      0.27 ± 5%      241%        0.91          fio.latency_100ms%
      0.09 ± 39%     200%        0.29 ± 5%     fio.latency_50ms%
      0.02 ± 47%     171%        0.05 ± 9%     fio.latency_20ms%
         8            12%           9          fio.write_clat_90%_us
     72.99            12%       81.78          fio.latency_10us%
      0.17 ± 6%      -30%        0.12 ± 3%     fio.latency_50us%
      3.02           -43%        1.73          fio.latency_250ms%
     19.43           -43%       11.00          fio.latency_4us%
    207872           -46%      112128          fio.write_clat_99%_us
     33495           -47%       17619          fio.write_clat_stddev
      0.02           -50%        0.01          fio.latency_750us%
      6040           -51%        2965          fio.write_clat_mean_us
  25427300           104%    51781684          fio.time.file_system_outputs
     20.43           101%       41.07          fio.time.system_time
         8            88%          15          fio.time.percent_of_cpu_this_job_got
    142609            75%      249305          fio.time.voluntary_context_switches
      1127 ± 5%      -60%         455 ± 25%    fio.time.involuntary_context_switches
     54253                      55305          interrupts.CAL:Function_call_interrupts
     41705           358%      190909          vmstat.io.bo
     27554            53%       42193          vmstat.system.in
     22038            -4%       21060          vmstat.system.cs
       121 ± 5%       67%         202          turbostat.Avg_MHz
      4.97 ± 4%       63%        8.13          turbostat.%Busy
     23.58             5%       24.74          turbostat.PkgWatt
      9.65                       9.81          turbostat.RAMWatt
       654           353%        2967          iostat.sda.wrqm/s
     80766           153%      204329          iostat.sda.wkB/s
     11226           130%       25808          iostat.sda.w/s
      3.40            85%        6.29          iostat.sda.avgqu-sz
 5.371e+08 ± 5%      116%   1.159e+09          perf-stat.branch-misses
  55430664 ± 10%     109%   1.156e+08 ± 4%     perf-stat.dTLB-load-misses
 6.733e+10            79%   1.208e+11          perf-stat.branch-instructions
 3.304e+11            78%   5.894e+11          perf-stat.instructions
 5.128e+11 ± 3%       76%   9.042e+11          perf-stat.cpu-cycles
 9.157e+10 ± 4%       75%   1.604e+11          perf-stat.dTLB-loads
  3.76e+09 ± 5%       70%   6.397e+09          perf-stat.cache-references
  3.76e+09 ± 5%       70%   6.397e+09          perf-stat.cache-misses
 4.454e+10            67%   7.449e+10          perf-stat.dTLB-stores
   3933552 ± 3%       50%     5919721          perf-stat.dTLB-store-misses
     28404 ± 4%       35%       38265 ± 6%     perf-stat.instructions-per-iTLB-miss
  11646070 ± 3%       33%    15457926 ± 5%     perf-stat.iTLB-load-misses
 1.399e+08            25%   1.744e+08          perf-stat.iTLB-loads
      0.80 ± 5%       20%        0.96          perf-stat.branch-miss-rate%
      0.06 ± 6%       19%        0.07 ± 3%     perf-stat.dTLB-load-miss-rate%
     24844 ± 5%       17%       28972          perf-stat.cpu-migrations
   6652601            -5%     6352172          perf-stat.context-switches
      0.01 ± 3%      -10%        0.01          perf-stat.dTLB-store-miss-rate%
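
The "change" column in the table above appears to be the relative difference between the value on the parent commit (left) and the value on the tested commit (right). A minimal sketch of that arithmetic follows; the function name and the row selection are illustrative and not part of the lkp tooling.

```python
def pct_change(base, new):
    """Relative change from base to new, as a whole-number percentage."""
    return round((new - base) / base * 100)


# (metric, parent-commit value, tested-commit value), copied from the table.
ROWS = [
    ("fio.write_bw_MBps", 41.36, 84.24),
    ("fio.write_iops", 10589, 21566),
    ("fio.write_clat_mean_us", 6040, 2965),
    ("vmstat.io.bo", 41705, 190909),
]

if __name__ == "__main__":
    for metric, base, new in ROWS:
        print(f"{metric}: {pct_change(base, new):+d}%")
```

Rows whose printed values are heavily rounded (for example the fio.latency_* percentages) may not reproduce exactly from the displayed figures, presumably because the report computes the change from unrounded data.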
[ASCII plots omitted. One per-sample plot was included for each of: fio.write_bw_MBps, fio.write_iops, fio.write_clat_mean_us, fio.write_clat_stddev, fio.write_clat_99%_us, fio.latency_4us%, fio.latency_10us%, fio.latency_100us%, fio.latency_250us%, fio.latency_750us%, fio.latency_100ms%, fio.latency_250ms%, turbostat.Avg_MHz]
[*] bisect-good sample
[O] bisect-bad sample
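
As a sanity check on the two headline metrics, fio.write_bw_MBps and fio.write_iops should be related through the 4k block size. The sketch below assumes that "4k" means 4096 bytes and that the reported bandwidth is in MiB/s (2^20 bytes per second); under those assumptions the IOPS figures reproduce the bandwidth figures to two decimals. The function name is illustrative.

```python
def bw_mib_per_s(iops, block_size_bytes=4096):
    """Bandwidth implied by an IOPS figure at a fixed block size, in MiB/s."""
    return iops * block_size_bytes / 2**20


if __name__ == "__main__":
    # IOPS values for the parent commit and the tested commit.
    for label, iops in (("5d1429fead5beacc", 10589), ("71027e97d796d1e9", 21566)):
        print(f"{label}: {iops} IOPS -> {bw_mib_per_s(iops):.2f} MiB/s")
```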
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.11.0-rc4-00227-g71027e9" of type "text/plain" (158087 bytes)
View attachment "job-script" of type "text/plain" (7156 bytes)
View attachment "job.yaml" of type "text/plain" (4730 bytes)
View attachment "reproduce" of type "text/plain" (435 bytes)