Message-ID: <20161220190139.GC23441@yexl-desktop>
Date: Wed, 21 Dec 2016 03:01:39 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Jan Kara <jack@...e.cz>
Cc: Theodore Ts'o <tytso@....edu>, LKML <linux-kernel@...r.kernel.org>,
lkp@...org
Subject: [lkp-developer] [ext4] 96f8ba3dd6: fio.write_bw_MBps +510.6% improvement
Greetings,
FYI, we noticed a +510.6% improvement of fio.write_bw_MBps due to commit:
commit: 96f8ba3dd632aff684cc7c67d9f4af435be0341c ("ext4: avoid split extents for DAX writes")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: ext4
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: randwrite
bs: 4k
ioengine: sync
test_size: 200G
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
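For reference, the job parameters above translate roughly into the following standalone fio invocation. This is only a sketch: the target directory /mnt/pmem and numjobs=28 (50% of the 56 hardware threads) are assumptions, not values taken from the attached job file.

    # hypothetical fio command approximating the lkp job above
    fio --name=randwrite-dax \
        --directory=/mnt/pmem \
        --rw=randwrite --bs=4k --ioengine=sync \
        --size=200G --runtime=200s --time_based \
        --numjobs=28 --group_reporting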
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
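If you prefer to set up the filesystem by hand instead of through the lkp harness, a minimal sketch matching the 2pmem/ext4/dax parameters above might look like this (the /dev/pmem0 device name and mount point are assumptions; repeat for the second pmem device):

    # create ext4 on a pmem device and mount it with the dax option
    mkfs.ext4 -F /dev/pmem0
    mkdir -p /mnt/pmem0
    mount -o dax /dev/pmem0 /mnt/pmem0
    dmesg | grep -i dax    # confirm DAX was actually enabled on the mount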
testcase/path_params/tbox_group/run: fio-basic/2pmem-ext4-dax-200s-50%-tb-randwrite-4k-sync-200G-performance/lkp-hsw-ep6
776722e85d3b0936 96f8ba3dd632aff684cc7c67d9
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
820.67 ± 0% +510.6% 5011 ± 4% fio.write_bw_MBps
210091 ± 0% +510.6% 1282918 ± 4% fio.write_iops
0.14 ± 0% -92.9% 0.01 ± 0% fio.latency_100ms%
24.00 ± 10% -96.9% 0.74 ± 21% fio.latency_100us%
0.01 ± 57% +2.9e+05% 21.76 ± 12% fio.latency_10us%
1.22 ± 37% +3122.9% 39.40 ± 2% fio.latency_20us%
0.32 ± 8% -93.8% 0.02 ± 0% fio.latency_250us%
74.28 ± 3% -48.8% 38.01 ± 9% fio.latency_50us%
5511 ± 5% +117.5% 11986 ± 3% fio.time.involuntary_context_switches
977.75 ± 1% +149.2% 2436 ± 0% fio.time.percent_of_cpu_this_job_got
1874 ± 1% +149.6% 4679 ± 0% fio.time.system_time
89.35 ± 3% +137.8% 212.46 ± 3% fio.time.user_time
164733 ± 2% -58.0% 69111 ± 3% fio.time.voluntary_context_switches
58.00 ± 2% -44.8% 32.00 ± 2% fio.write_clat_90%_us
65.50 ± 1% -44.5% 36.33 ± 2% fio.write_clat_95%_us
85.25 ± 1% -44.9% 47.00 ± 3% fio.write_clat_99%_us
131.52 ± 0% -83.9% 21.12 ± 4% fio.write_clat_mean_us
2270 ± 0% -84.7% 347.78 ± 0% fio.write_clat_stddev
133959 ± 4% +52.6% 204457 ± 3% softirqs.RCU
1433931 ± 1% +94.7% 2791395 ± 0% softirqs.TIMER
5511 ± 5% +117.5% 11986 ± 3% time.involuntary_context_switches
977.75 ± 1% +149.2% 2436 ± 0% time.percent_of_cpu_this_job_got
1874 ± 1% +149.6% 4679 ± 0% time.system_time
89.35 ± 3% +137.8% 212.46 ± 3% time.user_time
164733 ± 2% -58.0% 69111 ± 3% time.voluntary_context_switches
2766132 ± 0% -49.2% 1405817 ± 0% vmstat.io.bo
613430 ± 0% -60.9% 239670 ± 0% vmstat.memory.buff
1671149 ± 0% -39.4% 1012059 ± 0% vmstat.memory.cache
10.00 ± 7% +140.0% 24.00 ± 0% vmstat.procs.r
58099 ± 0% +4.8% 60882 ± 0% vmstat.system.in
762597 ± 0% -47.5% 400049 ± 0% meminfo.Active
0 5e+03 5067 ± 95% latency_stats.max.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
0 5e+03 4996 ± 98% latency_stats.max.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
0 2e+05 156829 ± 58% latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
0 1e+04 12881 ± 59% latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
209560 ± 33% -2e+05 0 latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_split_convert_extents.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
2492341 ± 4% -2e+06 412777 ± 58% latency_stats.sum.wait_transaction_locked.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
3976384 ± 8% -4e+06 338379 ± 57% latency_stats.sum.wait_transaction_locked.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_end.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
1.549e+08 -1e+08 21221204 ± 57% latency_stats.sum.jbd2_log_wait_commit.jbd2_log_do_checkpoint.__jbd2_log_wait_for_space.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_end.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
1.549e+08 -1e+08 21221204 ± 57% latency_stats.sum.max
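Consistent with the commit's intent, the dominant pre-patch wait time in the extent-split and jbd2 transaction/checkpoint paths largely disappears after the change. The latency_stats.* numbers come from the kernel's latencytop accounting; a sketch of collecting them by hand, assuming a kernel built with CONFIG_LATENCYTOP:

    # enable latencytop accounting, then dump per-callchain wait totals
    sysctl kernel.latencytop=1
    cat /proc/latency_stats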
perf-stat.cpu-cycles
1.4e+13 ++----------------------------------------------------------------+
| |
1.2e+13 O+ O O OO O O O O O O |
| OO OO O OO OO O OO O OO OO O O O O OO OO OO OO OO O
1e+13 ++ |
| |
8e+12 ++ |
| |
6e+12 ++ |
| |
4e+12 ++ .* .**. *. .*. *. .* |
*.** * **.* **.** **.* ** * * *.*.**.** |
2e+12 ++ : :: : |
| :: :: |
0 ++-----------------------------------*--*O------------------------+
perf-stat.branch-misses
1e+10 ++------------------------------------------------------------------+
9e+09 ++ .* .* .* * |
*.** *.**.*.** *.*.** *. .* + * * .* .*.* |
8e+09 ++ **.* * : : ** * |
7e+09 ++ : : : |
| : : : |
6e+09 ++ : :: : |
5e+09 ++ : :: : |
4e+09 ++O O :O: O: |
O O OO OO O OO OO O OO OO OO O OO O: :O:: O OO O OO OO O OO OO OO O
3e+09 ++ : : :: |
2e+09 ++ :: :: |
| : : |
1e+09 ++ : : |
0 ++------------------------------------*--*-O------------------------+
perf-stat.iTLB-loads
3.5e+08 ++----------------------------------------------------------------+
| |
3e+08 ++ O O O O O |
| O OO OO O O O O O OO O O O O O O OO |
2.5e+08 O+O O OO O O O O O O O O O O O
| O O |
2e+08 ++ |
| |
1.5e+08 ++ .**. *. *.* .* *. * |
*.**.** **.**.**.**.*.* * * * * *.*.* * |
1e+08 ++ : : : |
| : :: : |
5e+07 ++ :: :: |
| : : |
0 ++-----------------------------------*--*O------------------------+
perf-stat.node-loads
3.5e+09 ++----------------------------------------------------------------+
| |
3e+09 *+* * **. *.* |
| *. + :+ * : *. *. *. *.** * *. .* |
2.5e+09 ++ ** * *.* *.* * * : : : *.** * |
| : : : |
2e+09 ++ : :: : |
| : :: : |
1.5e+09 ++ : :: : |
| :: :: |
1e+09 ++ O OO O O O OO O O :O :O O O OO OO OO O O
O OO OO OO O O O O OO O::O:: O O O OO O |
5e+08 ++ : : |
| : : |
0 ++-----------------------------------*--*O------------------------+
perf-stat.branch-miss-rate_
3 ++--------------------------------------------------------------------+
*. *.*.* *. .**.*.* .**.*. *. .* *.**.* |
2.5 ++* *.* * * * *.** * * *.* |
| : : : |
| : : : |
2 ++ : : : |
| : :: : |
1.5 ++ : :: : |
| : : :: |
1 ++ O : : :: |
O O O OO O O OO O OO OO O OO O OO O: :OO: O OO O O OO OO O OO O
| O O O: :: O O O |
0.5 ++ : : |
| : : |
0 ++-------------------------------------*--*-O-------------------------+
perf-stat.ipc
0.6 *+*--------*-*------**------------------------------------------------+
| *.*.**. : **.* **.*.**. .**.** * *.**.**.* |
0.5 ++ * * : : : |
| : : : |
| : : : |
0.4 ++ : :: : |
| : :: : |
0.3 ++ O :O:: : O O |
| O O O : : O: O O O |
0.2 O+ O O OO OO O OO O O OO O OO OO O: : :: O O O O OO OO O O O
| : :O:: |
| : : |
0.1 ++ : : |
| : : |
0 ++-------------------------------------*--*-O-------------------------+
fio.write_bw_MBps
8000 ++-------------------------------------------------------------------+
| O O |
7000 ++ O O |
6000 ++O O |
| O O O |
5000 ++ O O O O O O O O O O OO |
O OO OO OO O O OO O O O O O OO OO O
4000 ++ O |
| |
3000 ++ |
2000 ++ |
| |
1000 ++ |
*.**.**.*.**.**.*.**.**.*.**.**.*.**.*. *. *.*.**.** |
0 ++-------------------------------------*--*O-------------------------+
fio.write_iops
2e+06 ++----------------------------------------------------------------+
1.8e+06 ++ O O |
| O O O |
1.6e+06 ++O |
1.4e+06 ++ O O O O |
| O O O O O O O O O OO OO O
1.2e+06 O+ OO O OO OO O OO O O O O O OO |
1e+06 ++ O |
800000 ++ |
| |
600000 ++ |
400000 ++ |
*. .* *. |
200000 ++**.**.** *.**.* **.*.**.**.**.**. *. *.*.**.** |
0 ++-----------------------------------*--*O------------------------+
[*] bisect-good sample
[O] bisect-bad sample
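The perf-stat.* series plotted above are gathered by the lkp monitors; roughly equivalent counters can be read by hand with perf while the workload runs (a sketch, not the exact lkp command; event availability varies by CPU):

    # sample the same hardware counters system-wide for the 200s run
    perf stat -a -e cycles,instructions,branches,branch-misses,iTLB-loads,node-loads -- sleep 200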
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.9.0-rc4-00044-g96f8ba3" of type "text/plain" (153623 bytes)
View attachment "job-script" of type "text/plain" (7093 bytes)
View attachment "job.yaml" of type "text/plain" (4701 bytes)
View attachment "reproduce" of type "text/plain" (658 bytes)