[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20171204022027.GW21779@yexl-desktop>
Date: Mon, 4 Dec 2017 10:20:27 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Josef Bacik <josef@...icpanda.com>
Cc: David Sterba <dsterba@...e.com>, Josef Bacik <jbacik@...com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [lkp-robot] [btrfs] 69fe2d75dd: fio.write_clat_90%_us +332.6%
regression
Greeting,
FYI, we noticed a +332.6% regression of fio.write_clat_90%_us due to commit:
commit: 69fe2d75dd91d0124ad2ab6e9fef07633bd730e0 ("btrfs: make the delalloc block rsv per inode")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with following parameters:
runtime: 300s
disk: 1HDD
fs: btrfs
nr_task: 1
test_size: 128G
rw: randwrite
bs: 4k
ioengine: sync
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
4k/gcc-7/performance/1HDD/btrfs/sync/x86_64-rhel-7.2/1/debian-x86_64-2016-08-31.cgz/300s/randwrite/ivb44/128G/fio-basic
commit:
dd48d4072e ("btrfs: add tracepoints for outstanding extents mods")
69fe2d75dd ("btrfs: make the delalloc block rsv per inode")
dd48d4072e0cdac5 69fe2d75dd91d0124ad2ab6e9f
---------------- --------------------------
%stddev %change %stddev
\ | \
2.21 ± 3% +0.4 2.58 ± 8% fio.latency_100us%
7.85 -1.4 6.48 fio.latency_10ms%
67.92 -5.4 62.55 ± 2% fio.latency_10us%
0.51 ± 5% -0.1 0.39 ± 4% fio.latency_20ms%
0.08 ± 10% +0.2 0.31 ± 16% fio.latency_2ms%
1.33 ± 11% +3.3 4.62 fio.latency_4ms%
3.01 ± 19% -1.3 1.69 ± 46% fio.latency_4us%
0.11 ± 7% -0.1 0.04 ± 20% fio.latency_500us%
1.54 ± 3% +2.5 4.00 ± 4% fio.latency_50us%
855.31 -5.5% 808.06 fio.time.elapsed_time
855.31 -5.5% 808.06 fio.time.elapsed_time.max
79233 ± 5% +16.3% 92169 fio.time.voluntary_context_switches
4.15 -1.7% 4.08 fio.write_bw_MBps
713.75 ±127% +332.6% 3088 fio.write_clat_90%_us
5728 -23.1% 4405 fio.write_clat_95%_us
8352 -23.2% 6410 fio.write_clat_99%_us
57734 +8.4% 62586 fio.write_clat_stddev
1062 -1.7% 1044 fio.write_iops
239171 -12.9% 208214 ± 2% meminfo.Active(file)
337021 ± 2% -5.6% 318255 ± 4% softirqs.RCU
4506 +4.2% 4696 vmstat.system.cs
1.195e+09 ± 4% -8.4% 1.094e+09 ± 9% cpuidle.C3.time
1469797 ± 4% -8.7% 1342191 ± 9% cpuidle.C3.usage
4599 ± 4% +8.1% 4973 ± 2% cpuidle.POLL.usage
59780 -12.9% 52042 ± 2% proc-vmstat.nr_active_file
59780 -12.9% 52042 ± 2% proc-vmstat.nr_zone_active_file
84195 ± 4% -29.5% 59399 proc-vmstat.pgactivate
458.09 -6.4% 428.91 iostat.sda.await
373.46 +5.5% 393.97 iostat.sda.w/s
458.11 -6.4% 428.93 iostat.sda.w_await
20.72 -5.7% 19.53 iostat.sda.wrqm/s
2273 ± 5% -18.2% 1860 ± 3% slabinfo.Acpi-ParseExt.active_objs
2289 ± 4% -18.7% 1860 ± 3% slabinfo.Acpi-ParseExt.num_objs
26287 -7.3% 24371 ± 2% slabinfo.anon_vma_chain.active_objs
26351 -7.5% 24371 ± 2% slabinfo.anon_vma_chain.num_objs
7273 ± 2% -12.1% 6396 ± 2% slabinfo.btrfs_extent_buffer.active_objs
7306 ± 2% -12.4% 6403 ± 2% slabinfo.btrfs_extent_buffer.num_objs
5511 ± 4% -6.0% 5179 ± 3% slabinfo.sigqueue.active_objs
5723 ± 4% -7.4% 5300 ± 3% slabinfo.sigqueue.num_objs
5.43 ± 10% -1.2 4.24 ± 18% perf-stat.branch-miss-rate%
5.608e+09 ± 5% -12.2% 4.927e+09 ± 4% perf-stat.branch-misses
5.932e+09 -6.7% 5.533e+09 perf-stat.cache-misses
1.862e+10 -7.5% 1.723e+10 ± 2% perf-stat.cache-references
69656 ± 11% -28.2% 50007 ± 2% perf-stat.cpu-migrations
1.205e+09 ± 4% -12.9% 1.049e+09 ± 3% perf-stat.dTLB-load-misses
3.973e+08 -7.2% 3.687e+08 ± 2% perf-stat.dTLB-store-misses
1.638e+11 -5.8% 1.543e+11 perf-stat.dTLB-stores
83.40 -0.7 82.71 perf-stat.iTLB-load-miss-rate%
4.914e+08 -6.4% 4.6e+08 ± 2% perf-stat.iTLB-load-misses
2061342 -5.4% 1950324 perf-stat.minor-faults
2061368 -5.4% 1950328 perf-stat.page-faults
15.80 ± 9% -14.6% 13.49 ± 7% sched_debug.cfs_rq:/.runnable_load_avg.avg
11.40 ± 9% -26.7% 8.36 ± 11% sched_debug.cpu.cpu_load[0].avg
452.18 ± 12% -20.7% 358.62 ± 12% sched_debug.cpu.cpu_load[0].max
67.47 ± 10% -23.1% 51.88 ± 12% sched_debug.cpu.cpu_load[0].stddev
11.34 ± 7% -22.0% 8.85 sched_debug.cpu.cpu_load[1].avg
9.03 ± 9% -19.9% 7.24 ± 3% sched_debug.cpu.cpu_load[2].avg
6.46 ± 11% -19.3% 5.22 ± 5% sched_debug.cpu.cpu_load[3].avg
4.39 ± 14% -20.7% 3.48 ± 6% sched_debug.cpu.cpu_load[4].avg
22618 ± 10% -10.7% 20187 ± 5% sched_debug.cpu.load.avg
16053 ± 13% -16.3% 13439 ± 4% sched_debug.cpu.nr_load_updates.min
-25.52 -51.5% -12.38 sched_debug.cpu.nr_uninterruptible.min
2152 ± 31% -23.1% 1654 ± 42% sched_debug.cpu.ttwu_count.min
1062 ± 49% -35.7% 683.33 ± 44% sched_debug.cpu.ttwu_local.min
86.55 ± 3% -2.1 84.47 ± 3% perf-profile.calltrace.cycles-pp.call_cpuidle.do_idle.cpu_startup_entry.start_secondary.verify_cpu
86.44 ± 3% -2.1 84.36 ± 3% perf-profile.calltrace.cycles-pp.cpuidle_enter.call_cpuidle.do_idle.cpu_startup_entry.start_secondary
90.62 ± 3% -2.0 88.61 ± 2% perf-profile.calltrace.cycles-pp.start_secondary.verify_cpu
90.11 ± 3% -2.0 88.10 ± 3% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.verify_cpu
90.62 ± 3% -2.0 88.61 ± 2% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.verify_cpu
53.37 ± 2% -1.2 52.16 ± 11% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.do_idle
87.62 ± 3% -0.5 87.08 ± 2% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.do_idle.cpu_startup_entry
92.70 ± 2% -0.4 92.29 perf-profile.calltrace.cycles-pp.verify_cpu
19.46 ± 19% -0.3 19.16 ± 59% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.do_idle
4.73 ± 46% +0.2 4.89 ± 25% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
4.82 ± 44% +0.2 5.00 ± 24% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
4.93 ± 44% +0.2 5.13 ± 24% perf-profile.calltrace.cycles-pp.ret_from_fork
4.93 ± 44% +0.2 5.13 ± 24% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
10.78 ± 13% +0.9 11.67 ± 24% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.__irqentry_text_start.cpuidle_enter_state.cpuidle_enter.call_cpuidle
10.92 ± 13% +0.9 11.82 ± 24% perf-profile.calltrace.cycles-pp.__irqentry_text_start.cpuidle_enter_state.cpuidle_enter.call_cpuidle.do_idle
90.62 ± 3% -2.0 88.61 ± 2% perf-profile.children.cycles-pp.start_secondary
53.43 ± 2% -1.2 52.21 ± 11% perf-profile.children.cycles-pp.intel_idle
88.57 ± 3% -0.6 87.98 ± 2% perf-profile.children.cycles-pp.call_cpuidle
88.44 ± 3% -0.6 87.86 ± 2% perf-profile.children.cycles-pp.cpuidle_enter
87.67 ± 3% -0.6 87.12 ± 2% perf-profile.children.cycles-pp.cpuidle_enter_state
92.17 ± 2% -0.4 91.76 perf-profile.children.cycles-pp.do_idle
92.70 ± 2% -0.4 92.29 perf-profile.children.cycles-pp.verify_cpu
92.70 ± 2% -0.4 92.29 perf-profile.children.cycles-pp.cpu_startup_entry
19.47 ± 19% -0.3 19.16 ± 59% perf-profile.children.cycles-pp.poll_idle
4.73 ± 46% +0.2 4.89 ± 25% perf-profile.children.cycles-pp.process_one_work
4.82 ± 44% +0.2 5.00 ± 24% perf-profile.children.cycles-pp.worker_thread
4.93 ± 44% +0.2 5.13 ± 24% perf-profile.children.cycles-pp.kthread
4.94 ± 44% +0.2 5.14 ± 24% perf-profile.children.cycles-pp.ret_from_fork
4.94 ± 12% +0.6 5.49 ± 24% perf-profile.children.cycles-pp.hrtimer_interrupt
11.12 ± 13% +1.0 12.16 ± 23% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
11.31 ± 13% +1.1 12.37 ± 23% perf-profile.children.cycles-pp.__irqentry_text_start
53.43 ± 2% -1.2 52.21 ± 11% perf-profile.self.cycles-pp.intel_idle
19.25 ± 19% -0.3 18.94 ± 61% perf-profile.self.cycles-pp.poll_idle
fio.write_clat_95__us
6000 +-+------------------------------------------------------------------+
| +..+. +. +. + +..+.+..+..+ +..+ |
5000 +-+ : |
| : |
O: O O O O O O O O O O O O O O O O O O O O O |
4000 +-+ : |
|: : |
3000 +-+ : |
| : : |
2000 +-+ : |
| : : |
| :: |
1000 +-+: |
| : |
0 +-+-----------------------------------------------O------------------+
fio.write_clat_99__us
9000 +-+------------------------------------------------------------------+
| +..+..+.+..+..+.+..+..+.+..+..+.+..+.+..+..+.+..+..+.+..+..+.+..|
8000 +-+ : |
7000 +-+ : |
O: O O O O O O O O O O O O O O O O O O O O O |
6000 +-+ : |
5000 +-+ : |
|: : |
4000 +-+ : |
3000 +-+ : |
| : : |
2000 +-+: |
1000 +-+: |
| : |
0 +-+-----------------------------------------------O------------------+
fio.latency_2ms_
0.45 +-+--------------------------O---------------------------------------+
| O O O |
0.4 +-+ O O O O |
0.35 +-+ O O O |
| O |
0.3 O-+O O O O O O O |
0.25 +-+ O O |
| |
0.2 +-+ |
0.15 +-+ |
| .+.. |
0.1 +-+ +..+.. .+. .+..+.+..+.. .+..+. .+..+.+. .+. |
0.05 +-+ : +.+..+. +. + +. +.+..+. +..|
| + : |
0 +-+-----------------------------------------------O------------------+
fio.latency_4ms_
5 +-+-------------------------------------------------------------------+
4.5 O-+O O O O O O O O O O O O O O O O O O O O O |
| |
4 +-+ |
3.5 +-+ |
| |
3 +-+ |
2.5 +-+ |
2 +-+ .+.. |
| +..+.. +.. .+. .+..+.+.. .+. .+. |
1.5 +-+ : +. .+..+.. + +. +. +. +. +.+.. .+. .|
1 +-+ : +. + +. +. |
| : : |
0.5 +-+: |
0 +-+------------------------------------------------O------------------+
fio.latency_10ms_
9 +-+---------------------------------------------------------------------+
| +.+..+.. .+. .+..+.+..+.. .+..+.+..+..+..+.+.. .+.. |
8 +-+ : +. +..+. +. +. +.+..+..|
7 +-+ : |
O: O :O O O O O O O O O O O O O O O O O O O O |
6 +-+ : |
5 +-+ : |
|: : |
4 +-+ : |
3 +-+ : |
| : : |
2 +-+ : |
1 +-+: |
| : |
0 +-+--------------------------------------------------O------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.14.0-rc7-00089-g69fe2d7" of type "text/plain" (163375 bytes)
View attachment "job-script" of type "text/plain" (7494 bytes)
View attachment "job.yaml" of type "text/plain" (4922 bytes)
View attachment "reproduce" of type "text/plain" (605 bytes)
Powered by blists - more mailing lists