Message-ID: <20190115015420.GO17624@shao2-debian>
Date: Tue, 15 Jan 2019 09:54:20 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Josef Bacik <josef@...icpanda.com>
Cc: David Sterba <dsterba@...e.com>,
Nikolay Borisov <nborisov@...e.com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [btrfs] 64403612b7: fio.write_bw_MBps 5.3% improvement
Greetings,
FYI, we noticed a 5.3% improvement of fio.write_bw_MBps due to commit:
commit: 64403612b73a94bc7b02cf8ca126e3b8ced6e921 ("btrfs: rework btrfs_check_space_for_delayed_refs")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with following parameters:
disk: 2pmem
fs: btrfs
runtime: 200s
nr_task: 50%
time_based: tb
rw: randwrite
bs: 4k
ioengine: mmap
test_size: 100G
cpufreq_governor: performance
ucode: 0x3d
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
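For readers unfamiliar with the LKP parameter names, the parameters above correspond roughly to the following fio job file. This is a hypothetical reconstruction for illustration only — the directory path and job name are assumptions, and the authoritative job definition is the job.yaml attached to this report:

```ini
; Hypothetical fio job reconstructed from the LKP parameters above.
; The attached job.yaml is authoritative; directory= is an assumed
; mount point for one of the two pmem disks (disk: 2pmem).
[global]
bs=4k                 ; bs: 4k
ioengine=mmap         ; ioengine: mmap
rw=randwrite          ; rw: randwrite
runtime=200           ; runtime: 200s
time_based            ; time_based: tb
size=100G             ; test_size: 100G
directory=/fs/pmem0   ; assumption, not stated in the report

[randwrite]
numjobs=28            ; nr_task: 50% of 56 hardware threads
```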
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
4k/gcc-7/performance/2pmem/btrfs/mmap/x86_64-rhel-7.2/50%/debian-x86_64-2018-04-03.cgz/200s/randwrite/lkp-hsw-ep6/100G/fio-basic/tb/0x3d
commit:
413df7252d ("btrfs: add new flushing states for the delayed refs rsv")
64403612b7 ("btrfs: rework btrfs_check_space_for_delayed_refs")
413df7252d5256df  64403612b73a94bc7b02cf8ca1
----------------  --------------------------
       fail:runs  %reproduction   fail:runs
             |          |             |
             :4        25%           1:4    kmsg.pstore:crypto_comp_decompress_failed,ret=
             :4        25%           1:4    kmsg.pstore:decompression_failed
         %stddev    %change       %stddev
             \          |             \
0.12 ± 5% +0.0 0.15 ± 2% fio.latency_1000us%
0.18 ± 8% +0.2 0.43 ± 3% fio.latency_100ms%
54.73 -38.5 16.20 ± 6% fio.latency_100us%
0.20 ± 7% +0.3 0.54 ± 2% fio.latency_20ms%
0.27 ± 5% +0.1 0.42 fio.latency_250ms%
35.44 ± 3% +33.8 69.25 fio.latency_250us%
0.58 ± 2% +0.4 0.99 ± 7% fio.latency_2ms%
0.10 ± 5% -0.1 0.01 ± 11% fio.latency_500ms%
5.66 ± 8% +3.9 9.60 ± 2% fio.latency_500us%
0.08 ± 15% +0.2 0.31 ± 3% fio.latency_50ms%
0.03 ± 6% -0.0 0.01 fio.latency_750ms%
0.18 ± 3% +0.1 0.24 ± 6% fio.latency_750us%
30709938 +5.3% 32342124 fio.time.file_system_inputs
30732802 +5.3% 32353876 fio.time.file_system_outputs
3838742 +5.3% 4042765 fio.time.major_page_faults
561886 +4.9% 589639 fio.time.maximum_resident_set_size
131.75 +23.0% 162.00 fio.time.percent_of_cpu_this_job_got
250.65 +23.3% 308.96 fio.time.system_time
36.67 ± 2% +11.2% 40.76 fio.time.user_time
5410115 +35.6% 7336588 fio.time.voluntary_context_switches
3838742 +5.3% 4042765 fio.workload
74.89 +5.3% 78.82 fio.write_bw_MBps
240.50 +14.8% 276.00 fio.write_clat_90%_us
312.00 +30.8% 408.00 ± 7% fio.write_clat_95%_us
7784 ± 5% +251.1% 27328 ± 3% fio.write_clat_99%_us
1456 -5.1% 1382 fio.write_clat_mean_us
21857 -41.0% 12894 ± 4% fio.write_clat_stddev
19170 +5.3% 20178 fio.write_iops
205460 ± 5% +36.0% 279377 interrupts.CAL:Function_call_interrupts
16.34 +1.8 18.11 mpstat.cpu.sys%
2357846 -15.1% 2002644 ± 3% softirqs.RCU
258.50 +8.4% 280.25 turbostat.Avg_MHz
13072038 ± 2% -11.2% 11603783 ± 3% cpuidle.POLL.time
3533776 ± 2% -13.9% 3043996 ± 3% cpuidle.POLL.usage
69795 +6.0% 74011 vmstat.io.bi
133182 +21.6% 161958 vmstat.io.bo
405297 +11.2% 450757 vmstat.system.cs
3201128 ± 2% -9.7% 2891692 ± 3% numa-numastat.node0.local_node
3210255 ± 2% -9.8% 2894508 ± 3% numa-numastat.node0.numa_hit
9126 ± 34% -69.1% 2819 ±105% numa-numastat.node0.other_node
1908 ±160% +328.4% 8175 ± 36% numa-numastat.node1.other_node
19952 ± 9% -51.9% 9598 ± 25% slabinfo.avc_xperms_data.active_objs
19971 ± 9% -51.9% 9598 ± 25% slabinfo.avc_xperms_data.num_objs
10424 ± 3% -19.4% 8406 ± 26% slabinfo.kmalloc-192.active_objs
10430 ± 3% -19.3% 8413 ± 26% slabinfo.kmalloc-192.num_objs
3105110 ± 3% -36.8% 1962938 meminfo.Active
2832284 ± 3% -40.3% 1691217 meminfo.Active(file)
338871 ± 2% -41.5% 198312 meminfo.Dirty
1522 ± 4% -17.8% 1250 ± 7% meminfo.Writeback
129239 ± 2% -8.5% 118201 meminfo.max_used_kB
1556735 ± 4% -39.2% 947031 ± 11% numa-meminfo.node0.Active
107694 ± 41% +56.4% 168451 ± 11% numa-meminfo.node0.Active(anon)
1449040 ± 5% -46.3% 778580 ± 11% numa-meminfo.node0.Active(file)
86788 ± 51% +67.9% 145679 ± 12% numa-meminfo.node0.AnonHugePages
105836 ± 42% +55.3% 164410 ± 12% numa-meminfo.node0.AnonPages
169901 -42.4% 97927 ± 3% numa-meminfo.node0.Dirty
1550352 ± 4% -34.7% 1012540 ± 8% numa-meminfo.node1.Active
1385245 ± 3% -34.4% 909285 ± 7% numa-meminfo.node1.Active(file)
169956 ± 5% -41.0% 100346 ± 5% numa-meminfo.node1.Dirty
402663 ± 5% +12.7% 453834 ± 10% numa-meminfo.node1.SUnreclaim
16471 +12.3% 18500 sched_debug.cfs_rq:/.exec_clock.min
1060 ± 78% -80.3% 209.11 ± 4% sched_debug.cfs_rq:/.load_avg.stddev
272.46 ± 6% +14.0% 310.72 ± 5% sched_debug.cfs_rq:/.util_avg.avg
610.12 ± 7% -18.2% 499.00 ± 10% sched_debug.cpu.cpu_load[0].max
590.31 ± 11% -20.7% 468.38 ± 14% sched_debug.cpu.cpu_load[1].max
562.69 ± 9% -20.2% 448.88 ± 14% sched_debug.cpu.cpu_load[2].max
544.19 ± 12% -22.9% 419.62 ± 15% sched_debug.cpu.cpu_load[3].max
108.84 ± 11% -13.2% 94.48 ± 4% sched_debug.cpu.cpu_load[3].stddev
531.06 ± 31% -33.4% 353.44 ± 11% sched_debug.cpu.cpu_load[4].max
20.24 ± 6% +17.1% 23.70 ± 5% sched_debug.cpu.nr_uninterruptible.stddev
646369 ± 2% +10.8% 716332 sched_debug.cpu.sched_count.min
406767 ± 2% +9.1% 443952 sched_debug.cpu.ttwu_count.max
26927 ± 41% +56.4% 42115 ± 11% numa-vmstat.node0.nr_active_anon
362626 ± 5% -46.3% 194697 ± 11% numa-vmstat.node0.nr_active_file
26463 ± 42% +55.3% 41110 ± 12% numa-vmstat.node0.nr_anon_pages
26289467 ± 4% -71.3% 7542005 ±125% numa-vmstat.node0.nr_dirtied
42424 -42.1% 24578 ± 4% numa-vmstat.node0.nr_dirty
192.00 ± 4% -18.1% 157.25 ± 6% numa-vmstat.node0.nr_writeback
26244101 ± 4% -71.4% 7513973 ±126% numa-vmstat.node0.nr_written
26927 ± 41% +56.4% 42115 ± 11% numa-vmstat.node0.nr_zone_active_anon
362626 ± 5% -46.3% 194697 ± 11% numa-vmstat.node0.nr_zone_active_file
43198 -41.3% 25368 ± 4% numa-vmstat.node0.nr_zone_write_pending
26688163 ± 4% -71.5% 7600592 ±125% numa-vmstat.node0.numa_hit
26617350 ± 4% -71.6% 7561433 ±126% numa-vmstat.node0.numa_local
346578 ± 3% -34.4% 227448 ± 7% numa-vmstat.node1.nr_active_file
3554965 ± 30% +543.5% 22874525 ± 41% numa-vmstat.node1.nr_dirtied
42577 ± 5% -41.0% 25122 ± 4% numa-vmstat.node1.nr_dirty
100760 ± 5% +12.7% 113524 ± 10% numa-vmstat.node1.nr_slab_unreclaimable
196.00 ± 4% -18.2% 160.25 ± 7% numa-vmstat.node1.nr_writeback
3509312 ± 30% +551.0% 22845714 ± 41% numa-vmstat.node1.nr_written
346578 ± 3% -34.4% 227448 ± 7% numa-vmstat.node1.nr_zone_active_file
43362 ± 5% -40.2% 25948 ± 4% numa-vmstat.node1.nr_zone_write_pending
3966804 ± 29% +476.3% 22858741 ± 41% numa-vmstat.node1.numa_hit
3886136 ± 30% +485.3% 22746346 ± 41% numa-vmstat.node1.numa_local
707636 ± 3% -40.3% 422726 proc-vmstat.nr_active_file
7333780 +20.8% 8856315 proc-vmstat.nr_dirtied
85214 ± 2% -41.7% 49695 proc-vmstat.nr_dirty
3052409 -7.2% 2833731 proc-vmstat.nr_file_pages
7933910 +2.7% 8147593 proc-vmstat.nr_free_pages
55490 +1.7% 56430 proc-vmstat.nr_inactive_anon
2049449 +3.2% 2115006 proc-vmstat.nr_inactive_file
1924130 +4.8% 2015538 proc-vmstat.nr_mapped
56956 +1.8% 57989 proc-vmstat.nr_shmem
208961 +2.5% 214081 proc-vmstat.nr_slab_unreclaimable
380.25 ± 3% -20.7% 301.50 ± 7% proc-vmstat.nr_writeback
7322611 +20.7% 8841894 proc-vmstat.nr_written
707636 ± 3% -40.3% 422726 proc-vmstat.nr_zone_active_file
55490 +1.7% 56430 proc-vmstat.nr_zone_inactive_anon
2049449 +3.2% 2115006 proc-vmstat.nr_zone_inactive_file
86678 ± 2% -40.8% 51327 proc-vmstat.nr_zone_write_pending
6450055 -9.0% 5867193 proc-vmstat.numa_hit
6439020 -9.1% 5856197 proc-vmstat.numa_local
1652471 ± 6% +22.4% 2022763 ± 4% proc-vmstat.numa_pte_updates
808042 ± 10% -50.4% 400794 ± 10% proc-vmstat.pgactivate
7155849 -8.6% 6537029 proc-vmstat.pgalloc_normal
8299588 +4.9% 8703691 proc-vmstat.pgfault
3838742 +5.3% 4042765 proc-vmstat.pgmajfault
15354969 +5.3% 16171062 proc-vmstat.pgpgin
29290335 +20.8% 35379032 proc-vmstat.pgpgout
2.18 ± 11% -1.2 1.01 ± 70% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
2.18 ± 11% -1.2 1.01 ± 70% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
2.18 ± 11% -1.2 1.01 ± 70% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
1.97 ± 11% -1.1 0.90 ± 69% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
1.81 ± 12% -1.0 0.81 ± 70% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
2.00 ± 4% -0.3 1.65 ± 4% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle
1.38 ± 3% -0.3 1.13 ± 8% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
0.64 ± 2% -0.2 0.41 ± 57% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
0.74 ± 3% -0.1 0.62 ± 7% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
0.39 ± 57% +0.2 0.57 ± 8% perf-profile.calltrace.cycles-pp.__btrfs_lookup_bio_sums.btrfs_submit_bio_hook.submit_one_bio.extent_read_full_page.filemap_fault
2.29 ± 2% +0.2 2.50 ± 5% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
2.12 ± 6% +0.2 2.34 ± 2% perf-profile.calltrace.cycles-pp.extent_read_full_page.filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault
1.76 ± 6% +0.3 2.02 ± 10% perf-profile.calltrace.cycles-pp.btrfs_tree_read_lock.btrfs_read_lock_root_node.btrfs_search_slot.btrfs_lookup_file_extent.run_delalloc_nocow
2.21 ± 6% +0.3 2.54 ± 8% perf-profile.calltrace.cycles-pp.btrfs_search_slot.btrfs_lookup_file_extent.run_delalloc_nocow.btrfs_run_delalloc_range.writepage_delalloc
2.21 ± 6% +0.3 2.54 ± 8% perf-profile.calltrace.cycles-pp.btrfs_lookup_file_extent.run_delalloc_nocow.btrfs_run_delalloc_range.writepage_delalloc.__extent_writepage
3.51 ± 5% +0.4 3.89 ± 9% perf-profile.calltrace.cycles-pp.run_delalloc_nocow.btrfs_run_delalloc_range.writepage_delalloc.__extent_writepage.extent_write_cache_pages
3.52 ± 5% +0.4 3.90 ± 9% perf-profile.calltrace.cycles-pp.btrfs_run_delalloc_range.writepage_delalloc.__extent_writepage.extent_write_cache_pages.extent_writepages
3.65 ± 5% +0.4 4.05 ± 9% perf-profile.calltrace.cycles-pp.writepage_delalloc.__extent_writepage.extent_write_cache_pages.extent_writepages.do_writepages
3.78 ± 5% +0.4 4.19 ± 9% perf-profile.calltrace.cycles-pp.__extent_writepage.extent_write_cache_pages.extent_writepages.do_writepages.__writeback_single_inode
3.95 ± 5% +0.4 4.38 ± 9% perf-profile.calltrace.cycles-pp.extent_writepages.do_writepages.__writeback_single_inode.writeback_sb_inodes.wb_writeback
3.95 ± 5% +0.4 4.38 ± 9% perf-profile.calltrace.cycles-pp.extent_write_cache_pages.extent_writepages.do_writepages.__writeback_single_inode.writeback_sb_inodes
0.58 ± 12% +0.6 1.16 ± 10% perf-profile.calltrace.cycles-pp.btrfs_delalloc_reserve_space.btrfs_page_mkwrite.do_page_mkwrite.__handle_mm_fault.handle_mm_fault
0.00 +0.7 0.70 ± 14% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_delalloc_reserve_space
0.00 +0.7 0.74 ± 14% perf-profile.calltrace.cycles-pp._raw_spin_lock.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_delalloc_reserve_space.btrfs_page_mkwrite
1.19 ± 4% +0.7 1.94 ± 5% perf-profile.calltrace.cycles-pp.btrfs_page_mkwrite.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault
1.19 ± 4% +0.8 1.95 ± 5% perf-profile.calltrace.cycles-pp.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
0.30 ±101% +0.8 1.11 ± 11% perf-profile.calltrace.cycles-pp.btrfs_delalloc_reserve_metadata.btrfs_delalloc_reserve_space.btrfs_page_mkwrite.do_page_mkwrite.__handle_mm_fault
0.00 +0.8 0.81 ± 21% perf-profile.calltrace.cycles-pp.__btrfs_run_delayed_refs.btrfs_run_delayed_refs.btrfs_commit_transaction.transaction_kthread.kthread
0.00 +0.8 0.82 ± 21% perf-profile.calltrace.cycles-pp.btrfs_run_delayed_refs.btrfs_commit_transaction.transaction_kthread.kthread.ret_from_fork
0.14 ±173% +0.9 1.02 ± 12% perf-profile.calltrace.cycles-pp.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_delalloc_reserve_space.btrfs_page_mkwrite.do_page_mkwrite
4.16 ± 6% +1.0 5.19 ± 3% perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
4.12 ± 6% +1.0 5.15 ± 3% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
4.26 ± 6% +1.1 5.31 ± 2% perf-profile.calltrace.cycles-pp.page_fault
4.25 ± 6% +1.1 5.31 ± 2% perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
4.24 ± 6% +1.1 5.30 ± 2% perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
0.00 +1.8 1.80 ± 23% perf-profile.calltrace.cycles-pp.btrfs_commit_transaction.transaction_kthread.kthread.ret_from_fork
0.00 +1.8 1.81 ± 23% perf-profile.calltrace.cycles-pp.transaction_kthread.kthread.ret_from_fork
1.539e+09 +9.3% 1.682e+09 perf-stat.i.branch-instructions
7158935 +11.3% 7970184 perf-stat.i.cache-misses
1.255e+08 +12.6% 1.414e+08 ± 4% perf-stat.i.cache-references
410718 +11.4% 457384 perf-stat.i.context-switches
1.482e+10 +9.1% 1.616e+10 ± 2% perf-stat.i.cpu-cycles
1750 ± 3% +74.8% 3059 ± 6% perf-stat.i.cpu-migrations
2.056e+09 +9.9% 2.26e+09 perf-stat.i.dTLB-loads
1.005e+09 +9.2% 1.097e+09 ± 3% perf-stat.i.dTLB-stores
729999 ± 9% +27.0% 926769 ± 22% perf-stat.i.iTLB-load-misses
6404547 +7.3% 6875207 ± 2% perf-stat.i.iTLB-loads
7.69e+09 +9.8% 8.444e+09 perf-stat.i.instructions
17684 +6.2% 18776 perf-stat.i.major-faults
3647552 +10.9% 4044488 ± 2% perf-stat.i.node-load-misses
1727080 +14.0% 1968627 perf-stat.i.node-loads
867876 +10.2% 956518 ± 2% perf-stat.i.node-store-misses
840940 +9.0% 916606 ± 2% perf-stat.i.node-stores
20479 +5.4% 21580 perf-stat.i.page-faults
1.69 -0.1 1.63 ± 3% perf-stat.overall.branch-miss-rate%
434820 +3.4% 449734 perf-stat.overall.path-length
1.532e+09 +9.3% 1.675e+09 perf-stat.ps.branch-instructions
7125917 +11.3% 7932991 perf-stat.ps.cache-misses
1.25e+08 +12.6% 1.407e+08 ± 4% perf-stat.ps.cache-references
408844 +11.4% 455268 perf-stat.ps.context-switches
1.475e+10 +9.1% 1.608e+10 ± 2% perf-stat.ps.cpu-cycles
1742 ± 3% +74.8% 3045 ± 6% perf-stat.ps.cpu-migrations
2.047e+09 +9.9% 2.249e+09 perf-stat.ps.dTLB-loads
1e+09 +9.2% 1.092e+09 ± 3% perf-stat.ps.dTLB-stores
726673 ± 9% +26.9% 922503 ± 22% perf-stat.ps.iTLB-load-misses
6375316 +7.3% 6843399 ± 2% perf-stat.ps.iTLB-loads
7.654e+09 +9.8% 8.405e+09 perf-stat.ps.instructions
17603 +6.2% 18689 perf-stat.ps.major-faults
3630742 +10.9% 4025629 ± 2% perf-stat.ps.node-load-misses
1719047 +14.0% 1959367 perf-stat.ps.node-loads
863893 +10.2% 952077 ± 2% perf-stat.ps.node-store-misses
837092 +9.0% 912357 ± 2% perf-stat.ps.node-stores
20389 +5.4% 21485 perf-stat.ps.page-faults
1.669e+12 +8.9% 1.818e+12 perf-stat.total.instructions
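As a sanity check on the tables above, the %change column is just the relative difference of the two commits' mean values. A minimal sketch (the function name is mine; the values are taken from the fio.write_bw_MBps row above — note the report's +5.3% is computed from the unrounded per-run means, so recomputing from the rounded table values lands slightly lower):

```python
def pct_change(base, patched):
    """Relative change of the patched-commit mean vs. the base-commit mean, in percent."""
    return (patched - base) / base * 100.0

# fio.write_bw_MBps: 74.89 (413df7252d) -> 78.82 (64403612b7)
print(round(pct_change(74.89, 78.82), 1))  # 5.2 from rounded values; report shows +5.3%
```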
fio.write_clat_99__us over time (ASCII trend plot; column alignment lost in archiving):
bisect-good [*] samples cluster near 7000-8000 us, while bisect-bad [O] samples cluster
near 25000-27000 us, consistent with the +251.1% change in fio.write_clat_99%_us above.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
Attachments:
  config-4.20.0-rc7-00110-g6440361  (text/plain, 168504 bytes)
  job-script                        (text/plain, 7740 bytes)
  job.yaml                          (text/plain, 5272 bytes)
  reproduce                         (text/plain, 826 bytes)