Message-ID: <20220325071501.GA8478@xsang-OptiPlex-9020>
Date: Fri, 25 Mar 2022 15:15:01 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Filipe Manana <fdmanana@...e.com>
Cc: lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...ux.intel.com,
fengwei.yin@...el.com, LKML <linux-kernel@...r.kernel.org>,
linux-btrfs@...r.kernel.org
Subject: [btrfs] a052d3d1b6: fio.write_iops 3241.7% improvement
Greetings,
FYI, we noticed a 3241.7% improvement in fio.write_iops due to commit:
commit: a052d3d1b6c77f193f7051cd5d4b08138fd57332 ("btrfs: only reserve the needed data space amount during fallocate")
https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git misc-next
in testcase: fio-basic
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with following parameters:
runtime: 300s
disk: 1HDD
fs: btrfs
nr_task: 100%
test_size: 128G
rw: randwrite
bs: 4k
ioengine: falloc
cpufreq_governor: performance
ucode: 0xb000280
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
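For readers unfamiliar with the falloc ioengine: as far as we understand it, the engine issues fallocate() calls instead of transferring data, so a randwrite job with bs=4k amounts to 4 KiB allocation requests at random block-aligned offsets. A minimal Python sketch of that access pattern follows (Linux only). The function name, seed and sizes are illustrative assumptions; this is not fio's implementation and it is not part of the lkp job.

```python
import os
import random

BLOCK_SIZE = 4096  # corresponds to the "bs: 4k" parameter


def random_fallocate(path, file_size, num_ops, seed=0):
    """Issue num_ops block-sized allocation requests at random aligned offsets.

    Hypothetical helper for illustration only. Returns the resulting file size.
    """
    rng = random.Random(seed)
    blocks = file_size // BLOCK_SIZE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(num_ops):
            offset = rng.randrange(blocks) * BLOCK_SIZE
            # posix_fallocate() ensures space is allocated for the range
            # [offset, offset + BLOCK_SIZE) and extends the file if needed.
            os.posix_fallocate(fd, offset, BLOCK_SIZE)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

Calling random_fallocate("/mnt/btrfs/testfile", 1 << 30, 100000) on a btrfs mount (the path is a placeholder) would exercise the same kind of fallocate path that the commit changes, at a much smaller scale than the 128G job above.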
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
4k/gcc-9/performance/1HDD/btrfs/falloc/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/300s/randwrite/lkp-icl-2sp1/128G/fio-basic/0xb000280
commit:
3d83c164a0 ("btrfs: move common inode creation code into btrfs_create_new_inode()")
a052d3d1b6 ("btrfs: only reserve the needed data space amount during fallocate")
3d83c164a02f65c3 a052d3d1b6c77f193f7051cd5d4
---------------- ---------------------------
%stddev %change %stddev
\ | \
99.68 -99.6 0.04 ± 49% fio.latency_100us%
0.07 ± 65% +0.1 0.16 ± 16% fio.latency_10us%
0.02 ± 30% +0.1 0.11 ± 36% fio.latency_20us%
0.04 ± 28% -0.0 0.01 fio.latency_250us%
0.01 ± 29% +69.5 69.56 ± 3% fio.latency_2us%
0.02 ± 40% +0.2 0.21 ± 64% fio.latency_4us%
28.60 ± 2% -95.5% 1.28 ± 10% fio.time.elapsed_time
28.60 ± 2% -95.5% 1.28 ± 10% fio.time.elapsed_time.max
16835 ± 3% -96.6% 571.33 ± 13% fio.time.involuntary_context_switches
16349 ± 2% -10.9% 14569 fio.time.minor_page_faults
9381 -46.1% 5060 ± 10% fio.time.percent_of_cpu_this_job_got
2658 ± 2% -98.8% 32.01 fio.time.system_time
25.08 ± 5% +28.4% 32.19 ± 4% fio.time.user_time
3616 ± 2% -76.6% 846.00 ± 5% fio.time.voluntary_context_switches
4651 ± 2% +3241.7% 155452 ± 12% fio.write_bw_MBps
85333 ± 2% -98.5% 1314 fio.write_clat_90%_us
87210 -98.4% 1397 fio.write_clat_95%_us
90453 -98.2% 1664 ± 8% fio.write_clat_99%_us
80075 ± 2% -98.5% 1198 fio.write_clat_mean_us
44839 ± 22% -53.9% 20667 ± 10% fio.write_clat_stddev
1190883 ± 2% +3241.7% 39795840 ± 12% fio.write_iops
207365 ± 19% -42.9% 118345 ± 4% numa-numastat.node1.numa_hit
530.96 -87.7% 65.31 ±223% pmeter.Average_Active_Power
80.02 ± 4% -38.0% 49.62 ± 4% uptime.boot
12.79 ± 14% +545.5% 82.54 ± 2% iostat.cpu.idle
86.29 ± 2% -88.9% 9.55 ± 10% iostat.cpu.system
0.91 ± 4% +771.1% 7.90 ± 13% iostat.cpu.user
7.25 ± 28% +64.1 71.35 ± 8% mpstat.cpu.all.idle%
0.75 ± 5% +0.3 1.05 ± 24% mpstat.cpu.all.irq%
0.01 ± 40% +0.0 0.04 ± 36% mpstat.cpu.all.soft%
91.06 ± 2% -77.0 14.08 ± 21% mpstat.cpu.all.sys%
0.93 ± 3% +12.6 13.48 ± 23% mpstat.cpu.all.usr%
12.17 ± 16% +574.0% 82.00 ± 2% vmstat.cpu.id
85.50 -89.1% 9.33 ± 13% vmstat.cpu.sy
1186 ± 2% -100.0% 0.00 vmstat.io.bo
83.83 ± 2% -80.1% 16.67 ± 33% vmstat.procs.r
2723 ± 2% +180.4% 7637 ± 6% vmstat.system.cs
184148 -25.1% 137836 ± 5% vmstat.system.in
10972 ± 8% +33.5% 14652 ± 8% numa-vmstat.node0.nr_kernel_stack
2566 ± 36% +99.6% 5123 ± 13% numa-vmstat.node0.nr_page_table_pages
305.83 ± 56% -93.8% 18.83 ±218% numa-vmstat.node1.nr_inactive_file
1137 ± 74% -58.5% 472.67 ± 86% numa-vmstat.node1.nr_page_table_pages
7671 ± 23% -35.3% 4967 ± 13% numa-vmstat.node1.nr_slab_reclaimable
27154 ± 12% -20.8% 21509 ± 9% numa-vmstat.node1.nr_slab_unreclaimable
305.83 ± 56% -93.8% 18.83 ±218% numa-vmstat.node1.nr_zone_inactive_file
44723 ± 18% -50.4% 22163 ± 46% numa-meminfo.node0.AnonHugePages
10975 ± 8% +33.7% 14674 ± 8% numa-meminfo.node0.KernelStack
10286 ± 36% +99.5% 20522 ± 13% numa-meminfo.node0.PageTables
1921 ± 25% -46.9% 1021 ± 36% numa-meminfo.node1.Active
1226 ± 56% -93.8% 75.67 ±218% numa-meminfo.node1.Inactive(file)
30687 ± 23% -35.2% 19870 ± 13% numa-meminfo.node1.KReclaimable
4537 ± 74% -58.3% 1893 ± 85% numa-meminfo.node1.PageTables
30687 ± 23% -35.2% 19870 ± 13% numa-meminfo.node1.SReclaimable
108615 ± 12% -20.8% 86038 ± 9% numa-meminfo.node1.SUnreclaim
139303 ± 13% -24.0% 105909 ± 9% numa-meminfo.node1.Slab
3647 -37.6% 2275 ± 12% meminfo.Active
3419 -39.9% 2056 ± 13% meminfo.Active(anon)
58547 ± 3% -41.6% 34178 ± 4% meminfo.AnonHugePages
330276 +34.8% 445049 ± 4% meminfo.AnonPages
3798478 ± 2% -65.4% 1314197 ± 16% meminfo.Committed_AS
366737 +27.7% 468352 ± 4% meminfo.Inactive
364992 +28.2% 467743 ± 4% meminfo.Inactive(anon)
20396 +11.8% 22795 ± 2% meminfo.KernelStack
55354 -21.6% 43382 meminfo.Mapped
14784 ± 3% +52.0% 22466 ± 9% meminfo.PageTables
38638 -34.7% 25212 ± 2% meminfo.Shmem
2939 ± 2% -77.0% 676.67 ± 13% turbostat.Avg_MHz
91.92 ± 2% -66.3 25.58 ± 13% turbostat.Busy%
3196 -16.1% 2681 turbostat.Bzy_MHz
6.41 ± 41% +39.4 45.81 ± 34% turbostat.C1E%
1.70 ±128% +26.1 27.81 ± 49% turbostat.C6%
6.70 ± 37% +694.4% 53.20 ± 29% turbostat.CPU%c1
1.39 ±147% +1428.2% 21.22 ± 67% turbostat.CPU%c6
59.33 -10.4% 53.17 ± 3% turbostat.CoreTmp
5954076 ± 3% -89.3% 638209 ± 16% turbostat.IRQ
59.67 -10.6% 53.33 ± 2% turbostat.PkgTmp
347.98 -15.4% 294.40 ± 4% turbostat.PkgWatt
14.31 ± 29% -5.8 8.52 ±142% perf-profile.calltrace.cycles-pp._dl_catch_error
10.95 ± 75% -5.4 5.56 ±141% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._dl_catch_error
10.95 ± 75% -5.4 5.56 ±141% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._dl_catch_error
5.22 ±100% -5.2 0.00 perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
5.22 ±100% -5.2 0.00 perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__x64_sys_exit_group
7.00 ±111% -4.6 2.44 ±147% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.mmput.do_exit.do_group_exit
7.00 ±111% -4.6 2.44 ±147% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.mmput.do_exit
5.48 ±113% -3.0 2.44 ±147% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.exit_mmap.mmput
14.31 ± 29% -6.4 7.96 ±144% perf-profile.children.cycles-pp._dl_catch_error
4.89 ±103% -4.9 0.00 perf-profile.children.cycles-pp.release_pages
7.00 ±111% -4.6 2.44 ±147% perf-profile.children.cycles-pp.unmap_vmas
7.00 ±111% -4.6 2.44 ±147% perf-profile.children.cycles-pp.unmap_page_range
5.98 ±103% -4.3 1.67 ±223% perf-profile.children.cycles-pp.walk_component
5.48 ±113% -3.0 2.44 ±147% perf-profile.children.cycles-pp.zap_pte_range
5.82 ±110% -0.1 5.68 ±162% perf-profile.children.cycles-pp.format_decode
5.82 ±110% -2.2 3.60 ±144% perf-profile.self.cycles-pp.format_decode
852.67 -39.6% 514.67 ± 13% proc-vmstat.nr_active_anon
82485 +35.0% 111385 ± 4% proc-vmstat.nr_anon_pages
5711 ± 73% -99.2% 44.17 ± 63% proc-vmstat.nr_dirtied
91158 +28.7% 117287 ± 4% proc-vmstat.nr_inactive_anon
20403 +11.5% 22754 ± 2% proc-vmstat.nr_kernel_stack
14082 -18.9% 11426 proc-vmstat.nr_mapped
3707 ± 4% +51.3% 5608 ± 9% proc-vmstat.nr_page_table_pages
9647 -32.2% 6544 ± 2% proc-vmstat.nr_shmem
27727 -4.2% 26564 proc-vmstat.nr_slab_reclaimable
852.67 -39.6% 514.67 ± 13% proc-vmstat.nr_zone_active_anon
91158 +28.7% 117287 ± 4% proc-vmstat.nr_zone_inactive_anon
380393 -16.7% 316892 proc-vmstat.numa_hit
293361 ± 2% -21.7% 229776 proc-vmstat.numa_local
380455 -16.7% 316873 proc-vmstat.pgalloc_normal
258327 -29.5% 182208 ± 2% proc-vmstat.pgfault
278534 ± 3% -42.0% 161619 ± 3% proc-vmstat.pgfree
12395 ± 3% -37.0% 7812 ± 6% proc-vmstat.pgreuse
0.39 ± 6% +1.1 1.44 ± 35% perf-stat.i.branch-miss-rate%
11226008 ± 7% +440.8% 60709258 ± 21% perf-stat.i.branch-misses
42.60 -25.4 17.23 ± 22% perf-stat.i.cache-miss-rate%
29893686 -48.5% 15393933 ± 18% perf-stat.i.cache-misses
2142 +161.1% 5594 ± 17% perf-stat.i.context-switches
20.93 -90.3% 2.04 ± 20% perf-stat.i.cpi
96050 +2.4% 98325 ± 3% perf-stat.i.cpu-clock
2.914e+11 -79.6% 5.937e+10 ± 46% perf-stat.i.cpu-cycles
161.58 ± 2% +172.4% 440.20 ± 25% perf-stat.i.cpu-migrations
9574 ± 2% -63.2% 3520 ± 38% perf-stat.i.cycles-between-cache-misses
0.00 ± 84% +0.0 0.05 ± 35% perf-stat.i.dTLB-load-miss-rate%
107745 ± 44% +1472.2% 1693986 ± 34% perf-stat.i.dTLB-load-misses
0.00 ± 28% +0.0 0.03 ± 38% perf-stat.i.dTLB-store-miss-rate%
48471 ± 12% +1086.3% 575011 ± 20% perf-stat.i.dTLB-store-misses
0.06 ± 7% +967.1% 0.62 ± 24% perf-stat.i.ipc
169.22 ± 3% +2000.8% 3555 ± 39% perf-stat.i.major-faults
3.04 -79.9% 0.61 ± 47% perf-stat.i.metric.GHz
4997 ± 3% +804.7% 45213 ± 28% perf-stat.i.minor-faults
95.51 -33.3 62.24 ± 11% perf-stat.i.node-load-miss-rate%
5112839 -76.9% 1180256 ± 48% perf-stat.i.node-load-misses
157058 ± 2% +184.7% 447159 ± 27% perf-stat.i.node-loads
69.69 -41.4 28.28 ± 33% perf-stat.i.node-store-miss-rate%
6517770 -76.5% 1530670 ± 32% perf-stat.i.node-store-misses
5166 ± 3% +843.9% 48766 ± 28% perf-stat.i.page-faults
96050 +2.4% 98326 ± 3% perf-stat.i.task-clock
42.43 ± 2% -24.8 17.59 ± 23% perf-stat.overall.cache-miss-rate%
21.12 -92.9% 1.49 ± 41% perf-stat.overall.cpi
9748 -60.1% 3886 ± 41% perf-stat.overall.cycles-between-cache-misses
0.00 ± 46% +0.0 0.03 ± 89% perf-stat.overall.dTLB-load-miss-rate%
0.05 +1526.5% 0.77 ± 31% perf-stat.overall.ipc
97.02 -27.6 69.44 ± 19% perf-stat.overall.node-load-miss-rate%
70.99 -40.6 30.34 ± 34% perf-stat.overall.node-store-miss-rate%
12059 -73.1% 3245 ± 81% perf-stat.overall.path-length
10857952 ± 7% +231.5% 35995103 ± 25% perf-stat.ps.branch-misses
28909823 -68.0% 9254837 ± 27% perf-stat.ps.cache-misses
68161639 ± 2% -22.7% 52690454 ± 13% perf-stat.ps.cache-references
2071 +54.7% 3204 ± 8% perf-stat.ps.context-switches
92860 -37.4% 58162 ± 16% perf-stat.ps.cpu-clock
2.818e+11 -86.4% 3.828e+10 ± 58% perf-stat.ps.cpu-cycles
156.29 ± 2% +61.4% 252.29 ± 15% perf-stat.ps.cpu-migrations
104205 ± 44% +820.5% 959180 ± 20% perf-stat.ps.dTLB-load-misses
46858 ± 12% +603.2% 329524 ± 12% perf-stat.ps.dTLB-store-misses
162.82 ± 3% +1085.3% 1929 ± 25% perf-stat.ps.major-faults
4827 ± 3% +425.3% 25360 ± 15% perf-stat.ps.minor-faults
4943794 -84.6% 760468 ± 60% perf-stat.ps.node-load-misses
151872 ± 2% +65.9% 251935 ± 14% perf-stat.ps.node-loads
6302973 -84.9% 950283 ± 44% perf-stat.ps.node-store-misses
2576891 ± 4% -19.4% 2077892 ± 9% perf-stat.ps.node-stores
4990 ± 3% +446.8% 27289 ± 15% perf-stat.ps.page-faults
92860 -37.4% 58162 ± 16% perf-stat.ps.task-clock
4.046e+11 -73.1% 1.089e+11 ± 81% perf-stat.total.instructions
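A note on reading the %change column: entries with a "%" sign appear to be relative changes of the mean, while entries without one (on metrics that are already percentages, such as fio.latency_100us%) appear to be absolute differences in percentage points. The Python sketch below rechecks three rows from the table under that assumption. The helper names are ours, and because the table shows rounded means, recomputed values can differ from the reported ones in the last digits on some rows.

```python
def percent_change(base, new):
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct, new_pct):
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


# fio.write_iops: 1190883 -> 39795840
print(round(percent_change(1190883, 39795840), 1))  # 3241.7

# fio.time.elapsed_time: 28.60 -> 1.28
print(round(percent_change(28.60, 1.28), 1))  # -95.5

# fio.latency_100us%: 99.68 -> 0.04 (already a percentage)
print(round(point_change(99.68, 0.04), 1))  # -99.6
```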
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-5.17.0-rc8-00144-ga052d3d1b6c7" of type "text/plain" (162136 bytes)
View attachment "job-script" of type "text/plain" (8458 bytes)
View attachment "job.yaml" of type "text/plain" (5712 bytes)
View attachment "reproduce" of type "text/plain" (707 bytes)