Message-ID: <20220420071127.GC16310@xsang-OptiPlex-9020>
Date: Wed, 20 Apr 2022 15:11:27 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Dave Chinner <dchinner@...hat.com>
Cc: "Darrick J. Wong" <djwong@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com
Subject: [xfs] 0b02c8c0d7: fio.write_iops -9.0% regression
Greetings,
FYI, we noticed a -9.0% regression of fio.write_iops due to commit:
commit: 0b02c8c0d75a738c98c35f02efb36217c170d78c ("xfs: set prealloc flag in xfs_alloc_file_space()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with the following parameters:
runtime: 300s
disk: 1HDD
fs: xfs
nr_task: 100%
test_size: 128G
rw: write
bs: 4k
ioengine: falloc
cpufreq_governor: performance
ucode: 0xb000280
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
In addition, the commit also has a significant impact on the following tests:
+------------------+------------------------------------------------+
| testcase: change | fio-basic: fio.write_iops 19.5% improvement |
| test machine | 96 threads 2 sockets Ice Lake with 256G memory |
| test parameters | bs=4k |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | fs=xfs |
| | ioengine=falloc |
| | nr_task=1 |
| | runtime=300s |
| | rw=write |
| | test_size=128G |
| | ucode=0xb000280 |
+------------------+------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
4k/gcc-9/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/300s/write/lkp-icl-2sp1/128G/fio-basic/0xb000280
commit:
fbe7e52003 ("xfs: fallocate() should call file_modified()")
0b02c8c0d7 ("xfs: set prealloc flag in xfs_alloc_file_space()")
fbe7e520036583a7 0b02c8c0d75a738c98c35f02efb
---------------- ---------------------------
%stddev %change %stddev
\ | \
80.55 ± 15% +10.4 90.94 fio.latency_100us%
0.19 ± 43% -0.2 0.01 ± 32% fio.latency_10us%
1.01 ±117% -1.0 0.04 ± 63% fio.latency_20us%
8.06 ±106% -8.0 0.08 ± 16% fio.latency_50us%
15740 ± 5% +15.7% 18205 fio.time.involuntary_context_switches
8976 ± 4% +4.9% 9412 fio.time.percent_of_cpu_this_job_got
2542 ± 4% +15.4% 2932 fio.time.system_time
4637 -9.0% 4220 fio.write_bw_MBps
100010 ± 9% -10.2% 89770 ± 2% fio.write_clat_90%_us
144384 ± 2% +12.5% 162474 fio.write_clat_95%_us
76605 ± 4% +15.4% 88388 fio.write_clat_mean_us
1187140 -9.0% 1080402 fio.write_iops
358691 +1.7% 364745 proc-vmstat.numa_hit
271640 +2.2% 277686 proc-vmstat.numa_local
358538 +1.8% 364830 proc-vmstat.pgalloc_normal
7.22 ±119% -5.3 1.88 ±161% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
7.22 ±119% -5.2 2.07 ±165% perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
8.08 ±114% -3.8 4.26 ±144% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
8.08 ±114% -3.8 4.26 ±144% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
8.08 ±114% -3.8 4.26 ±144% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
8.21 ±114% -3.8 4.43 ±143% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
6.68 ±130% -0.4 6.23 ±147% perf-profile.calltrace.cycles-pp._IO_vfscanf.fscanf
7.22 ±119% -5.0 2.24 ±156% perf-profile.children.cycles-pp.cpuidle_enter
7.22 ±119% -5.0 2.24 ±156% perf-profile.children.cycles-pp.cpuidle_enter_state
8.08 ±114% -3.8 4.26 ±144% perf-profile.children.cycles-pp.start_secondary
8.21 ±114% -3.8 4.43 ±143% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
8.21 ±114% -3.8 4.43 ±143% perf-profile.children.cycles-pp.cpu_startup_entry
8.21 ±114% -3.8 4.43 ±143% perf-profile.children.cycles-pp.do_idle
3.94 ± 78% -2.8 1.14 ±185% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
7.72 ±116% -1.5 6.23 ±147% perf-profile.children.cycles-pp._IO_vfscanf
7.72 ±116% -2.8 4.94 ±141% perf-profile.self.cycles-pp._IO_vfscanf
9.56 ± 6% -20.3% 7.62 ± 4% perf-stat.i.MPKI
0.87 ± 9% -0.3 0.55 ± 4% perf-stat.i.branch-miss-rate%
28854751 ± 3% -39.1% 17572480 ± 2% perf-stat.i.branch-misses
33.09 ± 3% +15.2 48.25 perf-stat.i.cache-miss-rate%
50957571 ± 2% +8.6% 55343067 perf-stat.i.cache-misses
1.531e+08 -25.6% 1.139e+08 ± 2% perf-stat.i.cache-references
16.91 +12.3% 18.99 perf-stat.i.cpi
5508 -4.7% 5250 perf-stat.i.cycles-between-cache-misses
218292 ± 55% -48.4% 112740 ± 31% perf-stat.i.dTLB-load-misses
2.158e+09 -25.1% 1.616e+09 perf-stat.i.dTLB-stores
0.06 ± 2% -10.9% 0.06 ± 2% perf-stat.i.ipc
173.60 -9.1% 157.82 perf-stat.i.major-faults
286.45 ± 3% +10.7% 316.97 ± 3% perf-stat.i.metric.K/sec
108.46 ± 4% -10.8% 96.72 perf-stat.i.metric.M/sec
5465 -6.7% 5101 ± 2% perf-stat.i.minor-faults
93.79 +2.0 95.78 perf-stat.i.node-load-miss-rate%
10120472 ± 3% +15.0% 11643033 perf-stat.i.node-load-misses
530906 ± 2% -33.0% 355501 ± 3% perf-stat.i.node-loads
65.02 -3.3 61.74 perf-stat.i.node-store-miss-rate%
5639450 ± 3% +15.0% 6488073 perf-stat.i.node-stores
5639 -6.7% 5259 ± 2% perf-stat.i.page-faults
9.30 ± 6% -19.8% 7.46 ± 2% perf-stat.overall.MPKI
0.83 ± 9% -0.3 0.52 ± 2% perf-stat.overall.branch-miss-rate%
33.30 ± 3% +15.3 48.61 perf-stat.overall.cache-miss-rate%
17.26 +11.8% 19.30 perf-stat.overall.cpi
5589 -4.7% 5326 perf-stat.overall.cycles-between-cache-misses
0.00 ± 61% -0.0 0.00 ± 31% perf-stat.overall.dTLB-load-miss-rate%
0.06 -10.6% 0.05 perf-stat.overall.ipc
95.01 +2.0 97.04 perf-stat.overall.node-load-miss-rate%
65.59 -3.4 62.19 perf-stat.overall.node-store-miss-rate%
27877289 ± 3% -38.9% 17030231 ± 2% perf-stat.ps.branch-misses
49235817 ± 2% +9.0% 53642585 perf-stat.ps.cache-misses
1.479e+08 -25.4% 1.104e+08 ± 2% perf-stat.ps.cache-references
210467 ± 55% -48.2% 109086 ± 31% perf-stat.ps.dTLB-load-misses
2.085e+09 -24.9% 1.566e+09 perf-stat.ps.dTLB-stores
166.72 -8.8% 151.97 perf-stat.ps.major-faults
5267 -6.4% 4932 ± 2% perf-stat.ps.minor-faults
9778653 ± 3% +15.4% 11285466 perf-stat.ps.node-load-misses
512925 ± 2% -32.8% 344562 ± 3% perf-stat.ps.node-loads
5448424 ± 3% +15.4% 6288141 perf-stat.ps.node-stores
5434 -6.4% 5084 ± 2% perf-stat.ps.page-faults
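As a reading aid for the table above: rows whose %change carries a "%" sign appear to be relative changes, while rows without it (the latency buckets and the miss-rate rows) appear to be absolute differences in percentage points. The sketch below recomputes a few rows from the values in this report. The helper names are hypothetical, and the conversion assumes bs=4k means 4096 bytes and that fio.write_bw_MBps is reported in MiB/s, which matches the numbers here but is not stated in the report.

```python
# Recompute a few rows of the comparison table (values copied from the report).
# Helper names are illustrative only; they are not part of lkp-tests or fio.

def relative_change(base, new):
    """Relative change in percent, as in rows such as fio.write_iops."""
    return (new - base) / base * 100.0

def point_change(base, new):
    """Absolute difference, as in rows such as fio.latency_100us%."""
    return new - base

def iops_to_mib_per_s(iops, block_size_bytes=4096):
    """Bandwidth implied by an IOPS figure (assumes 4 KiB blocks, MiB/s)."""
    return iops * block_size_bytes / (1024 * 1024)

base_iops, new_iops = 1187140, 1080402        # fio.write_iops, nr_task=100%

print(f"write_iops change: {relative_change(base_iops, new_iops):+.1f}%")
print(f"latency_100us% change: {point_change(80.55, 90.94):+.1f} points")
print(f"implied bandwidth: {iops_to_mib_per_s(base_iops):.0f} -> "
      f"{iops_to_mib_per_s(new_iops):.0f} MiB/s")
```

Under these assumptions the recomputed figures line up with the reported -9.0% for fio.write_iops, +10.4 for fio.latency_100us%, and 4637 -> 4220 for fio.write_bw_MBps.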
***************************************************************************************************
lkp-icl-2sp1: 96 threads 2 sockets Ice Lake with 256G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
4k/gcc-9/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-10.4-x86_64-20200603.cgz/300s/write/lkp-icl-2sp1/128G/fio-basic/0xb000280
commit:
fbe7e52003 ("xfs: fallocate() should call file_modified()")
0b02c8c0d7 ("xfs: set prealloc flag in xfs_alloc_file_space()")
fbe7e520036583a7 0b02c8c0d75a738c98c35f02efb
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.23 ± 16% -0.1 0.16 ± 19% fio.latency_10us%
58.12 -16.1% 48.74 ± 2% fio.time.elapsed_time
58.12 -16.1% 48.74 ± 2% fio.time.elapsed_time.max
45.53 -20.1% 36.37 ± 2% fio.time.system_time
6226 -15.9% 5235 ± 2% fio.time.voluntary_context_switches
2268 +19.5% 2709 ± 2% fio.write_bw_MBps
1562 -17.9% 1282 fio.write_clat_90%_us
1594 -17.6% 1314 fio.write_clat_95%_us
1688 -17.2% 1397 fio.write_clat_99%_us
1491 -18.7% 1212 ± 2% fio.write_clat_mean_us
284.55 -11.4% 252.08 ± 2% fio.write_clat_stddev
580741 +19.5% 693756 ± 2% fio.write_iops
0.33 ± 6% +0.1 0.40 ± 5% mpstat.cpu.all.usr%
23764 ±105% +160.9% 62001 ± 31% numa-numastat.node1.other_node
23764 ±105% +160.9% 62001 ± 31% numa-vmstat.node1.numa_other
5.635e+09 -16.4% 4.71e+09 ± 2% cpuidle..time
11466994 -16.3% 9598650 ± 2% cpuidle..usage
20.26 ± 24% +57.5% 31.91 ± 14% sched_debug.cfs_rq:/.util_est_enqueued.avg
92.40 ± 15% +32.5% 122.41 ± 12% sched_debug.cfs_rq:/.util_est_enqueued.stddev
113.13 -10.2% 101.64 ± 2% uptime.boot
10239 -10.7% 9143 ± 2% uptime.idle
11391 ± 6% -20.5% 9059 ± 9% turbostat.C1
11372369 -22.9% 8768716 ± 20% turbostat.C1E
11580573 -16.3% 9697445 ± 2% turbostat.IRQ
1229 ± 11% -31.3% 845.17 ± 23% turbostat.POLL
403186 -4.6% 384712 proc-vmstat.numa_hit
316491 -5.8% 298019 proc-vmstat.numa_local
403264 -4.7% 384256 proc-vmstat.pgalloc_normal
319706 -9.1% 290634 proc-vmstat.pgfault
17842 -10.6% 15948 proc-vmstat.pgreuse
7.85 ± 93% -3.0 4.89 ±186% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
6.24 ±100% -2.1 4.17 ±141% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.fault_in_readable.fault_in_iov_iter_readable.generic_perform_write.__generic_file_write_iter
5.81 ± 93% -1.2 4.58 ±154% perf-profile.calltrace.cycles-pp.telldir
7.85 ± 93% -3.0 4.89 ±186% perf-profile.children.cycles-pp.do_fault
4.24 ±104% -0.7 3.53 ±144% perf-profile.children.cycles-pp.number
4.52 ± 79% +0.1 4.58 ±154% perf-profile.children.cycles-pp.telldir
4.24 ±104% -0.7 3.53 ±144% perf-profile.self.cycles-pp.number
4524366 ± 3% +298.3% 18020724 ±164% perf-stat.i.cache-references
102.84 +2.9% 105.79 perf-stat.i.cpu-migrations
9.80 +22.4% 12.00 ± 3% perf-stat.i.major-faults
49.44 ± 3% +278.1% 186.93 ±161% perf-stat.i.metric.K/sec
32.06 ± 11% +8.6 40.64 ± 14% perf-stat.overall.node-store-miss-rate%
15257 -15.1% 12959 perf-stat.overall.path-length
4447276 ± 3% +296.8% 17647918 ±164% perf-stat.ps.cache-references
101.10 +2.5% 103.65 perf-stat.ps.cpu-migrations
9.62 +21.9% 11.73 ± 2% perf-stat.ps.major-faults
5.12e+11 -15.1% 4.348e+11 perf-stat.total.instructions
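A rough consistency check on the nr_task=1 numbers above: the elapsed time (58.12s and 48.74s) is well below the 300s runtime, which would fit a single task covering the whole test_size once and then exiting. The sketch below estimates that time from the reported bandwidth. It assumes test_size=128G means 128 GiB, that fio.write_bw_MBps is in MiB/s, and that the job stops after one pass over the file; none of these is confirmed by the report, and the function name is hypothetical.

```python
# Estimate how long one pass over test_size takes at the reported bandwidth.
# Assumptions (not confirmed by the report): 128G == 128 GiB, MBps == MiB/s,
# and the single fio task finishes after covering the file once.

def seconds_to_cover(test_size_gib, bandwidth_mib_per_s):
    """Time in seconds to cover test_size_gib at a steady bandwidth."""
    return test_size_gib * 1024 / bandwidth_mib_per_s

for label, bw, elapsed in (
    ("fbe7e52003", 2268, 58.12),   # parent commit
    ("0b02c8c0d7", 2709, 48.74),   # commit under test
):
    estimate = seconds_to_cover(128, bw)
    print(f"{label}: estimated {estimate:.1f}s, reported elapsed {elapsed}s")
```

The estimates come out slightly below the reported fio.time.elapsed_time values, which is plausible if the elapsed time also includes setup and teardown.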
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-5.17.0-rc1-00004-g0b02c8c0d75a" of type "text/plain" (161090 bytes)
View attachment "job-script" of type "text/plain" (8449 bytes)
View attachment "job.yaml" of type "text/plain" (5702 bytes)
View attachment "reproduce" of type "text/plain" (726 bytes)