Message-ID: <20210408022629.GA1696@xsang-OptiPlex-9020>
Date: Thu, 8 Apr 2021 10:26:29 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Dave Chinner <dchinner@...hat.com>
Cc: "Darrick J. Wong" <djwong@...nel.org>,
Christoph Hellwig <hch@....de>,
LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [xfs] 1fea323ff0: aim7.jobs-per-min 2.4% improvement
Greetings,
FYI, we noticed a 2.4% improvement in aim7.jobs-per-min due to commit:
commit: 1fea323ff00526dcc04fbb4ee6e7d04e4e2ab0e1 ("xfs: reduce debug overhead of dir leaf/node checks")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: aim7
on test machine: 88 threads Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with the following parameters:
disk: 4BRD_12G
md: RAID1
fs: xfs
test: disk_rw
load: 3000
cpufreq_governor: performance
ucode: 0x5003006
test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
In addition, the commit also has a significant impact on the following tests:
+------------------+------------------------------------------------------------------------+
| testcase: change | aim7: aim7.jobs-per-min 1.6% improvement |
| test machine | 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory |
| test parameters | cpufreq_governor=performance |
| | disk=1BRD_48G |
| | fs=xfs |
| | load=3000 |
| | test=disk_rw |
| | ucode=0x700001e |
+------------------+------------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/4BRD_12G/xfs/x86_64-rhel-8.3/3000/RAID1/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/disk_rw/aim7/0x5003006
commit:
39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")
39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:6 33% 2:6 kmsg.XFS(md#):xlog_verify_grant_tail:space>BBTOB(tail_blocks)
%stddev %change %stddev
\ | \
505405 +2.4% 517621 aim7.jobs-per-min
35.82 -2.4% 34.98 aim7.time.elapsed_time
35.82 -2.4% 34.98 aim7.time.elapsed_time.max
2866 ± 35% +39.0% 3985 ± 3% interrupts.CPU53.NMI:Non-maskable_interrupts
2866 ± 35% +39.0% 3985 ± 3% interrupts.CPU53.PMI:Performance_monitoring_interrupts
286711 -2.5% 279423 proc-vmstat.nr_dirty
554636 -1.3% 547330 proc-vmstat.nr_file_pages
286865 -2.5% 279593 proc-vmstat.nr_inactive_file
286865 -2.5% 279593 proc-vmstat.nr_zone_inactive_file
287057 -2.6% 279704 proc-vmstat.nr_zone_write_pending
1.313e+10 +2.0% 1.34e+10 perf-stat.i.branch-instructions
52558 +2.7% 53962 perf-stat.i.context-switches
1942 +7.4% 2086 ± 2% perf-stat.i.cpu-migrations
1.9e+10 +2.0% 1.939e+10 perf-stat.i.dTLB-loads
1.061e+10 +2.5% 1.087e+10 perf-stat.i.dTLB-stores
6.606e+10 +2.1% 6.743e+10 perf-stat.i.instructions
487.84 +2.1% 498.03 perf-stat.i.metric.M/sec
3171946 +6.1% 3364545 perf-stat.i.node-store-misses
10014711 +2.8% 10299278 perf-stat.i.node-stores
24.04 +0.6 24.62 perf-stat.overall.node-store-miss-rate%
1.286e+10 +2.0% 1.311e+10 perf-stat.ps.branch-instructions
51473 +2.6% 52806 perf-stat.ps.context-switches
1903 +7.2% 2040 ± 2% perf-stat.ps.cpu-migrations
1.861e+10 +2.0% 1.898e+10 perf-stat.ps.dTLB-loads
1.039e+10 +2.4% 1.064e+10 perf-stat.ps.dTLB-stores
6.469e+10 +2.0% 6.598e+10 perf-stat.ps.instructions
3106311 +6.0% 3293166 perf-stat.ps.node-store-misses
9812026 +2.8% 10082006 perf-stat.ps.node-stores
2.29 ± 7% -0.2 2.07 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
2.28 ± 7% -0.2 2.06 ± 2% perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
2.31 ± 7% -0.2 2.10 ± 2% perf-profile.calltrace.cycles-pp.unlink
2.29 ± 7% -0.2 2.08 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
1.66 ± 8% -0.2 1.48 ± 4% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
2.00 ± 6% -0.2 1.83 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
2.00 ± 6% -0.2 1.83 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
1.98 ± 6% -0.2 1.80 ± 2% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.00 ± 6% -0.2 1.83 perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
2.00 ± 6% -0.2 1.83 perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
1.97 ± 6% -0.2 1.80 ± 2% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
2.01 ± 6% -0.2 1.84 perf-profile.calltrace.cycles-pp.creat64
0.90 ± 11% -0.1 0.79 ± 3% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.92 ± 9% -0.1 0.81 ± 2% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
0.73 ± 11% -0.1 0.63 ± 3% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
0.69 ± 6% -0.1 0.61 ± 6% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.58 ± 8% -0.3 2.29 ± 3% perf-profile.children.cycles-pp.rwsem_down_write_slowpath
2.29 ± 7% -0.2 2.06 ± 2% perf-profile.children.cycles-pp.do_unlinkat
2.32 ± 7% -0.2 2.10 ± 2% perf-profile.children.cycles-pp.unlink
1.62 ± 11% -0.2 1.42 ± 3% perf-profile.children.cycles-pp.rwsem_spin_on_owner
2.08 ± 6% -0.2 1.90 perf-profile.children.cycles-pp.do_sys_open
2.04 ± 6% -0.2 1.86 ± 2% perf-profile.children.cycles-pp.do_filp_open
2.07 ± 6% -0.2 1.90 perf-profile.children.cycles-pp.do_sys_openat2
2.02 ± 6% -0.2 1.85 ± 2% perf-profile.children.cycles-pp.creat64
2.03 ± 6% -0.2 1.86 perf-profile.children.cycles-pp.path_openat
0.82 ± 6% -0.1 0.72 ± 5% perf-profile.children.cycles-pp.osq_lock
0.18 ± 84% -0.1 0.09 ± 9% perf-profile.children.cycles-pp.xfs_vn_lookup
0.50 ± 2% -0.1 0.44 ± 2% perf-profile.children.cycles-pp.__fsnotify_parent
0.14 ± 6% -0.0 0.10 ± 7% perf-profile.children.cycles-pp.write@plt
0.12 ± 11% -0.0 0.09 perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
0.11 ± 18% -0.0 0.08 ± 8% perf-profile.children.cycles-pp.xfs_dir_lookup
0.22 ± 7% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.update_process_times
0.09 ± 8% -0.0 0.07 ± 14% perf-profile.children.cycles-pp.xfs_dir2_node_lookup
0.25 ± 44% +0.1 0.35 ± 5% perf-profile.children.cycles-pp.xfs_file_llseek
1.61 ± 11% -0.2 1.41 ± 3% perf-profile.self.cycles-pp.rwsem_spin_on_owner
0.81 ± 6% -0.1 0.72 ± 5% perf-profile.self.cycles-pp.osq_lock
0.48 ± 2% -0.1 0.41 ± 3% perf-profile.self.cycles-pp.__fsnotify_parent
0.10 ± 6% -0.1 0.04 ± 44% perf-profile.self.cycles-pp.write@plt
0.24 ± 44% +0.1 0.34 ± 5% perf-profile.self.cycles-pp.xfs_file_llseek
0.77 ± 13% +0.2 0.94 ± 4% perf-profile.self.cycles-pp.xfs_file_buffered_write
[ASCII plot: aim7.jobs-per-min per sample, y-axis from 480000 to 540000]
[*] bisect-good sample
[O] bisect-bad sample
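The %change column in the comparison above can be recomputed from the two means. The following is a minimal sketch in Python, assuming %change is the relative difference between the commit's mean and the parent's mean. The helper name `pct_change` is hypothetical and is not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# aim7.jobs-per-min means for 39d3c0b596 (parent) and 1fea323ff0 (commit),
# copied from the table above.
parent_jpm = 505405
commit_jpm = 517621

print(f"{pct_change(parent_jpm, commit_jpm):+.1f}%")  # prints "+2.4%"
```

Rows whose displayed means are already rounded (for example aim7.time.elapsed_time) may differ from the reported %change in the last digit when recomputed this way.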
***************************************************************************************************
lkp-cpl-4sp1: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/1BRD_48G/xfs/x86_64-rhel-8.3/3000/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/disk_rw/aim7/0x700001e
commit:
39d3c0b596 ("xfs: No need for inode number error injection in __xfs_dir3_data_check")
1fea323ff0 ("xfs: reduce debug overhead of dir leaf/node checks")
39d3c0b5968b5421 1fea323ff00526dcc04fbb4ee6e
---------------- ---------------------------
%stddev %change %stddev
\ | \
500977 +1.6% 509113 aim7.jobs-per-min
36.14 -1.6% 35.57 aim7.time.elapsed_time
36.14 -1.6% 35.57 aim7.time.elapsed_time.max
40.93 ± 2% -4.3% 39.19 aim7.time.user_time
28267 ± 79% -81.7% 5164 ± 5% numa-meminfo.node2.KernelStack
28180 ± 78% -81.7% 5162 ± 5% numa-vmstat.node2.nr_kernel_stack
291109 -1.6% 286393 proc-vmstat.nr_dirty
11049 ± 5% +9.0% 12039 ± 4% slabinfo.pde_opener.active_objs
11049 ± 5% +9.0% 12039 ± 4% slabinfo.pde_opener.num_objs
1579 ± 33% +29.9% 2051 ± 25% interrupts.CPU109.NMI:Non-maskable_interrupts
1579 ± 33% +29.9% 2051 ± 25% interrupts.CPU109.PMI:Performance_monitoring_interrupts
1785 ± 30% +45.8% 2602 ± 8% interrupts.CPU117.NMI:Non-maskable_interrupts
1785 ± 30% +45.8% 2602 ± 8% interrupts.CPU117.PMI:Performance_monitoring_interrupts
891.67 ± 8% +99.4% 1778 ± 47% interrupts.CPU4.CAL:Function_call_interrupts
1.301e+10 +1.6% 1.322e+10 perf-stat.i.branch-instructions
52509 +2.1% 53602 perf-stat.i.context-switches
1.89e+10 +1.8% 1.924e+10 perf-stat.i.dTLB-loads
1.061e+10 +1.9% 1.081e+10 perf-stat.i.dTLB-stores
6.554e+10 +1.6% 6.66e+10 perf-stat.i.instructions
296.86 +1.8% 302.18 perf-stat.i.metric.M/sec
76.63 +1.0 77.63 perf-stat.i.node-load-miss-rate%
3774653 ± 2% +6.6% 4025641 ± 3% perf-stat.i.node-loads
4414091 ± 3% +7.6% 4747750 perf-stat.i.node-store-misses
9344160 +2.3% 9559103 perf-stat.i.node-stores
32.07 ± 2% +1.1 33.18 perf-stat.overall.node-store-miss-rate%
1.271e+10 +1.8% 1.293e+10 perf-stat.ps.branch-instructions
51278 +2.3% 52440 perf-stat.ps.context-switches
1.846e+10 +2.0% 1.883e+10 perf-stat.ps.dTLB-loads
1.036e+10 +2.1% 1.057e+10 perf-stat.ps.dTLB-stores
6.4e+10 +1.8% 6.516e+10 perf-stat.ps.instructions
3686827 ± 2% +6.9% 3940762 ± 3% perf-stat.ps.node-loads
4310625 ± 3% +7.8% 4645614 perf-stat.ps.node-store-misses
9127740 +2.5% 9355048 perf-stat.ps.node-stores
2.54 -0.2 2.30 ± 4% perf-profile.calltrace.cycles-pp.creat64
2.53 -0.2 2.29 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
2.52 -0.2 2.28 ± 4% perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
2.50 -0.2 2.26 ± 4% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.50 -0.2 2.26 ± 4% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
2.52 -0.2 2.28 ± 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
2.52 -0.2 2.28 ± 4% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
2.82 ± 2% -0.2 2.62 ± 3% perf-profile.calltrace.cycles-pp.unlink
2.79 ± 2% -0.2 2.59 ± 3% perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
2.80 ± 2% -0.2 2.61 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
2.79 ± 2% -0.2 2.60 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
2.10 -0.1 1.95 ± 4% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
1.21 ± 3% -0.1 1.10 ± 4% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2.do_sys_open
0.96 ± 5% -0.1 0.87 ± 4% perf-profile.calltrace.cycles-pp.xfs_generic_create.path_openat.do_filp_open.do_sys_openat2.do_sys_open
1.07 ± 3% -0.1 0.99 ± 3% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.92 ± 2% -0.1 0.85 ± 4% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_slowpath.path_openat.do_filp_open.do_sys_openat2
3.31 -0.3 3.06 ± 4% perf-profile.children.cycles-pp.rwsem_down_write_slowpath
2.56 -0.2 2.31 ± 4% perf-profile.children.cycles-pp.do_filp_open
2.55 ± 2% -0.2 2.30 ± 4% perf-profile.children.cycles-pp.creat64
2.56 -0.2 2.31 ± 4% perf-profile.children.cycles-pp.path_openat
2.60 -0.2 2.36 ± 4% perf-profile.children.cycles-pp.do_sys_open
2.60 -0.2 2.36 ± 4% perf-profile.children.cycles-pp.do_sys_openat2
2.83 ± 2% -0.2 2.63 ± 3% perf-profile.children.cycles-pp.unlink
2.79 ± 2% -0.2 2.60 ± 3% perf-profile.children.cycles-pp.do_unlinkat
1.99 ± 2% -0.1 1.84 ± 3% perf-profile.children.cycles-pp.rwsem_spin_on_owner
0.96 ± 5% -0.1 0.87 ± 4% perf-profile.children.cycles-pp.xfs_generic_create
0.45 ± 5% -0.1 0.37 ± 7% perf-profile.children.cycles-pp.__fsnotify_parent
0.17 ± 5% -0.0 0.13 ± 9% perf-profile.children.cycles-pp.write@plt
0.12 ± 4% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.xfs_dir2_leafn_lookup_for_entry
0.09 ± 7% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.generic_file_llseek_size
0.09 ± 7% -0.0 0.07 ± 10% perf-profile.children.cycles-pp.xfs_dir2_node_lookup
0.08 ± 11% -0.0 0.06 ± 7% perf-profile.children.cycles-pp.wake_up_q
1.97 ± 2% -0.2 1.82 ± 3% perf-profile.self.cycles-pp.rwsem_spin_on_owner
0.43 ± 5% -0.1 0.34 ± 7% perf-profile.self.cycles-pp.__fsnotify_parent
1.17 ± 3% -0.1 1.10 ± 4% perf-profile.self.cycles-pp.write
0.10 ± 7% -0.1 0.05 ± 45% perf-profile.self.cycles-pp.write@plt
0.09 ± 7% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.generic_file_llseek_size
0.19 ± 3% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.xfs_get_extsz_hint
0.21 ± 6% +0.0 0.24 ± 6% perf-profile.self.cycles-pp.propagate_protected_usage
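Some rows in the tables above report an absolute difference rather than a percentage. For perf-stat.overall.node-store-miss-rate%, the +0.6 and +1.1 entries are changes in percentage points. The reported rates also appear consistent, to within rounding, with misses / (misses + stores) computed from the perf-stat.ps rows. The following Python sketch works under that assumption: the formula is inferred from the numbers in this report, and the helper name is hypothetical.

```python
def node_store_miss_rate(misses: float, stores: float) -> float:
    """Miss rate in percent, assuming rate = misses / (misses + stores)."""
    return misses / (misses + stores) * 100.0


# perf-stat.ps.node-store-misses and perf-stat.ps.node-stores from the
# lkp-cpl-4sp1 table above, for 39d3c0b596 (parent) and 1fea323ff0 (commit).
parent = node_store_miss_rate(4310625, 9127740)
commit = node_store_miss_rate(4645614, 9355048)

# The difference is in percentage points, not a relative change.
print(f"{parent:.2f} -> {commit:.2f} ({commit - parent:+.1f} points)")
```

Because the inputs are themselves rounded averages, the recomputed rates can differ from the reported 32.07 and 33.18 in the last digit.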
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc4-00020-g1fea323ff005" of type "text/plain" (172899 bytes)
View attachment "job-script" of type "text/plain" (8050 bytes)
View attachment "job.yaml" of type "text/plain" (5508 bytes)
View attachment "reproduce" of type "text/plain" (1026 bytes)