Message-ID: <20210420061014.GC31773@xsang-OptiPlex-9020>
Date: Tue, 20 Apr 2021 14:10:14 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Shaokun Zhang <zhangshaokun@...ilicon.com>
Cc: 0day robot <lkp@...el.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Shaokun Zhang <zhangshaokun@...ilicon.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com,
linux-fsdevel@...r.kernel.org, Yuqi Jin <jinyuqi@...wei.com>
Subject: [fs] aec499039e: unixbench.score 19.2% improvement
Greetings,
FYI, we noticed a 19.2% improvement in unixbench.score due to commit:
commit: aec499039e7b21224ef29e5a2daba328aec14442 ("[PATCH] fs: Optimized file struct to improve performance")
url: https://github.com/0day-ci/linux/commits/Shaokun-Zhang/fs-Optimized-file-struct-to-improve-performance/20210409-114859
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 5e46d1b78a03d52306f21f77a4e4a144b6d31486
in testcase: unixbench
on test machine: 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory
with following parameters:
runtime: 300s
nr_task: 30%
test: syscall
cpufreq_governor: performance
ucode: 0x4003006
test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2sp4/syscall/unixbench/0x4003006
commit:
5e46d1b78a ("reiserfs: update reiserfs_xattrs_initialized() condition")
aec499039e ("fs: Optimized file struct to improve performance")
5e46d1b78a03d523 aec499039e7b21224ef29e5a2da
---------------- ---------------------------
%stddev %change %stddev
\ | \
2768 +19.2% 3298 unixbench.score
176.43 +19.8% 211.43 unixbench.time.user_time
1.622e+09 +19.2% 1.933e+09 unixbench.workload
348.17 ± 48% -25.2% 260.57 ± 68% proc-vmstat.nr_mlock
4081405 ±133% -99.2% 33639 ± 15% turbostat.C1
1.348e+10 ± 89% -76.6% 3.151e+09 ±190% cpuidle.C6.time
1360129 ±137% -86.4% 184629 ± 2% cpuidle.POLL.time
1.00 ± 10% -0.2 0.81 ± 3% mpstat.cpu.all.irq%
0.49 +0.1 0.59 mpstat.cpu.all.usr%
0.01 ± 23% -36.4% 0.00 ± 13% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
0.06 ± 43% -48.4% 0.03 ± 42% perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
0.05 ± 49% -55.1% 0.02 ± 47% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
765.20 ± 20% -34.3% 502.83 ± 29% perf-sched.wait_and_delay.avg.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
1930 ± 13% -31.8% 1316 ± 30% perf-sched.wait_and_delay.max.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
765.19 ± 20% -34.3% 502.82 ± 29% perf-sched.wait_time.avg.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
1930 ± 13% -31.8% 1316 ± 30% perf-sched.wait_time.max.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
2787 ±215% -100.0% 0.71 ±162% interrupts.124:PCI-MSI.31981657-edge.i40e-eth0-TxRx-88
385.17 ±128% -99.9% 0.29 ±158% interrupts.61:PCI-MSI.31981594-edge.i40e-eth0-TxRx-25
4052 ± 49% -57.3% 1732 ±102% interrupts.CPU27.NMI:Non-maskable_interrupts
4052 ± 49% -57.3% 1732 ±102% interrupts.CPU27.PMI:Performance_monitoring_interrupts
438.67 ±122% +697.3% 3497 ± 37% interrupts.CPU3.NMI:Non-maskable_interrupts
438.67 ±122% +697.3% 3497 ± 37% interrupts.CPU3.PMI:Performance_monitoring_interrupts
289.00 ± 84% +1542.3% 4746 ± 24% interrupts.CPU51.NMI:Non-maskable_interrupts
289.00 ± 84% +1542.3% 4746 ± 24% interrupts.CPU51.PMI:Performance_monitoring_interrupts
135.17 ± 18% -29.9% 94.71 ± 26% interrupts.CPU59.RES:Rescheduling_interrupts
4872 ± 27% -48.9% 2490 ± 90% interrupts.CPU74.NMI:Non-maskable_interrupts
4872 ± 27% -48.9% 2490 ± 90% interrupts.CPU74.PMI:Performance_monitoring_interrupts
2786 ±215% -100.0% 0.43 ±169% interrupts.CPU88.124:PCI-MSI.31981657-edge.i40e-eth0-TxRx-88
13.38 ± 7% -13.4 0.00 perf-profile.calltrace.cycles-pp.dnotify_flush.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.66 ± 9% -6.4 24.27 ± 10% perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
30.82 ± 9% -6.4 24.46 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.10 ± 8% -1.3 5.85 ± 11% perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
7.14 ± 8% -1.2 5.89 ± 11% perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
7.18 ± 8% -1.2 5.93 ± 11% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__close
7.15 ± 8% -1.2 5.91 ± 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
7.27 ± 8% -1.2 6.04 ± 11% perf-profile.calltrace.cycles-pp.__close
5.29 ± 8% +5.4 10.68 ± 10% perf-profile.calltrace.cycles-pp.fput_many.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.39 ± 7% -13.3 0.07 ± 12% perf-profile.children.cycles-pp.dnotify_flush
37.79 ± 8% -7.6 30.16 ± 10% perf-profile.children.cycles-pp.filp_close
37.97 ± 8% -7.6 30.36 ± 10% perf-profile.children.cycles-pp.__x64_sys_close
7.30 ± 8% -1.2 6.07 ± 11% perf-profile.children.cycles-pp.__close
0.70 ± 10% -0.1 0.56 ± 10% perf-profile.children.cycles-pp.hrtimer_interrupt
0.71 ± 11% -0.1 0.57 ± 10% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.39 ± 16% -0.1 0.29 ± 9% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.27 ± 13% -0.1 0.22 ± 10% perf-profile.children.cycles-pp.tick_sched_timer
5.29 ± 8% +5.4 10.69 ± 10% perf-profile.children.cycles-pp.fput_many
13.33 ± 7% -13.3 0.06 ± 11% perf-profile.self.cycles-pp.dnotify_flush
5.27 ± 8% +5.4 10.64 ± 10% perf-profile.self.cycles-pp.fput_many
17.97 ± 46% -58.2% 7.51 ± 16% perf-stat.i.MPKI
1.073e+09 +16.2% 1.247e+09 perf-stat.i.branch-instructions
2.60 ± 34% -1.0 1.62 ± 2% perf-stat.i.branch-miss-rate%
56435130 ± 26% -31.6% 38588360 ± 21% perf-stat.i.cache-references
12.06 ± 3% -15.8% 10.16 perf-stat.i.cpi
0.10 ±100% -0.1 0.02 ±202% perf-stat.i.dTLB-load-miss-rate%
1.682e+09 +16.9% 1.965e+09 perf-stat.i.dTLB-loads
0.03 ± 93% -0.0 0.01 ±142% perf-stat.i.dTLB-store-miss-rate%
1.11e+09 +17.5% 1.304e+09 perf-stat.i.dTLB-stores
5.314e+09 +16.2% 6.176e+09 perf-stat.i.instructions
0.10 ± 11% +18.1% 0.12 ± 2% perf-stat.i.ipc
40.93 +16.1% 47.51 perf-stat.i.metric.M/sec
89.63 ± 2% +6.5 96.16 perf-stat.i.node-load-miss-rate%
3653512 ± 3% -57.3% 1561506 perf-stat.i.node-load-misses
371566 ± 19% -90.8% 34031 ± 8% perf-stat.i.node-loads
10.59 ± 25% -40.8% 6.27 ± 23% perf-stat.overall.MPKI
1.92 ± 8% -0.3 1.61 perf-stat.overall.branch-miss-rate%
13.04 -14.2% 11.19 perf-stat.overall.cpi
0.02 ± 89% -0.0 0.00 ±148% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 72% -0.0 0.00 ± 69% perf-stat.overall.dTLB-store-miss-rate%
318.50 +13.2% 360.58 perf-stat.overall.instructions-per-iTLB-miss
0.08 +16.6% 0.09 perf-stat.overall.ipc
90.76 ± 2% +7.1 97.87 perf-stat.overall.node-load-miss-rate%
1286 -2.7% 1251 perf-stat.overall.path-length
1.072e+09 +16.2% 1.246e+09 perf-stat.ps.branch-instructions
1.68e+09 +16.9% 1.964e+09 perf-stat.ps.dTLB-loads
1.109e+09 +17.6% 1.303e+09 perf-stat.ps.dTLB-stores
5.307e+09 +16.3% 6.171e+09 perf-stat.ps.instructions
3649615 ± 3% -57.2% 1560409 perf-stat.ps.node-load-misses
371135 ± 19% -90.9% 33946 ± 8% perf-stat.ps.node-loads
2.086e+12 +16.0% 2.419e+12 perf-stat.total.instructions
10629 ± 12% -17.7% 8746 ± 8% softirqs.CPU10.RCU
9891 ± 7% -14.6% 8447 ± 9% softirqs.CPU13.RCU
43153 ± 3% -7.4% 39975 ± 4% softirqs.CPU30.SCHED
9938 ± 6% -12.9% 8660 ± 2% softirqs.CPU33.RCU
9900 ± 9% -14.1% 8500 ± 5% softirqs.CPU38.RCU
9730 ± 6% -10.3% 8731 ± 7% softirqs.CPU40.RCU
10238 ± 8% -15.0% 8703 ± 9% softirqs.CPU44.RCU
10045 ± 10% -15.7% 8471 ± 6% softirqs.CPU45.RCU
10074 ± 7% -15.4% 8524 ± 6% softirqs.CPU46.RCU
9793 ± 6% -12.0% 8617 ± 8% softirqs.CPU49.RCU
10809 ± 18% -19.0% 8750 ± 8% softirqs.CPU50.RCU
10484 ± 7% -13.3% 9088 ± 10% softirqs.CPU53.RCU
10059 ± 7% -13.2% 8732 ± 7% softirqs.CPU54.RCU
10298 ± 4% -13.5% 8912 ± 7% softirqs.CPU55.RCU
9932 ± 8% -12.4% 8699 ± 5% softirqs.CPU60.RCU
10268 ± 9% -17.1% 8514 ± 7% softirqs.CPU61.RCU
9895 ± 5% -9.0% 9008 ± 5% softirqs.CPU67.RCU
10294 ± 8% -12.0% 9060 ± 5% softirqs.CPU68.RCU
11048 ± 14% -17.2% 9152 ± 6% softirqs.CPU69.RCU
9586 ± 7% -9.1% 8715 ± 5% softirqs.CPU74.RCU
9555 ± 7% -10.1% 8587 ± 5% softirqs.CPU76.RCU
9892 ± 10% -14.8% 8425 ± 5% softirqs.CPU80.RCU
9722 ± 6% -13.5% 8407 ± 6% softirqs.CPU82.RCU
9883 ± 6% -12.7% 8624 ± 4% softirqs.CPU83.RCU
9507 ± 5% -9.9% 8567 ± 4% softirqs.CPU84.RCU
9878 ± 8% -14.1% 8485 ± 3% softirqs.CPU85.RCU
37959 ± 4% -12.9% 33055 ± 6% softirqs.CPU85.SCHED
10338 ± 12% -16.6% 8623 ± 4% softirqs.CPU86.RCU
9885 ± 8% -14.8% 8423 ± 4% softirqs.CPU87.RCU
9934 ± 7% -12.9% 8649 ± 5% softirqs.CPU88.RCU
10119 ± 8% -16.0% 8502 ± 5% softirqs.CPU89.RCU
9958 ± 7% -13.5% 8612 ± 4% softirqs.CPU92.RCU
9917 ± 8% -14.3% 8498 ± 5% softirqs.CPU93.RCU
10070 ± 8% -14.3% 8625 ± 6% softirqs.CPU94.RCU
10157 ± 11% -11.7% 8967 ± 7% softirqs.CPU95.RCU
19377 ± 60% -69.7% 5871 ± 82% softirqs.NET_RX
944995 ± 4% -10.5% 845954 ± 6% softirqs.RCU
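As a reading aid for the comparison table above, here is a minimal sketch of how the %change column seems to be derived from the two mean columns. The helper names are hypothetical and not part of lkp-tests. It assumes that plain metrics show a relative change in percent, while metrics that are already percentages (mpstat, perf-profile, miss rates) show an absolute difference in percentage points. The table's own numbers are consistent with that reading.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change in percent (assumed meaning of %change for plain metrics)."""
    return (patched - base) / base * 100.0


def point_change(base_pct: float, patched_pct: float) -> float:
    """Absolute difference in percentage points (assumed for percentage metrics)."""
    return patched_pct - base_pct


# (base mean, patched mean) pairs copied from the table above
workload = (1.622e9, 1.933e9)   # unixbench.workload
user_time = (176.43, 211.43)    # unixbench.time.user_time
dnotify = (13.39, 0.07)         # perf-profile.children.cycles-pp.dnotify_flush

print(f"unixbench.workload       {pct_change(*workload):+.1f}%")
print(f"unixbench.time.user_time {pct_change(*user_time):+.1f}%")
print(f"dnotify_flush (children) {point_change(*dnotify):+.1f}")
```

Applying the same formula to the rounded unixbench.score means (2768 and 3298) gives about 19.1% rather than the reported 19.2%. Presumably the report computes the change from unrounded means.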
unixbench.score
[ASCII plot, layout lost in the archive: y-axis from 2400 to 3400; the O samples lie at roughly 3200-3300, the other series at roughly 2500-2800]
unixbench.workload
[ASCII plot, layout lost in the archive: y-axis from 1.4e+09 to 2e+09; the O samples lie at roughly 1.9e+09-2e+09, the other series at roughly 1.5e+09-1.6e+09]
[*] bisect-good sample
[O] bisect-bad sample
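Some perf-stat rows in the comparison table can be cross-checked against each other. The sketch below assumes that ipc is the reciprocal of cpi and that path-length is total instructions divided by unixbench.workload. The second definition is inferred from the numbers and is not stated in this report. The function names are hypothetical.

```python
def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle as the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def path_length(total_instructions: float, workload: float) -> float:
    """Instructions retired per unit of workload (assumed definition)."""
    return total_instructions / workload


# Values copied from the perf-stat.overall, perf-stat.total and unixbench rows
runs = {
    "5e46d1b78a": {"cpi": 13.04, "instructions": 2.086e12, "workload": 1.622e9},
    "aec499039e": {"cpi": 11.19, "instructions": 2.419e12, "workload": 1.933e9},
}

for commit, r in runs.items():
    ipc = ipc_from_cpi(r["cpi"])
    plen = path_length(r["instructions"], r["workload"])
    print(f"{commit}: ipc={ipc:.2f} path-length={plen:.0f}")
```

Under these assumptions the results match the perf-stat.overall.ipc (0.08 and 0.09) and perf-stat.overall.path-length (1286 and 1251) rows. That suggests the patched kernel retires slightly fewer instructions per unit of work, and noticeably fewer cycles per instruction.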
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc5-00018-gaec499039e7b" of type "text/plain" (172853 bytes)
View attachment "job-script" of type "text/plain" (8205 bytes)
View attachment "job.yaml" of type "text/plain" (5486 bytes)
View attachment "reproduce" of type "text/plain" (280 bytes)