Message-ID: <20201030081456.GY31092@shao2-debian>
Date: Fri, 30 Oct 2020 16:14:56 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Kent Overstreet <kent.overstreet@...il.com>
Cc: linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
Kent Overstreet <kent.overstreet@...il.com>, axboe@...nel.dk,
willy@...radead.org, linux-fsdevel@...r.kernel.org,
0day robot <lkp@...el.com>, lkp@...ts.01.org,
ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [fs] 2b2f891180: stress-ng.sendfile.ops_per_sec 32.0% improvement
Greetings,

FYI, we noticed a 32.0% improvement in stress-ng.sendfile.ops_per_sec due to the following commit:
commit: 2b2f89118025e62137e4d1514866069b24d810a4 ("[PATCH v2 2/2] fs: generic_file_buffered_read() now uses find_get_pages_contig")
url: https://github.com/0day-ci/linux/commits/Kent-Overstreet/generic_file_buffered_read-improvements/20201026-053158
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 986b9eacb25910865b50e5f298aa8e2df7642f1b
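For context: the patch batches the page-cache lookup in generic_file_buffered_read() so that a single walk can return a whole run of contiguous cached pages, instead of looking pages up one at a time. The following is a minimal sketch of that pattern only, not the actual patch; read_pages_batched() and BATCH_PAGES are hypothetical illustration names, while find_get_pages_contig() and put_page() are the real v5.9 pagemap APIs:

/*
 * Context sketch only -- NOT the actual patch.  It shows the shape of
 * a batched page-cache lookup: one find_get_pages_contig() call per
 * batch rather than one lookup per page.
 */
#include <linux/kernel.h>	/* min_t() */
#include <linux/pagemap.h>	/* find_get_pages_contig(), put_page() */

#define BATCH_PAGES 16		/* hypothetical batch size */

static void read_pages_batched(struct address_space *mapping,
			       pgoff_t index, unsigned long nr_pages)
{
	struct page *pages[BATCH_PAGES];
	unsigned int i, nr;

	while (nr_pages) {
		unsigned int want = min_t(unsigned long, nr_pages,
					  BATCH_PAGES);

		/* One page-cache walk returns up to 'want' contiguous
		 * pages starting at 'index', each with a reference held. */
		nr = find_get_pages_contig(mapping, index, want, pages);
		if (!nr)
			break;	/* hole or uncached page: caller falls back */

		for (i = 0; i < nr; i++) {
			/* ...copy page contents out to the user buffer... */
			put_page(pages[i]);	/* drop our reference */
		}
		index += nr;
		nr_pages -= nr;
	}
}

The actual change is in the patch linked above; this sketch just illustrates why per-read lookup overhead drops when reads span many cached pages.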
in testcase: stress-ng
on test machine: 96-thread Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G of memory
with the following parameters:
nr_threads: 100%
disk: 1HDD
testtime: 30s
class: pipe
cpufreq_governor: performance
ucode: 0x5002f01
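For reference, the sendfile stressor (part of stress-ng's pipe class, as configured above) essentially loops sendfile(2) from a regular file to a sink and counts iterations as ops; since a regular-file sendfile reads through the page cache, it exercises the buffered-read path the patch changes. A minimal userspace sketch of such a measurement loop, with a hypothetical source file path and transfer size, could be:

/*
 * Minimal sketch of a sendfile ops/sec loop in the spirit of the
 * stress-ng sendfile stressor.  The source path and size are
 * hypothetical; create /tmp/sendfile.src (>= 1 MiB) before running.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <sys/sendfile.h>
#include <sys/types.h>

int main(void)
{
	const size_t len = 1 << 20;	/* 1 MiB per sendfile() call */
	const int seconds = 30;		/* mirrors testtime: 30s */
	int in = open("/tmp/sendfile.src", O_RDONLY);
	int out = open("/dev/null", O_WRONLY);
	long ops = 0;
	time_t end = time(NULL) + seconds;

	if (in < 0 || out < 0) {
		perror("open");
		return EXIT_FAILURE;
	}
	while (time(NULL) < end) {
		off_t off = 0;	/* rewind so every op reads cached pages */
		if (sendfile(out, in, &off, len) < 0) {
			perror("sendfile");
			return EXIT_FAILURE;
		}
		ops++;
	}
	printf("%ld ops, %.1f ops/sec\n", ops, (double)ops / seconds);
	close(in);
	close(out);
	return EXIT_SUCCESS;
}

Because the source file stays fully cached after the first pass, each op is dominated by page-cache lookup and copy cost, which plausibly explains why batching the lookups shows up directly in ops_per_sec.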
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
pipe/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/stress-ng/30s/0x5002f01
commit:
aa5222a7e8 ("fs: Break generic_file_buffered_read up into multiple functions")
2b2f891180 ("fs: generic_file_buffered_read() now uses find_get_pages_contig")
aa5222a7e8f6ab9e 2b2f89118025e62137e4d151486
---------------- ---------------------------
fail:runs    %reproduction    fail:runs
     :4           25%             1:4     dmesg.WARNING:missing_R10_value_at__fsnotify_parent/0x
old value ± %stddev      %change      new value ± %stddev      metric
1.38e+08 ± 8% -21.4% 1.084e+08 ± 18% stress-ng.pipe.ops
4598964 ± 8% -21.4% 3614448 ± 18% stress-ng.pipe.ops_per_sec
14543902 +32.0% 19194451 stress-ng.sendfile.ops
484783 +32.0% 639800 stress-ng.sendfile.ops_per_sec
9893 -20.7% 7844 stress-ng.time.maximum_resident_set_size
259655 ± 3% +11.0% 288128 ± 5% cpuidle.POLL.time
214144 -3.0% 207721 ± 2% vmstat.system.in
16311 ± 22% -17.2% 13502 ± 23% numa-meminfo.node0.KernelStack
37461 ± 42% -42.0% 21717 ± 60% numa-meminfo.node0.PageTables
147305 ± 16% -17.5% 121457 ± 18% numa-meminfo.node0.SUnreclaim
2102 ± 5% -9.5% 1901 ± 4% slabinfo.PING.active_objs
2102 ± 5% -9.5% 1901 ± 4% slabinfo.PING.num_objs
3295 ± 4% -6.9% 3069 ± 5% slabinfo.sock_inode_cache.active_objs
3295 ± 4% -6.9% 3069 ± 5% slabinfo.sock_inode_cache.num_objs
2.303e+10 -4.0% 2.21e+10 ± 2% perf-stat.i.branch-instructions
209728 -4.5% 200298 ± 2% perf-stat.i.cpu-migrations
2110607 -8.5% 1932169 ± 2% perf-stat.i.node-loads
0.48 +0.0 0.52 perf-stat.overall.branch-miss-rate%
2.261e+10 -3.8% 2.174e+10 ± 3% perf-stat.ps.branch-instructions
2109192 -8.2% 1936642 ± 2% perf-stat.ps.node-loads
24778 ± 2% -5.4% 23434 ± 2% proc-vmstat.nr_active_anon
76250 +6.2% 81012 ± 9% proc-vmstat.nr_anon_pages
80059 +6.0% 84835 ± 8% proc-vmstat.nr_inactive_anon
28762 ± 2% -4.6% 27434 ± 2% proc-vmstat.nr_shmem
24778 ± 2% -5.4% 23434 ± 2% proc-vmstat.nr_zone_active_anon
80059 +6.0% 84835 ± 8% proc-vmstat.nr_zone_inactive_anon
5269 +38.8% 7313 ± 16% sched_debug.cfs_rq:/.load.min
274.69 +1285.1% 3804 ±142% sched_debug.cfs_rq:/.load_avg.max
46.74 ± 4% +1059.0% 541.77 ±144% sched_debug.cfs_rq:/.load_avg.stddev
3.97 ± 26% -45.5% 2.16 ± 57% sched_debug.cfs_rq:/.removed.load_avg.avg
31.04 ± 11% -35.7% 19.95 ± 57% sched_debug.cfs_rq:/.removed.load_avg.stddev
1850 ± 3% -9.8% 1669 ± 2% sched_debug.cfs_rq:/.runnable_avg.max
-65895 -277.0% 116623 ±123% sched_debug.cfs_rq:/.spread0.avg
1253 ± 2% -11.9% 1105 ± 2% sched_debug.cfs_rq:/.util_avg.max
120.28 ± 11% -20.8% 95.27 ± 12% sched_debug.cfs_rq:/.util_avg.stddev
90.38 ± 52% +87.0% 169.00 ± 14% sched_debug.cfs_rq:/.util_est_enqueued.min
91.18 -0.8 90.33 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
93.72 -0.5 93.22 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
0.71 ± 17% +0.2 0.91 ± 13% perf-profile.calltrace.cycles-pp.common_file_perm.security_file_permission.vfs_read.ksys_read.do_syscall_64
1.08 ± 15% +0.3 1.36 ± 13% perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
91.28 -0.8 90.44 perf-profile.children.cycles-pp.do_syscall_64
93.81 -0.5 93.32 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.06 ± 20% +0.0 0.08 ± 10% perf-profile.children.cycles-pp.__x64_sys_write
0.06 ± 22% +0.0 0.08 ± 15% perf-profile.children.cycles-pp.prepare_to_wait_event
0.04 ± 63% +0.0 0.07 ± 20% perf-profile.children.cycles-pp.clockevents_program_event
0.02 ±173% +0.1 0.07 ± 22% perf-profile.children.cycles-pp.ktime_get
0.18 ± 14% +0.1 0.28 ± 17% perf-profile.children.cycles-pp.finish_task_switch
0.53 ± 10% +0.2 0.71 ± 14% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
1.97 ± 9% +0.2 2.20 ± 4% perf-profile.children.cycles-pp.mutex_lock
1.45 ± 18% +0.3 1.77 ± 13% perf-profile.children.cycles-pp.common_file_perm
2.08 ± 16% +0.5 2.53 ± 13% perf-profile.children.cycles-pp.security_file_permission
0.11 ± 22% +0.0 0.16 ± 17% perf-profile.self.cycles-pp.copyin
0.03 ±100% +0.0 0.07 ± 10% perf-profile.self.cycles-pp.__x64_sys_write
0.19 ± 19% +0.0 0.24 ± 14% perf-profile.self.cycles-pp.ksys_write
0.36 ± 11% +0.1 0.46 ± 15% perf-profile.self.cycles-pp.security_file_permission
0.63 ± 17% +0.2 0.78 ± 12% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.16 ± 18% +0.3 1.44 ± 13% perf-profile.self.cycles-pp.common_file_perm
0.00 +1.8e+104% 183.75 ±132% interrupts.93:PCI-MSI.31981626-edge.i40e-eth0-TxRx-57
7738 ± 2% -37.7% 4819 ± 34% interrupts.CPU21.NMI:Non-maskable_interrupts
7738 ± 2% -37.7% 4819 ± 34% interrupts.CPU21.PMI:Performance_monitoring_interrupts
7735 ± 2% -37.5% 4834 ± 36% interrupts.CPU23.NMI:Non-maskable_interrupts
7735 ± 2% -37.5% 4834 ± 36% interrupts.CPU23.PMI:Performance_monitoring_interrupts
4378 ± 23% +335.6% 19072 ± 95% interrupts.CPU42.CAL:Function_call_interrupts
6507 ± 21% +33.6% 8696 ± 22% interrupts.CPU45.CAL:Function_call_interrupts
17585 ± 34% -53.7% 8147 ± 80% interrupts.CPU51.CAL:Function_call_interrupts
30252 ± 76% -75.5% 7413 ± 67% interrupts.CPU53.CAL:Function_call_interrupts
22394 ± 68% -72.6% 6125 ± 67% interrupts.CPU55.CAL:Function_call_interrupts
21262 ± 42% -70.6% 6255 ± 70% interrupts.CPU56.CAL:Function_call_interrupts
9201 ± 83% +111.2% 19434 ± 64% interrupts.CPU56.RES:Rescheduling_interrupts
20517 ± 57% -58.0% 8627 ± 69% interrupts.CPU58.CAL:Function_call_interrupts
30292 ± 79% -69.4% 9282 ±103% interrupts.CPU59.CAL:Function_call_interrupts
20349 ± 56% -60.6% 8013 ± 91% interrupts.CPU6.CAL:Function_call_interrupts
21097 ± 55% -63.7% 7660 ± 96% interrupts.CPU61.CAL:Function_call_interrupts
11855 ± 80% +132.5% 27562 ± 65% interrupts.CPU63.RES:Rescheduling_interrupts
12953 ± 23% -52.3% 6181 ± 75% interrupts.CPU66.CAL:Function_call_interrupts
19018 ± 38% -65.0% 6660 ±102% interrupts.CPU67.CAL:Function_call_interrupts
26718 ± 70% -66.5% 8941 ± 54% interrupts.CPU7.CAL:Function_call_interrupts
12561 ± 98% +114.0% 26876 ± 72% interrupts.CPU7.RES:Rescheduling_interrupts
10208 ± 83% +113.2% 21760 ± 60% interrupts.CPU8.RES:Rescheduling_interrupts
5057 ± 16% +235.1% 16946 ±105% interrupts.CPU87.CAL:Function_call_interrupts
9429 ± 66% +62.9% 15363 ± 43% interrupts.CPU9.RES:Rescheduling_interrupts
stress-ng.sendfile.ops

  [run-to-run plot, y-axis 1.3e+07 to 2e+07: bisect-bad (O) samples cluster near 1.9e+07 ops; bisect-good (+) samples cluster near 1.4e+07-1.5e+07 ops]
stress-ng.sendfile.ops_per_sec

  [run-to-run plot, y-axis 460000 to 660000: bisect-bad (O) samples cluster near 620000-650000 ops/sec; bisect-good (+) samples cluster near 480000-500000 ops/sec]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.9.0-14770-g2b2f89118025" of type "text/plain" (171554 bytes)
View attachment "job-script" of type "text/plain" (8147 bytes)
View attachment "job.yaml" of type "text/plain" (5573 bytes)
View attachment "reproduce" of type "text/plain" (390 bytes)