Message-ID: <20201203064536.GE27350@xsang-OptiPlex-9020>
Date: Thu, 3 Dec 2020 14:45:36 +0800
From: kernel test robot <oliver.sang@...el.com>
To: David Howells <dhowells@...hat.com>
Cc: lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com,
Pavel Begunkov <asml.silence@...il.com>,
Matthew Wilcox <willy@...radead.org>,
Jens Axboe <axboe@...nel.dk>,
Alexander Viro <viro@...iv.linux.org.uk>, dhowells@...hat.com,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
Greetings,
FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:
commit: 9bd0e337c633aed3e8ec3c7397b7ae0b8436f163 ("[PATCH 01/29] iov_iter: Switch to using a table of operations")
url: https://github.com/0day-ci/linux/commits/David-Howells/RFC-iov_iter-Switch-to-using-an-ops-table/20201121-222344
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 27bba9c532a8d21050b94224ffd310ad0058c353
in testcase: will-it-scale
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
with the following parameters:

        nr_task: 50%
        mode: process
        test: pwrite1
        cpufreq_governor: performance
        ucode: 0x42e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
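
For readers unfamiliar with the pwrite1 case, here is a rough sketch of the kind of loop it measures: each worker process repeatedly rewrites the same small region at offset 0 of its own temporary file and counts the completed writes. This is an illustration in Python, not the will-it-scale source (which is written in C). The 4096-byte buffer, the name `run_pwrite_loop`, and the fixed iteration count are assumptions made for this example.

```python
import os
import tempfile

BUF_LEN = 4096  # assumed buffer size for this sketch


def run_pwrite_loop(iterations: int) -> int:
    """Overwrite offset 0 of an unlinked temp file `iterations` times.

    Returns the number of completed writes, which plays the role of the
    per-process operation count in the report.
    """
    buf = bytes(BUF_LEN)
    fd, path = tempfile.mkstemp(prefix="willitscale.")
    os.unlink(path)  # the file now lives only as long as the descriptor
    done = 0
    try:
        for _ in range(iterations):
            written = os.pwrite(fd, buf, 0)  # always write at offset 0
            if written != BUF_LEN:
                raise OSError(f"short write: {written} bytes")
            done += 1
    finally:
        os.close(fd)
    return done


if __name__ == "__main__":
    print(run_pwrite_loop(10_000))
```

Every iteration is one pwrite system call, so the per-call cost of the kernel write path (ksys_pwrite64, vfs_write, generic_perform_write and the user-copy helpers that appear in the profile below) dominates the measured rate.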
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/pwrite1/will-it-scale/0x42e
commit:
27bba9c532 ("Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")
9bd0e337c6 ("iov_iter: Switch to using a table of operations")
27bba9c532a8d210 9bd0e337c633aed3e8ec3c7397b
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
28443113 -4.8% 27064036 will-it-scale.24.processes
1185129 -4.8% 1127667 will-it-scale.per_process_ops
28443113 -4.8% 27064036 will-it-scale.workload
13.84 +1.0% 13.98 boot-time.dhcp
0.00 ± 9% -13.5% 0.00 ± 3% sched_debug.cpu.next_balance.stddev
1251 ± 9% -17.2% 1035 ± 10% slabinfo.dmaengine-unmap-16.active_objs
1251 ± 9% -17.2% 1035 ± 10% slabinfo.dmaengine-unmap-16.num_objs
24623 ± 5% -18.0% 20184 ± 15% softirqs.CPU0.RCU
28877 ± 10% -30.6% 20051 ± 15% softirqs.CPU19.RCU
5693 ± 31% +402.3% 28595 ± 22% softirqs.CPU19.SCHED
21142 ± 15% -26.5% 15533 ± 11% softirqs.CPU27.RCU
20776 ± 38% -50.5% 10290 ± 58% softirqs.CPU3.SCHED
26618 ± 11% -35.3% 17214 ± 6% softirqs.CPU37.RCU
10894 ± 48% +175.5% 30012 ± 34% softirqs.CPU37.SCHED
17015 ± 4% +39.2% 23681 ± 7% softirqs.CPU43.RCU
411.75 ± 58% +76.8% 728.00 ± 32% numa-vmstat.node0.nr_active_anon
34304 ± 2% -35.6% 22103 ± 48% numa-vmstat.node0.nr_anon_pages
36087 ± 2% -31.0% 24915 ± 43% numa-vmstat.node0.nr_inactive_anon
2233 ± 51% +60.4% 3582 ± 7% numa-vmstat.node0.nr_shmem
411.75 ± 58% +76.8% 728.00 ± 32% numa-vmstat.node0.nr_zone_active_anon
36087 ± 2% -31.0% 24915 ± 43% numa-vmstat.node0.nr_zone_inactive_anon
24265 ± 3% +51.3% 36707 ± 29% numa-vmstat.node1.nr_anon_pages
25441 ± 2% +44.9% 36858 ± 29% numa-vmstat.node1.nr_inactive_anon
537.25 ± 20% +22.8% 659.50 ± 10% numa-vmstat.node1.nr_page_table_pages
25441 ± 2% +44.9% 36858 ± 29% numa-vmstat.node1.nr_zone_inactive_anon
1649 ± 58% +76.7% 2913 ± 32% numa-meminfo.node0.Active
1649 ± 58% +76.7% 2913 ± 32% numa-meminfo.node0.Active(anon)
137223 ± 2% -35.6% 88410 ± 48% numa-meminfo.node0.AnonPages
164997 ± 9% -28.4% 118095 ± 42% numa-meminfo.node0.AnonPages.max
144353 ± 2% -31.0% 99656 ± 43% numa-meminfo.node0.Inactive
144353 ± 2% -31.0% 99656 ± 43% numa-meminfo.node0.Inactive(anon)
8937 ± 51% +60.3% 14328 ± 7% numa-meminfo.node0.Shmem
97072 ± 3% +51.3% 146858 ± 29% numa-meminfo.node1.AnonPages
127410 ± 5% +43.2% 182468 ± 16% numa-meminfo.node1.AnonPages.max
101822 ± 2% +44.9% 147521 ± 29% numa-meminfo.node1.Inactive
101822 ± 2% +44.9% 147521 ± 29% numa-meminfo.node1.Inactive(anon)
2148 ± 20% +22.9% 2639 ± 10% numa-meminfo.node1.PageTables
3431 ± 89% -85.1% 512.25 ±109% interrupts.38:PCI-MSI.2621444-edge.eth0-TxRx-3
348.50 ± 62% +152.7% 880.75 ± 27% interrupts.40:PCI-MSI.2621446-edge.eth0-TxRx-5
1697 ± 63% -53.1% 796.75 ± 13% interrupts.CPU13.CAL:Function_call_interrupts
89.75 ± 36% +220.3% 287.50 ± 20% interrupts.CPU13.RES:Rescheduling_interrupts
745.75 ± 3% +104.6% 1526 ± 69% interrupts.CPU19.CAL:Function_call_interrupts
293.00 ± 5% -60.0% 117.25 ± 47% interrupts.CPU19.RES:Rescheduling_interrupts
778.50 ± 9% +123.7% 1741 ± 64% interrupts.CPU22.CAL:Function_call_interrupts
6450 ± 29% -38.0% 4000 ± 4% interrupts.CPU24.NMI:Non-maskable_interrupts
6450 ± 29% -38.0% 4000 ± 4% interrupts.CPU24.PMI:Performance_monitoring_interrupts
2012 ± 56% -57.6% 852.75 ± 6% interrupts.CPU26.CAL:Function_call_interrupts
184.25 ± 37% -47.9% 96.00 ± 49% interrupts.CPU27.RES:Rescheduling_interrupts
0.50 ±100% +64250.0% 321.75 ±170% interrupts.CPU28.TLB:TLB_shootdowns
3431 ± 89% -85.1% 512.25 ±109% interrupts.CPU29.38:PCI-MSI.2621444-edge.eth0-TxRx-3
348.50 ± 62% +152.7% 880.75 ± 27% interrupts.CPU31.40:PCI-MSI.2621446-edge.eth0-TxRx-5
156.50 ± 51% -51.3% 76.25 ± 59% interrupts.CPU33.RES:Rescheduling_interrupts
883.50 ± 18% -23.8% 673.25 ± 22% interrupts.CPU36.CAL:Function_call_interrupts
7492 ± 13% -45.6% 4073 ± 63% interrupts.CPU37.NMI:Non-maskable_interrupts
7492 ± 13% -45.6% 4073 ± 63% interrupts.CPU37.PMI:Performance_monitoring_interrupts
250.50 ± 19% -52.5% 119.00 ± 50% interrupts.CPU37.RES:Rescheduling_interrupts
4688 ± 27% +63.5% 7667 ± 15% interrupts.CPU40.NMI:Non-maskable_interrupts
4688 ± 27% +63.5% 7667 ± 15% interrupts.CPU40.PMI:Performance_monitoring_interrupts
96.75 ± 92% +135.1% 227.50 ± 22% interrupts.CPU43.RES:Rescheduling_interrupts
2932 ± 36% +73.4% 5084 ± 21% interrupts.CPU47.NMI:Non-maskable_interrupts
2932 ± 36% +73.4% 5084 ± 21% interrupts.CPU47.PMI:Performance_monitoring_interrupts
57.50 ± 78% +250.4% 201.50 ± 42% interrupts.CPU47.RES:Rescheduling_interrupts
4207 ± 61% +86.0% 7827 ± 11% interrupts.CPU8.NMI:Non-maskable_interrupts
4207 ± 61% +86.0% 7827 ± 11% interrupts.CPU8.PMI:Performance_monitoring_interrupts
1.089e+10 -2.3% 1.064e+10 perf-stat.i.branch-instructions
1.62 +0.7 2.34 perf-stat.i.branch-miss-rate%
1.741e+08 +42.3% 2.476e+08 perf-stat.i.branch-misses
1.36 +3.3% 1.41 perf-stat.i.cpi
1.233e+08 ± 3% -7.1% 1.146e+08 perf-stat.i.dTLB-load-misses
2.38e+10 -3.3% 2.302e+10 perf-stat.i.dTLB-loads
57501510 -4.9% 54711717 perf-stat.i.dTLB-store-misses
1.828e+10 -3.7% 1.761e+10 perf-stat.i.dTLB-stores
98.97 -2.9 96.02 ± 2% perf-stat.i.iTLB-load-miss-rate%
29795797 ± 4% -5.0% 28320171 perf-stat.i.iTLB-load-misses
299268 ± 2% +298.1% 1191476 ± 50% perf-stat.i.iTLB-loads
5.335e+10 -3.7% 5.138e+10 perf-stat.i.instructions
0.74 -3.7% 0.71 perf-stat.i.ipc
0.20 ± 8% +12.1% 0.23 perf-stat.i.major-faults
1104 -3.2% 1069 perf-stat.i.metric.M/sec
72308 +2.3% 73975 ± 2% perf-stat.i.node-stores
0.10 +7.9% 0.11 ± 8% perf-stat.overall.MPKI
1.60 +0.7 2.33 perf-stat.overall.branch-miss-rate%
1.35 +4.1% 1.41 perf-stat.overall.cpi
99.00 -3.0 95.98 ± 2% perf-stat.overall.iTLB-load-miss-rate%
0.74 -3.9% 0.71 perf-stat.overall.ipc
1.085e+10 -2.3% 1.06e+10 perf-stat.ps.branch-instructions
1.735e+08 +42.3% 2.468e+08 perf-stat.ps.branch-misses
1.229e+08 ± 3% -7.1% 1.142e+08 perf-stat.ps.dTLB-load-misses
2.372e+10 -3.3% 2.294e+10 perf-stat.ps.dTLB-loads
57306258 -4.9% 54525679 perf-stat.ps.dTLB-store-misses
1.822e+10 -3.7% 1.755e+10 perf-stat.ps.dTLB-stores
29695158 ± 4% -5.0% 28224049 perf-stat.ps.iTLB-load-misses
298257 ± 2% +298.1% 1187498 ± 50% perf-stat.ps.iTLB-loads
5.317e+10 -3.7% 5.12e+10 perf-stat.ps.instructions
0.20 ± 7% +12.0% 0.23 ± 2% perf-stat.ps.major-faults
1.613e+13 -3.9% 1.55e+13 perf-stat.total.instructions
8.00 ± 14% -8.0 0.00 perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
7.38 ± 14% -7.4 0.00 perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
7.27 ± 14% -7.3 0.00 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
0.69 ± 14% -0.4 0.29 ±100% perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.new_sync_write.vfs_write.ksys_pwrite64
0.62 ± 15% -0.3 0.30 ±101% perf-profile.calltrace.cycles-pp.unlock_page.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
0.85 ± 8% -0.2 0.66 ± 15% perf-profile.calltrace.cycles-pp.__fget_light.ksys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
0.91 ± 11% -0.1 0.79 ± 12% perf-profile.calltrace.cycles-pp.file_update_time.__generic_file_write_iter.generic_file_write_iter.new_sync_write.vfs_write
0.00 +1.0 1.01 ± 13% perf-profile.calltrace.cycles-pp.__get_user_nocheck_1.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
0.00 +1.4 1.42 ± 12% perf-profile.calltrace.cycles-pp.xxx_advance.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
0.00 +2.1 2.15 ± 13% perf-profile.calltrace.cycles-pp.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
0.00 +6.8 6.82 ± 13% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
0.00 +6.9 6.92 ± 13% perf-profile.calltrace.cycles-pp.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
0.00 +8.1 8.09 ± 14% perf-profile.calltrace.cycles-pp.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
8.03 ± 14% -8.0 0.00 perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
0.85 ± 8% -0.2 0.66 ± 15% perf-profile.children.cycles-pp.__fget_light
0.69 ± 14% -0.2 0.52 ± 15% perf-profile.children.cycles-pp.up_write
0.62 ± 13% -0.2 0.46 ± 14% perf-profile.children.cycles-pp.apparmor_file_permission
0.94 ± 11% -0.1 0.82 ± 13% perf-profile.children.cycles-pp.file_update_time
0.51 ± 12% -0.1 0.40 ± 14% perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited
0.55 ± 12% -0.1 0.47 ± 12% perf-profile.children.cycles-pp.current_time
0.62 ± 14% -0.1 0.55 ± 13% perf-profile.children.cycles-pp.unlock_page
0.24 ± 13% -0.0 0.20 ± 16% perf-profile.children.cycles-pp.timestamp_truncate
0.18 ± 11% -0.0 0.14 ± 15% perf-profile.children.cycles-pp.file_remove_privs
0.55 ± 14% +0.3 0.87 ± 15% perf-profile.children.cycles-pp.__x86_retpoline_rax
0.00 +1.4 1.42 ± 12% perf-profile.children.cycles-pp.xxx_advance
0.00 +2.2 2.22 ± 13% perf-profile.children.cycles-pp.xxx_fault_in_readable
0.00 +8.1 8.12 ± 14% perf-profile.children.cycles-pp.xxx_copy_from_user_atomic
1.02 ± 16% -0.2 0.82 ± 12% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.82 ± 8% -0.2 0.63 ± 15% perf-profile.self.cycles-pp.__fget_light
0.66 ± 14% -0.2 0.49 ± 15% perf-profile.self.cycles-pp.up_write
0.54 ± 15% -0.2 0.39 ± 14% perf-profile.self.cycles-pp.apparmor_file_permission
0.59 ± 13% -0.1 0.46 ± 13% perf-profile.self.cycles-pp.ksys_pwrite64
0.50 ± 12% -0.1 0.40 ± 13% perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited
0.24 ± 15% -0.0 0.19 ± 15% perf-profile.self.cycles-pp.timestamp_truncate
0.20 ± 13% -0.0 0.17 ± 12% perf-profile.self.cycles-pp.current_time
0.12 ± 14% +0.1 0.19 ± 14% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.43 ± 14% +0.3 0.68 ± 15% perf-profile.self.cycles-pp.__x86_retpoline_rax
0.00 +1.1 1.14 ± 15% perf-profile.self.cycles-pp.xxx_copy_from_user_atomic
0.00 +1.2 1.21 ± 12% perf-profile.self.cycles-pp.xxx_fault_in_readable
0.00 +1.3 1.28 ± 12% perf-profile.self.cycles-pp.xxx_advance
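
The comparison rows above share one layout: base value, optional ± %stddev, change, patched value, optional ± %stddev, metric name. As a sanity check, the sketch below parses rows in that layout and recomputes the relative change for the headline metrics from the two values. The helper names (`parse_row`, `pct_change`) and the regular expression are illustrative assumptions, not part of lkp-tests, and the pattern only handles rows whose change column is a percentage (not the absolute deltas used in the rate and perf-profile rows).

```python
import re

# One comparison row: base [± sd%] change% new [± sd%] metric
ROW = re.compile(
    r"^\s*(?P<base>[\d.e+]+)"
    r"(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-][\d.]+)%"
    r"\s+(?P<new>[\d.e+]+)"
    r"(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def _opt_int(text):
    """Convert an optional captured group to int, keeping None as None."""
    return int(text) if text is not None else None


def parse_row(line: str) -> dict:
    """Split one percentage-change row into its fields."""
    m = ROW.match(line)
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    return {
        "base": float(m["base"]),
        "base_stddev_pct": _opt_int(m["base_sd"]),
        "change_pct": float(m["change"]),
        "new": float(m["new"]),
        "new_stddev_pct": _opt_int(m["new_sd"]),
        "metric": m["metric"],
    }


def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    headline = [
        "28443113 -4.8% 27064036 will-it-scale.24.processes",
        "1185129 -4.8% 1127667 will-it-scale.per_process_ops",
        "28443113 -4.8% 27064036 will-it-scale.workload",
    ]
    for line in headline:
        row = parse_row(line)
        recomputed = pct_change(row["base"], row["new"])
        print(f'{row["metric"]}: reported {row["change_pct"]}%, '
              f'recomputed {recomputed:.2f}%')
```

For the three headline rows the recomputed value rounds to the reported -4.8%. Rows built from averaged or rounded values may not reproduce the printed percentage exactly, so a tolerance is advisable when checking the rest of the table this way.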
[ASCII plots omitted: will-it-scale.24.processes (y-axis 2.68e+07 to 2.88e+07), will-it-scale.per_process_ops (y-axis 1.12e+06 to 1.2e+06), will-it-scale.workload (y-axis 2.68e+07 to 2.88e+07)]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.10.0-rc4-00369-g9bd0e337c633" of type "text/plain" (170131 bytes)
View attachment "job-script" of type "text/plain" (8009 bytes)
View attachment "job.yaml" of type "text/plain" (5479 bytes)
View attachment "reproduce" of type "text/plain" (339 bytes)