Message-ID: <20210908054503.GB839@xsang-OptiPlex-9020>
Date: Wed, 8 Sep 2021 13:45:03 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Alex Shi <alex.shi@...ux.alibaba.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Alexander Duyck <alexander.duyck@...il.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
"Chen, Rong A" <rong.a.chen@...el.com>,
Daniel Jordan <daniel.m.jordan@...cle.com>,
"Huang, Ying" <ying.huang@...el.com>, Jann Horn <jannh@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Michal Hocko <mhocko@...nel.org>,
Michal Hocko <mhocko@...e.com>,
Mika Penttilä <mika.penttila@...tfour.com>,
Minchan Kim <minchan@...nel.org>,
Shakeel Butt <shakeelb@...gle.com>, Tejun Heo <tj@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Vlastimil Babka <vbabka@...e.cz>,
Wei Yang <richard.weiyang@...il.com>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, feng.tang@...el.com, zhengjun.xing@...ux.intel.com
Subject: [mm/lru] 75cc3c9161: fio.read_iops -4.7% regression
Greetings,
FYI, we noticed a -4.7% regression of fio.read_iops due to commit:
commit: 75cc3c9161cd95f43ebf6c6a938d4d98ab195bbd ("mm/lru: move lock into lru_note_cost")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: ext4
runtime: 200s
nr_task: 50%
time_based: tb
rw: randread
bs: 4k
ioengine: mmap
test_size: 200G
cpufreq_governor: performance
ucode: 0x5003006
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
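For readers who want to approximate the workload without the LKP harness, the parameters above map roughly onto a fio job file. The sketch below renders one from those parameters. It is not the job LKP actually generates (that is in the attached job.yaml): the helper name `render_fio_job`, the job name `randread-mmap`, the mount point `/mnt/pmem0`, the reading of `nr_task: 50%` as half of the 96 hardware threads, and the even split of `test_size` across jobs are all assumptions made for illustration.

```python
# Parameters copied from the report; everything derived from them is an assumption.
PARAMS = {
    "fs": "ext4",
    "runtime": "200s",
    "nr_task": "50%",
    "time_based": "tb",
    "rw": "randread",
    "bs": "4k",
    "ioengine": "mmap",
    "test_size": "200G",
}


def render_fio_job(params, nr_cpu, directory):
    """Render a fio job file (INI text) from LKP-style parameters."""
    # Assumption: nr_task "50%" means half of the machine's hardware threads.
    numjobs = nr_cpu * int(params["nr_task"].rstrip("%")) // 100
    # Assumption: the total test_size is split evenly across the jobs.
    total_bytes = int(params["test_size"].rstrip("G")) * 1024**3
    lines = [
        "[global]",
        f"ioengine={params['ioengine']}",
        f"rw={params['rw']}",
        f"bs={params['bs']}",
        f"runtime={params['runtime']}",
        f"directory={directory}",  # hypothetical mount point of the pmem-backed ext4
        f"numjobs={numjobs}",
        f"size={total_bytes // numjobs}",
        "group_reporting",
    ]
    if params["time_based"] == "tb":
        lines.append("time_based")
    # Hypothetical job name; fio needs at least one job section.
    lines += ["", "[randread-mmap]"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_fio_job(PARAMS, nr_cpu=96, directory="/mnt/pmem0"))
```

The printed text can be saved to a file and passed to fio, after pointing `directory` at a real ext4 mount. Results from such a hand-written job are only indicative; use the lkp-tests steps below for an exact reproduction.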
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
4k/gcc-9/performance/2pmem/ext4/mmap/x86_64-rhel-8.3/50%/debian-10.4-x86_64-20200603.cgz/200s/randread/lkp-csl-2sp6/200G/fio-basic/tb/0x5003006
commit:
c7c7b80c39 ("mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn")
75cc3c9161 ("mm/lru: move lock into lru_note_cost")
c7c7b80c39a18d99 75cc3c9161cd95f43ebf6c6a938
---------------- ---------------------------
value ± %stddev   %change   value ± %stddev   metric
0.06 +0.0 0.06 fio.latency_20ms%
0.17 ± 6% -0.1 0.10 ± 12% fio.latency_250us%
2.42 ± 5% +0.3 2.73 ± 4% fio.latency_50us%
10762 -4.7% 10251 fio.read_bw_MBps
15928 +5.4% 16792 fio.read_clat_mean_us
620449 ± 4% +15.2% 714702 ± 5% fio.read_clat_stddev
2755207 -4.7% 2624496 fio.read_iops
4.356e+09 -4.7% 4.15e+09 fio.time.file_system_inputs
548995 -13.6% 474105 fio.time.involuntary_context_switches
5.445e+08 -4.7% 5.188e+08 fio.time.major_page_faults
5.512e+08 -4.7% 5.252e+08 fio.workload
2.60 -4.2% 2.50 iostat.cpu.user
993.70 ± 5% -9.6% 898.57 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max
148.41 -1.2% 146.66 turbostat.RAMWatt
213.43 ± 3% -34.7% 139.43 ± 5% numa-vmstat.node0.nr_isolated_file
210.57 ± 5% -33.0% 141.14 ± 5% numa-vmstat.node1.nr_isolated_file
10692349 -4.9% 10171517 vmstat.io.bi
7671 -9.8% 6917 vmstat.system.cs
42.20 ± 3% +7.6% 45.42 ± 3% perf-sched.total_wait_and_delay.average.ms
42.18 ± 3% +7.6% 45.40 ± 3% perf-sched.total_wait_time.average.ms
10233 ± 2% -17.2% 8477 ± 3% perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_slowpath
459.86 ± 7% +36.0% 625.57 ± 14% perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.down_read
20707 ± 4% -9.2% 18791 ± 5% perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.01 ± 8% +11171.4% 1.24 ±178% perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
0.02 ± 71% +1.7e+05% 36.36 ±177% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
46297815 -1.3% 45718611 interrupts.CAL:Function_call_interrupts
536.29 ± 11% -18.9% 434.71 ± 12% interrupts.CPU13.RES:Rescheduling_interrupts
562.71 ± 13% -18.5% 458.43 ± 7% interrupts.CPU16.RES:Rescheduling_interrupts
757634 ± 9% -13.6% 654501 ± 6% interrupts.CPU16.TLB:TLB_shootdowns
536.00 ± 13% -18.3% 438.00 ± 5% interrupts.CPU17.RES:Rescheduling_interrupts
550.57 ± 9% -21.3% 433.57 ± 8% interrupts.CPU18.RES:Rescheduling_interrupts
4251 ± 18% +60.7% 6833 ± 9% interrupts.CPU25.NMI:Non-maskable_interrupts
4251 ± 18% +60.7% 6833 ± 9% interrupts.CPU25.PMI:Performance_monitoring_interrupts
506.29 ± 11% -21.9% 395.57 ± 12% interrupts.CPU35.RES:Rescheduling_interrupts
772187 ± 11% -17.0% 640700 ± 9% interrupts.CPU35.TLB:TLB_shootdowns
752779 ± 10% -23.8% 573337 ± 19% interrupts.CPU37.TLB:TLB_shootdowns
374466 -4.8% 356349 proc-vmstat.allocstall_movable
8293 ± 2% -6.9% 7723 ± 2% proc-vmstat.kswapd_low_wmark_hit_quickly
426.29 -34.5% 279.14 ± 4% proc-vmstat.nr_isolated_file
4.288e+08 -5.8% 4.039e+08 proc-vmstat.numa_hit
4.287e+08 -5.8% 4.038e+08 proc-vmstat.numa_local
8297 ± 2% -6.9% 7727 ± 2% proc-vmstat.pageoutrun
20856484 -4.5% 19927281 proc-vmstat.pgalloc_dma32
5.25e+08 -4.7% 5e+08 proc-vmstat.pgalloc_normal
1.09e+09 -4.7% 1.038e+09 proc-vmstat.pgfault
5.355e+08 -4.8% 5.097e+08 proc-vmstat.pgfree
5.445e+08 -4.7% 5.187e+08 proc-vmstat.pgmajfault
2.178e+09 -4.7% 2.075e+09 proc-vmstat.pgpgin
9.606e+08 -5.0% 9.122e+08 proc-vmstat.pgscan_direct
1.079e+09 -4.7% 1.028e+09 proc-vmstat.pgscan_file
4.938e+08 -4.8% 4.698e+08 proc-vmstat.pgsteal_direct
5.345e+08 -4.8% 5.087e+08 proc-vmstat.pgsteal_file
40747069 ± 2% -4.6% 38881007 proc-vmstat.pgsteal_kswapd
33706519 -4.6% 32144491 proc-vmstat.workingset_refault_file
22.69 ± 9% -11.5 11.23 ± 9% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node
21.89 ± 9% -10.7 11.20 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
11.69 ± 10% -2.6 9.05 ± 9% perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
4.06 ± 12% -1.8 2.27 ± 9% perf-profile.calltrace.cycles-pp.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node
4.06 ± 12% -1.8 2.27 ± 9% perf-profile.calltrace.cycles-pp.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_lruvec
4.06 ± 12% -1.8 2.27 ± 9% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list.shrink_inactive_list
3.96 ± 12% -1.7 2.22 ± 10% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list
0.00 +12.5 12.46 ± 8% perf-profile.calltrace.cycles-pp.lru_note_cost.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
0.00 +12.5 12.50 ± 10% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.lru_note_cost.shrink_inactive_list.shrink_lruvec
0.00 +12.6 12.56 ± 10% perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.lru_note_cost.shrink_inactive_list.shrink_lruvec.shrink_node
12.95 ± 10% -2.9 10.09 ± 9% perf-profile.children.cycles-pp.shrink_page_list
4.34 ± 12% -1.9 2.46 ± 9% perf-profile.children.cycles-pp.try_to_unmap_flush
4.34 ± 12% -1.9 2.46 ± 9% perf-profile.children.cycles-pp.arch_tlbbatch_flush
4.34 ± 12% -1.9 2.46 ± 9% perf-profile.children.cycles-pp.on_each_cpu_cond_mask
4.24 ± 12% -1.8 2.41 ± 10% perf-profile.children.cycles-pp.smp_call_function_many_cond
2.77 ± 10% -0.3 2.44 ± 10% perf-profile.children.cycles-pp.page_referenced
2.09 ± 10% -0.2 1.84 ± 10% perf-profile.children.cycles-pp.page_referenced_one
1.98 ± 10% -0.2 1.75 ± 10% perf-profile.children.cycles-pp.page_vma_mapped_walk
0.98 ± 10% -0.1 0.85 ± 10% perf-profile.children.cycles-pp.isolate_lru_pages
0.37 ± 11% -0.0 0.32 ± 9% perf-profile.children.cycles-pp.sync_regs
0.09 ± 15% -0.0 0.05 ± 6% perf-profile.children.cycles-pp.smp_call_function_single
0.06 ± 10% +13.0 13.02 ± 8% perf-profile.children.cycles-pp.lru_note_cost
4.09 ± 12% -1.8 2.28 ± 10% perf-profile.self.cycles-pp.smp_call_function_many_cond
3.96 ± 11% -0.5 3.42 ± 10% perf-profile.self.cycles-pp.filemap_map_pages
1.58 ± 10% -0.2 1.40 ± 10% perf-profile.self.cycles-pp.page_vma_mapped_walk
0.30 ± 11% -0.1 0.23 ± 16% perf-profile.self.cycles-pp.__remove_mapping
0.09 ± 18% -0.0 0.04 ± 40% perf-profile.self.cycles-pp.smp_call_function_single
0.36 ± 10% -0.0 0.32 ± 9% perf-profile.self.cycles-pp.sync_regs
0.16 ± 10% -0.0 0.13 ± 14% perf-profile.self.cycles-pp.move_pages_to_lru
0.06 ± 10% +0.0 0.08 ± 10% perf-profile.self.cycles-pp.lru_note_cost
0.12 ± 12% +0.1 0.17 ± 14% perf-profile.self.cycles-pp._raw_spin_lock_irq
1.311e+10 -3.7% 1.262e+10 perf-stat.i.branch-instructions
1.148e+08 -4.1% 1.101e+08 perf-stat.i.branch-misses
4.887e+08 -5.1% 4.637e+08 perf-stat.i.cache-misses
6.388e+08 -4.3% 6.115e+08 perf-stat.i.cache-references
7634 -10.1% 6867 perf-stat.i.context-switches
2.24 +4.1% 2.33 perf-stat.i.cpi
329.58 +5.5% 347.60 perf-stat.i.cycles-between-cache-misses
1.598e+10 -3.8% 1.536e+10 perf-stat.i.dTLB-loads
8.453e+09 -4.6% 8.062e+09 perf-stat.i.dTLB-stores
2862085 -3.2% 2770116 perf-stat.i.iTLB-loads
6.452e+10 -3.9% 6.201e+10 perf-stat.i.instructions
0.46 -3.9% 0.45 perf-stat.i.ipc
2707617 -4.7% 2579975 perf-stat.i.major-faults
398.00 -4.0% 381.98 perf-stat.i.metric.M/sec
70003070 ± 2% -5.1% 66402385 ± 2% perf-stat.i.node-stores
2711171 -4.7% 2583543 perf-stat.i.page-faults
2.15 +4.1% 2.24 perf-stat.overall.cpi
283.80 +5.4% 299.24 perf-stat.overall.cycles-between-cache-misses
0.47 -3.9% 0.45 perf-stat.overall.ipc
23424 +1.1% 23674 perf-stat.overall.path-length
1.304e+10 -3.7% 1.256e+10 perf-stat.ps.branch-instructions
1.142e+08 -4.1% 1.095e+08 perf-stat.ps.branch-misses
4.864e+08 -5.2% 4.613e+08 perf-stat.ps.cache-misses
6.357e+08 -4.3% 6.083e+08 perf-stat.ps.cache-references
7594 -10.0% 6832 perf-stat.ps.context-switches
1.59e+10 -3.9% 1.528e+10 perf-stat.ps.dTLB-loads
8.412e+09 -4.7% 8.02e+09 perf-stat.ps.dTLB-stores
2847228 -3.2% 2755506 perf-stat.ps.iTLB-loads
6.42e+10 -3.9% 6.169e+10 perf-stat.ps.instructions
2694573 -4.8% 2566453 perf-stat.ps.major-faults
69668649 ± 2% -5.2% 66048410 ± 2% perf-stat.ps.node-stores
2698105 -4.7% 2570000 perf-stat.ps.page-faults
1.291e+13 -3.7% 1.243e+13 perf-stat.total.instructions
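The headline figures in the table can be checked by hand. The sketch below recomputes the relative change for three fio rows, then adds up the `native_queued_spin_lock_slowpath` call-trace percentages under `shrink_inactive_list` and under `lru_note_cost` for each commit. All numbers are copied from the rows above; the helper name `pct_change` and the dictionary layout are illustrative only.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (base commit c7c7b80c39, tested commit 75cc3c9161), copied from the table.
FIO_ROWS = {
    "fio.read_iops": (2755207, 2624496),
    "fio.read_bw_MBps": (10762, 10251),
    "fio.read_clat_mean_us": (15928, 16792),
}

# perf-profile.calltrace.cycles-pp for native_queued_spin_lock_slowpath,
# keyed by the caller of _raw_spin_lock_irq: (base, tested), in percent of cycles.
LOCK_SLOWPATH = {
    "shrink_inactive_list": (22.69, 11.23),
    "lru_note_cost": (0.00, 12.50),
}

if __name__ == "__main__":
    for name, (base, new) in FIO_ROWS.items():
        print(f"{name}: {pct_change(base, new):+.1f}%")

    base_total = sum(base for base, _ in LOCK_SLOWPATH.values())
    new_total = sum(new for _, new in LOCK_SLOWPATH.values())
    print(f"lock slowpath cycles: {base_total:.2f}% -> {new_total:.2f}%")
```

This reproduces the -4.7%, -4.7% and +5.4% changes reported for the three fio metrics. The lock-slowpath total moves from 22.69% to 23.73% of cycles, which suggests the contention did not go away but was split between the two call sites. Given the ± 9-10% run-to-run deviation on those rows, the roughly one-point increase in the total should be read with caution.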
[ASCII trend plots omitted: fio.read_bw_MBps, fio.read_iops, fio.workload, fio.time.major_page_faults, fio.time.file_system_inputs. In each plot the bisect-bad samples (O) lie below the bisect-good samples.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure
Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/lkp@lists.01.org
Thanks,
Oliver Sang
View attachment "config-5.10.0-03419-g75cc3c9161cd" of type "text/plain" (171273 bytes)
View attachment "job-script" of type "text/plain" (8573 bytes)
View attachment "job.yaml" of type "text/plain" (5858 bytes)
View attachment "reproduce" of type "text/plain" (915 bytes)