Message-ID: <20210318150245.GD17012@xsang-OptiPlex-9020>
Date: Thu, 18 Mar 2021 23:02:46 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Xu <peterx@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Alexey Dobriyan <adobriyan@...il.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Christoph Hellwig <hch@....de>,
Daniel Vetter <daniel@...ll.ch>,
David Airlie <airlied@...ux.ie>,
David Gibson <david@...son.dropbear.id.au>,
Gal Pressman <galpress@...zon.com>, Jan Kara <jack@...e.cz>,
Jann Horn <jannh@...gle.com>,
Kirill Shutemov <kirill@...temov.name>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Matthew Wilcox <willy@...radead.org>,
Miaohe Lin <linmiaohe@...wei.com>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Roland Scheidegger <sroland@...are.com>,
VMware Graphics <linux-graphics-maintainer@...are.com>,
Wei Zhang <wzam@...zon.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [hugetlb] 4eae4efa2c: vm-scalability.throughput 1.1% improvement
Greetings,
FYI, we noticed a 1.1% improvement in vm-scalability.throughput due to commit:
commit: 4eae4efa2c299f85b7ebfbeeda56c19c5eba2768 ("hugetlb: do early cow when page pinned on src mm")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
runtime: 300s
size: 8T
test: anon-cow-seq-hugetlb
cpufreq_governor: performance
ucode: 0x5003006
test-description: The motivation behind this suite is to exercise the functions and regions of the Linux kernel's mm/ subsystem that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5003006
commit:
ca6eb14d64 ("mm: use is_cow_mapping() across tree where proper")
4eae4efa2c ("hugetlb: do early cow when page pinned on src mm")
ca6eb14d6453bea8 4eae4efa2c299f85b7ebfbeeda5
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.93 ± 9% +0.1 1.06 ± 5% vm-scalability.stddev%
36639545 +1.1% 37036215 vm-scalability.throughput
90908 ± 4% -4.8% 86547 vm-scalability.time.involuntary_context_switches
12834 -1.2% 12676 vm-scalability.time.system_time
7026 ± 36% +119.0% 15385 ± 62% softirqs.NET_RX
68.53 ± 18% -31.0% 47.25 ± 14% sched_debug.cpu.nr_uninterruptible.max
-28.19 +23.9% -34.94 sched_debug.cpu.nr_uninterruptible.min
3928701 -1.0% 3888371 proc-vmstat.htlb_buddy_alloc_success
2.013e+09 -1.0% 1.992e+09 proc-vmstat.pgalloc_normal
2.011e+09 -0.9% 1.992e+09 proc-vmstat.pgfree
3069 ± 18% -12.2% 2694 ± 4% interrupts.CPU32.CAL:Function_call_interrupts
264.17 ± 41% -33.6% 175.33 ± 8% interrupts.CPU74.RES:Rescheduling_interrupts
264.50 ± 22% -33.1% 177.00 ± 18% interrupts.CPU77.RES:Rescheduling_interrupts
246.00 ± 28% -33.6% 163.33 ± 16% interrupts.CPU82.RES:Rescheduling_interrupts
219.67 ± 20% -24.4% 166.17 ± 14% interrupts.CPU90.RES:Rescheduling_interrupts
1.937e+11 -1.0% 1.917e+11 perf-stat.i.cpu-cycles
1.045e+08 -2.2% 1.022e+08 perf-stat.i.node-load-misses
0.92 ± 17% +0.6 1.54 ± 10% perf-stat.i.node-store-miss-rate%
409065 ± 16% +88.8% 772315 ± 11% perf-stat.i.node-store-misses
0.69 ± 18% +0.6 1.30 ± 9% perf-stat.overall.node-store-miss-rate%
1.045e+08 -2.2% 1.023e+08 perf-stat.ps.node-load-misses
408265 ± 17% +87.9% 767022 ± 11% perf-stat.ps.node-store-misses
0.46 ± 39% -56.4% 0.20 ± 60% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
2578 ± 3% -19.8% 2067 ± 10% perf-sched.wait_and_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2582 ± 3% -20.0% 2067 ± 10% perf-sched.wait_and_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
2412 ± 3% -19.9% 1931 ± 11% perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
2568 ± 3% -19.9% 2057 ± 10% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2362 ± 4% -23.6% 1804 ± 12% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2667 ± 3% -18.0% 2187 ± 9% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
2.80 ±186% -95.0% 0.14 ± 79% perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
2578 ± 3% -19.8% 2067 ± 10% perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
2580 ± 3% -19.9% 2067 ± 10% perf-sched.wait_time.max.ms.do_syslog.part.0.kmsg_read.vfs_read
2412 ± 3% -19.9% 1931 ± 11% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
2568 ± 3% -19.9% 2057 ± 10% perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
2362 ± 4% -23.6% 1804 ± 12% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
2667 ± 3% -18.0% 2187 ± 9% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
176.12 ±209% -96.6% 5.98 ± 64% perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
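For readers checking the %change column, here is a minimal sketch of how the figures appear to be derived. The formula is an assumption inferred from the printed numbers, not taken from the lkp-tests source. Ordinary metrics look like a relative change between the two per-commit means. Metrics whose names end in "%" look like an absolute difference in percentage points. The helper names are hypothetical, and because the table prints rounded means, some rows may not reproduce exactly.

```python
# Sketch only: formulas inferred from the table above, not from lkp-tests itself.

def pct_change(base: float, new: float) -> float:
    """Relative change from the parent-commit mean to the tested-commit mean, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Absolute difference for metrics already expressed in percent (names ending in '%')."""
    return new_pct - base_pct


# Values copied from the comparison table (ca6eb14d64 vs 4eae4efa2c).
throughput = pct_change(36639545, 37036215)   # vm-scalability.throughput
store_misses = pct_change(409065, 772315)     # perf-stat.i.node-store-misses
miss_rate = point_change(0.92, 1.54)          # perf-stat.i.node-store-miss-rate%

print(f"vm-scalability.throughput        {throughput:+.1f}%")
print(f"perf-stat.i.node-store-misses    {store_misses:+.1f}%")
print(f"perf-stat.i.node-store-miss-rate {miss_rate:+.1f} points")
```

Under these assumptions the three rows come out as +1.1%, +88.8% and +0.6, matching the table.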
vm-scalability.throughput
3.72e+07 +----------------------------------------------------------------+
| |
3.71e+07 |-+ O O |
| O O O O O |
3.7e+07 |-+O O O O OO |
| O O O |
3.69e+07 |-+ O O OO O O O |
| |
3.68e+07 |-+ + |
| +. :: |
3.67e+07 |-+ : + + : :.+ |
|+. .+ :+ + + + .+ +. +.++ + +.++. |
3.66e+07 |-+++.++ + : : :.+ + + ++.++.+ ++.++.+ +|
| : +.+ : + + |
3.65e+07 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc2-00348-g4eae4efa2c29" of type "text/plain" (172899 bytes)
View attachment "job-script" of type "text/plain" (8144 bytes)
View attachment "job.yaml" of type "text/plain" (5551 bytes)
View attachment "reproduce" of type "text/plain" (6686 bytes)