Message-ID: <20201110061226.GB3197@xsang-OptiPlex-9020>
Date: Tue, 10 Nov 2020 14:12:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Chris Wilson <chris@...is-wilson.co.uk>,
Huang Ying <ying.huang@...el.com>,
Hugh Dickins <hughd@...gle.com>,
Jani Nikula <jani.nikula@...ux.intel.com>,
Johannes Weiner <hannes@...xchg.org>,
Matthew Auld <matthew.auld@...el.com>,
William Kucharski <william.kucharski@...cle.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [mm/shmem] 63ec1973dd: will-it-scale.per_process_ops 21.7% improvement
Greetings,
FYI, we noticed a 21.7% improvement of will-it-scale.per_process_ops due to commit:
commit: 63ec1973ddf3eb70feb5728088ca190f1af449cb ("mm/shmem: return head page from find_lock_entry")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with following parameters:
nr_task: 16
mode: process
test: pread2
cpufreq_governor: performance
ucode: 0x16
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/pread2/will-it-scale/0x16
commit:
a6de4b4873 ("mm: convert find_get_entry to return the head page")
63ec1973dd ("mm/shmem: return head page from find_lock_entry")
a6de4b4873e1e352 63ec1973ddf3eb70feb5728088c
---------------- ---------------------------
%stddev      %change      %stddev
80053 +21.7% 97454 will-it-scale.per_process_ops
1280867 +21.7% 1559276 will-it-scale.workload
0.01 ± 54% -0.0 0.00 ± 19% mpstat.cpu.all.iowait%
1954274 ± 3% -11.2% 1735981 ± 5% sched_debug.cpu.ttwu_count.max
29477 -17.6% 24300 ± 3% syscalls.sys_pread64.med
479602 -8.6% 438278 ± 2% vmstat.system.cs
258998 ± 21% +24.7% 322869 vmstat.system.in
10279 ± 70% -78.6% 2196 ±123% proc-vmstat.numa_pages_migrated
31474 ± 13% -68.1% 10037 ± 97% proc-vmstat.numa_pte_updates
10279 ± 70% -78.6% 2196 ±123% proc-vmstat.pgmigrate_success
1.665e+08 ±112% -74.9% 41836625 ± 6% cpuidle.C1.time
17298485 ± 11% -27.2% 12593589 ± 4% cpuidle.C1.usage
1.114e+08 ± 6% -18.5% 90779946 ± 3% cpuidle.POLL.time
53803934 ± 5% -14.8% 45853270 ± 2% cpuidle.POLL.usage
8728 ± 19% -75.6% 2130 ± 60% numa-meminfo.node0.Shmem
82567 ± 72% -78.0% 18194 ± 88% numa-meminfo.node2.AnonPages
83890 ± 72% -72.2% 23294 ± 55% numa-meminfo.node2.Inactive
83890 ± 72% -72.2% 23294 ± 55% numa-meminfo.node2.Inactive(anon)
2898 ± 18% -33.2% 1935 ± 4% numa-vmstat.node0.nr_mapped
2181 ± 19% -75.6% 532.25 ± 60% numa-vmstat.node0.nr_shmem
20640 ± 72% -78.0% 4549 ± 88% numa-vmstat.node2.nr_anon_pages
20971 ± 72% -72.2% 5825 ± 55% numa-vmstat.node2.nr_inactive_anon
20971 ± 72% -72.2% 5825 ± 55% numa-vmstat.node2.nr_zone_inactive_anon
483171 -8.7% 441176 ± 2% perf-stat.i.context-switches
4.171e+09 +3.3% 4.31e+09 perf-stat.i.dTLB-loads
0.49 +7.2% 0.53 ± 6% perf-stat.i.major-faults
56.40 ± 2% -4.4 51.99 ± 2% perf-stat.i.node-store-miss-rate%
55.77 ± 2% -4.2 51.53 ± 2% perf-stat.overall.node-store-miss-rate%
3651580 -18.4% 2980867 perf-stat.overall.path-length
481529 -8.7% 439650 ± 2% perf-stat.ps.context-switches
4.158e+09 +3.3% 4.296e+09 perf-stat.ps.dTLB-loads
0.49 +7.3% 0.53 ± 6% perf-stat.ps.major-faults
15220 ± 25% +27.9% 19472 ± 4% softirqs.CPU101.RCU
15303 ± 17% +27.1% 19453 ± 6% softirqs.CPU119.RCU
13229 ± 20% +50.6% 19921 ± 7% softirqs.CPU121.RCU
14908 ± 17% +26.3% 18822 ± 6% softirqs.CPU122.RCU
15075 ± 17% +30.7% 19702 ± 7% softirqs.CPU123.RCU
14145 ± 27% +38.8% 19632 ± 5% softirqs.CPU124.RCU
14960 ± 18% +31.8% 19714 ± 5% softirqs.CPU125.RCU
9974 ± 12% +17.1% 11675 ± 5% softirqs.CPU16.RCU
15073 ± 20% +27.6% 19239 ± 6% softirqs.CPU25.RCU
14667 ± 17% +26.7% 18583 ± 5% softirqs.CPU36.RCU
14454 ± 19% +27.3% 18406 ± 5% softirqs.CPU47.RCU
12940 ± 12% +27.1% 16440 ± 3% softirqs.CPU49.RCU
12899 ± 12% +21.0% 15606 ± 6% softirqs.CPU50.RCU
12766 ± 16% +28.9% 16453 ± 4% softirqs.CPU52.RCU
13003 ± 16% +26.1% 16396 ± 3% softirqs.CPU53.RCU
13677 ± 10% +18.2% 16168 ± 4% softirqs.CPU69.RCU
54809 ± 5% +17.8% 64565 ± 11% softirqs.CPU78.SCHED
46549 ± 14% +43.3% 66714 ± 22% softirqs.CPU84.SCHED
2.29 ± 2% -0.3 1.98 ± 8% perf-profile.calltrace.cycles-pp.__wake_up_common.wake_up_page_bit.shmem_file_read_iter.new_sync_read.vfs_read
2.04 ± 2% -0.3 1.73 ± 9% perf-profile.calltrace.cycles-pp.wake_page_function.__wake_up_common.wake_up_page_bit.shmem_file_read_iter.new_sync_read
1.89 ± 2% -0.3 1.60 ± 9% perf-profile.calltrace.cycles-pp.try_to_wake_up.wake_page_function.__wake_up_common.wake_up_page_bit.shmem_file_read_iter
1.38 ± 4% -0.3 1.10 ± 8% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
0.68 ± 3% -0.2 0.44 ± 58% perf-profile.calltrace.cycles-pp.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
1.32 ± 2% -0.2 1.09 ± 10% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.wake_page_function.__wake_up_common.wake_up_page_bit
1.31 ± 2% -0.2 1.09 ± 10% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.wake_page_function.__wake_up_common
1.24 ± 2% -0.2 1.03 ± 10% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.wake_page_function
0.97 ± 2% -0.2 0.81 ± 10% perf-profile.calltrace.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
0.79 ± 3% -0.1 0.66 ± 10% perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.ttwu_do_activate
0.00 +0.7 0.66 ± 9% perf-profile.calltrace.cycles-pp.unlock_page.shmem_file_read_iter.new_sync_read.vfs_read.ksys_pread64
2.01 ± 7% +0.7 2.72 ± 9% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_file_read_iter.new_sync_read
2.30 ± 2% -0.3 1.98 ± 8% perf-profile.children.cycles-pp.__wake_up_common
2.04 ± 2% -0.3 1.73 ± 9% perf-profile.children.cycles-pp.wake_page_function
1.43 ± 3% -0.3 1.14 ± 9% perf-profile.children.cycles-pp.poll_idle
1.89 ± 2% -0.3 1.60 ± 9% perf-profile.children.cycles-pp.try_to_wake_up
1.44 -0.2 1.25 ± 10% perf-profile.children.cycles-pp.ttwu_do_activate
1.44 -0.2 1.25 ± 10% perf-profile.children.cycles-pp.enqueue_task_fair
1.37 -0.2 1.20 ± 9% perf-profile.children.cycles-pp.enqueue_entity
1.08 ± 2% -0.1 0.95 ± 10% perf-profile.children.cycles-pp.__account_scheduler_latency
0.88 ± 3% -0.1 0.78 ± 10% perf-profile.children.cycles-pp.stack_trace_save_tsk
0.77 ± 4% -0.1 0.67 ± 9% perf-profile.children.cycles-pp.arch_stack_walk
0.07 ± 11% -0.0 0.06 ± 15% perf-profile.children.cycles-pp.delayacct_end
0.07 ± 7% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.tick_nohz_idle_exit
0.06 -0.0 0.05 perf-profile.children.cycles-pp.orc_find
0.14 ± 6% +0.0 0.18 ± 10% perf-profile.children.cycles-pp.sysvec_call_function_single
0.13 ± 6% +0.0 0.17 ± 9% perf-profile.children.cycles-pp.__sysvec_call_function_single
0.14 ± 3% +0.0 0.19 ± 8% perf-profile.children.cycles-pp.sched_ttwu_pending
0.16 ± 5% +0.0 0.20 ± 10% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.39 ± 7% +0.1 0.49 ± 9% perf-profile.children.cycles-pp.mark_page_accessed
0.43 ± 4% +0.2 0.68 ± 10% perf-profile.children.cycles-pp.unlock_page
2.01 ± 7% +0.7 2.72 ± 9% perf-profile.children.cycles-pp.find_get_entry
1.35 ± 4% -0.3 1.08 ± 10% perf-profile.self.cycles-pp.poll_idle
0.28 -0.0 0.24 ± 12% perf-profile.self.cycles-pp.unwind_next_frame
0.07 ± 6% +0.0 0.08 ± 5% perf-profile.self.cycles-pp.ftrace_syscall_enter
0.04 ± 58% +0.0 0.07 ± 11% perf-profile.self.cycles-pp.vfs_read
0.43 ± 5% +0.1 0.50 ± 7% perf-profile.self.cycles-pp.__entry_text_start
0.10 ± 11% +0.1 0.19 ± 8% perf-profile.self.cycles-pp.shmem_getpage_gfp
0.38 ± 8% +0.1 0.48 ± 9% perf-profile.self.cycles-pp.mark_page_accessed
0.43 ± 4% +0.2 0.67 ± 9% perf-profile.self.cycles-pp.unlock_page
1.27 ± 3% +0.5 1.72 ± 9% perf-profile.self.cycles-pp.find_lock_entry
1.99 ± 7% +0.7 2.69 ± 9% perf-profile.self.cycles-pp.find_get_entry
6945014 ± 3% +34.0% 9303461 ± 7% interrupts.CAL:Function_call_interrupts
314959 ± 11% +36.0% 428223 ± 12% interrupts.CPU1.CAL:Function_call_interrupts
350887 ± 8% +34.6% 472145 ± 12% interrupts.CPU10.CAL:Function_call_interrupts
74115 ± 4% -12.7% 64708 ± 11% interrupts.CPU10.RES:Rescheduling_interrupts
72.50 ± 20% +141.4% 175.00 ± 29% interrupts.CPU102.NMI:Non-maskable_interrupts
72.50 ± 20% +141.4% 175.00 ± 29% interrupts.CPU102.PMI:Performance_monitoring_interrupts
74818 ± 4% -23.0% 57627 ± 24% interrupts.CPU11.RES:Rescheduling_interrupts
6.25 ± 63% +1824.0% 120.25 ±156% interrupts.CPU112.RES:Rescheduling_interrupts
92.50 ± 46% +101.9% 186.75 ± 37% interrupts.CPU113.NMI:Non-maskable_interrupts
92.50 ± 46% +101.9% 186.75 ± 37% interrupts.CPU113.PMI:Performance_monitoring_interrupts
83.00 ± 14% +88.9% 156.75 ± 15% interrupts.CPU116.NMI:Non-maskable_interrupts
83.00 ± 14% +88.9% 156.75 ± 15% interrupts.CPU116.PMI:Performance_monitoring_interrupts
532.25 ± 46% +243.1% 1826 ± 91% interrupts.CPU117.CAL:Function_call_interrupts
534.50 ± 46% +63.6% 874.25 ± 26% interrupts.CPU118.CAL:Function_call_interrupts
77.00 ± 13% +72.1% 132.50 ± 13% interrupts.CPU126.NMI:Non-maskable_interrupts
77.00 ± 13% +72.1% 132.50 ± 13% interrupts.CPU126.PMI:Performance_monitoring_interrupts
143.00 ± 16% -37.1% 90.00 ± 37% interrupts.CPU128.NMI:Non-maskable_interrupts
143.00 ± 16% -37.1% 90.00 ± 37% interrupts.CPU128.PMI:Performance_monitoring_interrupts
75257 ± 3% -19.1% 60897 ± 20% interrupts.CPU13.RES:Rescheduling_interrupts
329977 ± 6% +47.5% 486818 ± 10% interrupts.CPU15.CAL:Function_call_interrupts
74198 ± 4% -14.7% 63277 ± 5% interrupts.CPU15.RES:Rescheduling_interrupts
191.50 ± 25% -45.7% 104.00 ± 24% interrupts.CPU17.NMI:Non-maskable_interrupts
191.50 ± 25% -45.7% 104.00 ± 24% interrupts.CPU17.PMI:Performance_monitoring_interrupts
563.25 ± 48% +187.6% 1619 ± 89% interrupts.CPU23.CAL:Function_call_interrupts
96.50 ± 29% +80.1% 173.75 ± 28% interrupts.CPU31.NMI:Non-maskable_interrupts
96.50 ± 29% +80.1% 173.75 ± 28% interrupts.CPU31.PMI:Performance_monitoring_interrupts
91.25 ± 33% +64.9% 150.50 ± 21% interrupts.CPU35.NMI:Non-maskable_interrupts
91.25 ± 33% +64.9% 150.50 ± 21% interrupts.CPU35.PMI:Performance_monitoring_interrupts
75114 ± 3% -21.4% 59021 ± 17% interrupts.CPU4.RES:Rescheduling_interrupts
72179 ± 11% -23.4% 55313 ± 20% interrupts.CPU5.RES:Rescheduling_interrupts
90.75 ± 17% +64.2% 149.00 ± 25% interrupts.CPU52.NMI:Non-maskable_interrupts
90.75 ± 17% +64.2% 149.00 ± 25% interrupts.CPU52.PMI:Performance_monitoring_interrupts
84.00 ± 10% +84.2% 154.75 ± 15% interrupts.CPU56.NMI:Non-maskable_interrupts
84.00 ± 10% +84.2% 154.75 ± 15% interrupts.CPU56.PMI:Performance_monitoring_interrupts
84.50 ± 12% +96.7% 166.25 ± 17% interrupts.CPU58.NMI:Non-maskable_interrupts
84.50 ± 12% +96.7% 166.25 ± 17% interrupts.CPU58.PMI:Performance_monitoring_interrupts
69878 ± 4% -17.6% 57574 ± 9% interrupts.CPU6.RES:Rescheduling_interrupts
102.00 ± 24% +54.2% 157.25 ± 18% interrupts.CPU60.NMI:Non-maskable_interrupts
102.00 ± 24% +54.2% 157.25 ± 18% interrupts.CPU60.PMI:Performance_monitoring_interrupts
85.25 ± 9% +82.7% 155.75 ± 17% interrupts.CPU61.NMI:Non-maskable_interrupts
85.25 ± 9% +82.7% 155.75 ± 17% interrupts.CPU61.PMI:Performance_monitoring_interrupts
101.75 ± 22% +20.9% 123.00 ± 16% interrupts.CPU63.NMI:Non-maskable_interrupts
101.75 ± 22% +20.9% 123.00 ± 16% interrupts.CPU63.PMI:Performance_monitoring_interrupts
72249 ± 3% -15.4% 61129 ± 14% interrupts.CPU7.RES:Rescheduling_interrupts
130478 ± 13% +80.0% 234882 ± 36% interrupts.CPU72.CAL:Function_call_interrupts
19711 ± 11% -16.9% 16379 ± 9% interrupts.CPU73.RES:Rescheduling_interrupts
2249 ± 40% +113.2% 4796 ± 39% interrupts.CPU75.NMI:Non-maskable_interrupts
2249 ± 40% +113.2% 4796 ± 39% interrupts.CPU75.PMI:Performance_monitoring_interrupts
72231 ± 5% +132.6% 168009 ± 32% interrupts.CPU76.CAL:Function_call_interrupts
86400 ± 17% +92.1% 166000 ± 41% interrupts.CPU77.CAL:Function_call_interrupts
85969 ± 18% +64.2% 141136 ± 23% interrupts.CPU78.CAL:Function_call_interrupts
69869 ± 2% +65.8% 115838 ± 21% interrupts.CPU79.CAL:Function_call_interrupts
2873 ± 28% +67.3% 4807 ± 38% interrupts.CPU79.NMI:Non-maskable_interrupts
2873 ± 28% +67.3% 4807 ± 38% interrupts.CPU79.PMI:Performance_monitoring_interrupts
354870 ± 5% +22.0% 432987 ± 10% interrupts.CPU8.CAL:Function_call_interrupts
4028 ± 38% +33.0% 5357 ± 31% interrupts.CPU8.NMI:Non-maskable_interrupts
4028 ± 38% +33.0% 5357 ± 31% interrupts.CPU8.PMI:Performance_monitoring_interrupts
85743 ± 16% +53.9% 131987 ± 21% interrupts.CPU81.CAL:Function_call_interrupts
74246 ± 9% +77.0% 131443 ± 40% interrupts.CPU82.CAL:Function_call_interrupts
2542 ± 29% +62.0% 4119 ± 25% interrupts.CPU82.NMI:Non-maskable_interrupts
2542 ± 29% +62.0% 4119 ± 25% interrupts.CPU82.PMI:Performance_monitoring_interrupts
77685 ± 7% +119.8% 170750 ± 51% interrupts.CPU84.CAL:Function_call_interrupts
83690 ± 13% +73.3% 145028 ± 54% interrupts.CPU86.CAL:Function_call_interrupts
78878 ± 17% +78.4% 140747 ± 28% interrupts.CPU87.CAL:Function_call_interrupts
74204 ± 6% -13.5% 64165 ± 7% interrupts.CPU9.RES:Rescheduling_interrupts
577.50 ± 47% +266.8% 2118 ± 89% interrupts.CPU93.CAL:Function_call_interrupts
131.50 ± 26% -37.5% 82.25 ± 12% interrupts.CPU94.NMI:Non-maskable_interrupts
131.50 ± 26% -37.5% 82.25 ± 12% interrupts.CPU94.PMI:Performance_monitoring_interrupts
536.25 ± 46% +61.4% 865.25 ± 26% interrupts.CPU99.CAL:Function_call_interrupts
will-it-scale.per_process_ops
105000 +------------------------------------------------------------------+
| |
100000 |-O O |
| O O O OO O |
| O O O O O |
95000 |-+ O O O O O O |
| O O O |
90000 |-+ |
| |
85000 |-+ |
| +.+ + |
| ++. .+ + +. + ++. .+ +.+. +. .++.+. .+. :+ .+ |
80000 |++ + ++.+.+ +.+ + + + + : + + ++.+.++ + + +.|
| + + |
75000 +------------------------------------------------------------------+
will-it-scale.workload
1.65e+06 +----------------------------------------------------------------+
| O |
1.6e+06 |-+O O O |
1.55e+06 |-+ O O O O O O OO |
| O O O O |
1.5e+06 |-+ OO O O O |
1.45e+06 |-+ O |
| |
1.4e+06 |-+ |
1.35e+06 |-+ + |
| +.+ +. : : |
1.3e+06 |.++. .+ : + + .+ .+.+ .+. +.+ .+. +.+.+ +. +. : +.+ |
1.25e+06 |-+ + +.+.++ + + + + + + +.+ +.+ + +.|
| |
1.2e+06 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.9.0-02744-g63ec1973ddf3e" of type "text/plain" (170569 bytes)
View attachment "job-script" of type "text/plain" (7861 bytes)
View attachment "job.yaml" of type "text/plain" (5327 bytes)
View attachment "reproduce" of type "text/plain" (338 bytes)