Message-ID: <20190115032444.GP17624@shao2-debian>
Date: Tue, 15 Jan 2019 11:24:44 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
Ulf Hansson <ulf.hansson@...aro.org>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [PM] 8234f6734c: will-it-scale.per_process_ops -3.6% regression
Greetings,
FYI, we noticed a -3.6% regression of will-it-scale.per_process_ops due to commit:
commit: 8234f6734c5d74ac794e5517437f51c57d65f865 ("PM-runtime: Switch autosuspend over to using hrtimers")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
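For context, the commit replaces the jiffies-based timer_list used for the runtime-PM autosuspend timeout with a high-resolution hrtimer. A minimal sketch of the shape of that change, with names and details simplified from the actual patch:

    /* Before: jiffies-based timer wheel (coarse, batched expiry) */
    struct timer_list suspend_timer;
    timer_setup(&suspend_timer, pm_suspend_timer_fn, 0);
    mod_timer(&suspend_timer, jiffies + msecs_to_jiffies(delay_ms));
    /* callback: void pm_suspend_timer_fn(struct timer_list *t); */

    /* After: nanosecond-resolution hrtimer (precise per-expiry arming) */
    struct hrtimer suspend_timer;
    hrtimer_init(&suspend_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
    hrtimer_start(&suspend_timer, ns_to_ktime(expires_ns), HRTIMER_MODE_ABS);
    /* callback: enum hrtimer_restart pm_suspend_timer_fn(struct hrtimer *t); */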
in testcase: will-it-scale
on test machine: 104-thread Skylake with 192G memory
with the following parameters:
nr_task: 100%
mode: process
test: poll2
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see whether the testcase will scale. It builds both a process-based and a thread-based variant of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
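For orientation, poll2 stresses the poll() syscall path with a large fd set. A rough standalone approximation of its hot loop (illustrative only; the real source lives in the repository above):

    #include <poll.h>
    #include <unistd.h>

    #define NR_FDS 128  /* pollfd slots polled per iteration */

    int main(void)
    {
            struct pollfd pfd[NR_FDS];
            unsigned long long iterations = 0;
            int pipefd[2], i;

            if (pipe(pipefd) < 0)
                    return 1;
            /* Poll the read end of an empty pipe: with a 0 timeout,
               poll() returns immediately with no ready fds. */
            for (i = 0; i < NR_FDS; i++) {
                    pfd[i].fd = pipefd[0];
                    pfd[i].events = POLLIN;
            }
            for (;;) {
                    if (poll(pfd, NR_FDS, 0) < 0)
                            return 1;
                    iterations++;  /* roughly what per_process_ops counts */
            }
    }

With nr_task=100% and mode=process, one such copy runs per CPU thread, and will-it-scale.per_process_ops is essentially the per-copy iteration rate.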
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-fpga01/poll2/will-it-scale
commit:
v4.20-rc7
8234f6734c ("PM-runtime: Switch autosuspend over to using hrtimers")
        v4.20-rc7   8234f6734c5d74ac794e551743
 ----------------   --------------------------
        fail:runs   %reproduction    fail:runs
                |               |            |
               :2             50%          1:4   dmesg.WARNING:at#for_ip_interrupt_entry/0x
          %stddev         %change      %stddev
                \               |            \
240408 -3.6% 231711 will-it-scale.per_process_ops
25002520 -3.6% 24097991 will-it-scale.workload
351914 -1.7% 345882 interrupts.CAL:Function_call_interrupts
1.77 ± 45% -1.1 0.64 mpstat.cpu.idle%
106164 ± 24% -23.2% 81494 ± 28% numa-meminfo.node0.AnonHugePages
326430 ± 8% -11.3% 289513 softirqs.SCHED
1294 -2.0% 1268 vmstat.system.cs
3178 +48.4% 4716 ± 16% slabinfo.eventpoll_pwq.active_objs
3178 +48.4% 4716 ± 16% slabinfo.eventpoll_pwq.num_objs
336.32 -100.0% 0.00 uptime.boot
3192 -100.0% 0.00 uptime.idle
3.456e+08 ± 76% -89.9% 34913819 ± 62% cpuidle.C1E.time
747832 ± 72% -87.5% 93171 ± 45% cpuidle.C1E.usage
16209 ± 26% -38.2% 10021 ± 44% cpuidle.POLL.time
6352 ± 32% -39.5% 3843 ± 48% cpuidle.POLL.usage
885259 ± 2% -13.8% 763434 ± 7% numa-vmstat.node0.numa_hit
865117 ± 2% -13.9% 744992 ± 7% numa-vmstat.node0.numa_local
405085 ± 7% +38.0% 558905 ± 9% numa-vmstat.node1.numa_hit
254056 ± 11% +59.7% 405824 ± 13% numa-vmstat.node1.numa_local
738158 ± 73% -88.5% 85078 ± 47% turbostat.C1E
1.07 ± 76% -1.0 0.11 ± 62% turbostat.C1E%
1.58 ± 49% -65.4% 0.55 ± 6% turbostat.CPU%c1
0.15 ± 13% -35.0% 0.10 ± 38% turbostat.CPU%c6
153.97 ± 16% -54.7 99.31 turbostat.PKG_%
64141 +1.5% 65072 proc-vmstat.nr_anon_pages
19541 -7.0% 18178 ± 8% proc-vmstat.nr_shmem
18296 +1.1% 18506 proc-vmstat.nr_slab_reclaimable
713938 -2.3% 697489 proc-vmstat.numa_hit
693688 -2.4% 677228 proc-vmstat.numa_local
772220 -1.9% 757334 proc-vmstat.pgalloc_normal
798565 -1.8% 784042 proc-vmstat.pgfault
732336 -2.7% 712661 proc-vmstat.pgfree
20.33 ± 4% -7.0% 18.92 sched_debug.cfs_rq:/.runnable_load_avg.max
160603 -44.5% 89108 ± 38% sched_debug.cfs_rq:/.spread0.avg
250694 -29.3% 177358 ± 18% sched_debug.cfs_rq:/.spread0.max
1109 ± 4% -7.0% 1031 sched_debug.cfs_rq:/.util_avg.max
20.33 ± 4% -7.2% 18.88 sched_debug.cpu.cpu_load[0].max
-10.00 +35.0% -13.50 sched_debug.cpu.nr_uninterruptible.min
3.56 ± 10% +44.2% 5.14 ± 18% sched_debug.cpu.nr_uninterruptible.stddev
87.10 ± 24% -34.0% 57.44 ± 37% sched_debug.cpu.sched_goidle.avg
239.48 -25.6% 178.07 ± 18% sched_debug.cpu.sched_goidle.stddev
332.67 ± 7% -25.5% 247.83 ± 13% sched_debug.cpu.ttwu_count.min
231.67 ± 8% -15.4% 195.96 ± 12% sched_debug.cpu.ttwu_local.min
95.47 -95.5 0.00 perf-profile.calltrace.cycles-pp.poll
90.26 -90.3 0.00 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.poll
90.08 -90.1 0.00 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
89.84 -89.8 0.00 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
88.04 -88.0 0.00 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
2.66 -0.1 2.54 perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.90 -0.1 1.81 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
2.56 +0.1 2.64 perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +2.3 2.29 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.00 +2.3 2.34 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
17.45 +3.8 21.24 perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +92.7 92.66 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +94.5 94.51 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +94.8 94.75 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +94.9 94.92 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
96.03 -96.0 0.00 perf-profile.children.cycles-pp.poll
90.29 -90.3 0.00 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
90.11 -90.1 0.00 perf-profile.children.cycles-pp.do_syscall_64
89.87 -89.9 0.00 perf-profile.children.cycles-pp.__x64_sys_poll
89.39 -89.4 0.00 perf-profile.children.cycles-pp.do_sys_poll
16.19 -16.2 0.00 perf-profile.children.cycles-pp.__fget_light
68.59 -68.6 0.00 perf-profile.self.cycles-pp.do_sys_poll
14.84 -14.8 0.00 perf-profile.self.cycles-pp.__fget_light
1.759e+13 -100.0% 0.00 perf-stat.branch-instructions
0.28 -0.3 0.00 perf-stat.branch-miss-rate%
4.904e+10 -100.0% 0.00 perf-stat.branch-misses
6.79 ± 3% -6.8 0.00 perf-stat.cache-miss-rate%
1.071e+08 ± 4% -100.0% 0.00 perf-stat.cache-misses
1.578e+09 -100.0% 0.00 perf-stat.cache-references
385311 ± 2% -100.0% 0.00 perf-stat.context-switches
1.04 -100.0% 0.00 perf-stat.cpi
8.643e+13 -100.0% 0.00 perf-stat.cpu-cycles
13787 -100.0% 0.00 perf-stat.cpu-migrations
0.00 ± 4% -0.0 0.00 perf-stat.dTLB-load-miss-rate%
23324811 ± 5% -100.0% 0.00 perf-stat.dTLB-load-misses
1.811e+13 -100.0% 0.00 perf-stat.dTLB-loads
0.00 -0.0 0.00 perf-stat.dTLB-store-miss-rate%
2478029 -100.0% 0.00 perf-stat.dTLB-store-misses
8.775e+12 -100.0% 0.00 perf-stat.dTLB-stores
99.66 -99.7 0.00 perf-stat.iTLB-load-miss-rate%
7.527e+09 -100.0% 0.00 perf-stat.iTLB-load-misses
25540468 ± 39% -100.0% 0.00 perf-stat.iTLB-loads
8.33e+13 -100.0% 0.00 perf-stat.instructions
11066 -100.0% 0.00 perf-stat.instructions-per-iTLB-miss
0.96 -100.0% 0.00 perf-stat.ipc
777357 -100.0% 0.00 perf-stat.minor-faults
81.69 -81.7 0.00 perf-stat.node-load-miss-rate%
20040093 -100.0% 0.00 perf-stat.node-load-misses
4491667 ± 7% -100.0% 0.00 perf-stat.node-loads
75.23 ± 10% -75.2 0.00 perf-stat.node-store-miss-rate%
3418662 ± 30% -100.0% 0.00 perf-stat.node-store-misses
1027183 ± 11% -100.0% 0.00 perf-stat.node-stores
777373 -100.0% 0.00 perf-stat.page-faults
3331644 -100.0% 0.00 perf-stat.path-length
will-it-scale.per_process_ops
242000 +-+----------------------------------------------------------------+
| +.+.. .+..+. .+.+..+.+.+. .+.+.. |
240000 +-+ + +.+ +.+..+ +..+ +.|
238000 +-+..+.+. .+. .+..+ |
| +. +.+ |
236000 +-+ |
| |
234000 +-+ |
| O O O O |
232000 +-+ O O O O O O O O O O O O O |
230000 +-+ O O O O O O |
| O |
228000 O-+ O O |
| O O |
226000 +-+----------------------------------------------------------------+
will-it-scale.workload
2.52e+07 +-+--------------------------------------------------------------+
| +..+. .+..+. .+. .+.+..+. .+..+. |
2.5e+07 +-+ + +.+ +.+.+. + +.+ +.|
2.48e+07 +-+.+..+. .+. .+.+ |
| + +..+ |
2.46e+07 +-+ |
2.44e+07 +-+ |
| |
2.42e+07 +-+ O O O O O O O O |
2.4e+07 +-+ O O O O O O O O O O |
| O O O O O O |
2.38e+07 O-+ O |
2.36e+07 +-O O O |
| |
2.34e+07 +-+--------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-4.20.0-rc7-00001-g8234f6734" of type "text/plain" (168504 bytes)
View attachment "job-script" of type "text/plain" (7129 bytes)
View attachment "job.yaml" of type "text/plain" (4682 bytes)
View attachment "reproduce" of type "text/plain" (310 bytes)