Message-ID: <20200205123110.GN12867@shao2-debian>
Date: Wed, 5 Feb 2020 20:31:10 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Roman Sudarikov <roman.sudarikov@...ux.intel.com>
Cc: 0day robot <lkp@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
Alexander Antonov <alexander.antonov@...el.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf x86] b77491648e: will-it-scale.per_process_ops -2.1% regression
Greetings,
FYI, we noticed a -2.1% regression of will-it-scale.per_process_ops due to commit:
commit: b77491648e6eb2f26b6edf5eaea859adc17f4dcc ("perf x86: Infrastructure for exposing an Uncore unit to PMON mapping")
https://github.com/0day-ci/linux/commits/roman-sudarikov-linux-intel-com/perf-x86-Exposing-IO-stack-to-IO-PMON-mapping-through-sysfs/20200118-075508
in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:
nr_task: 100%
mode: process
test: signal1
cpufreq_governor: performance
ucode: 0xb000038
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-bdw-ep6/signal1/will-it-scale/0xb000038
commit:
v5.4
b77491648e ("perf x86: Infrastructure for exposing an Uncore unit to PMON mapping")
            v5.4 b77491648e6eb2f26b6edf5eaea
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
47986 -2.1% 46989 will-it-scale.per_process_ops
4222852 -2.1% 4135110 will-it-scale.workload
427194 ± 9% +13.8% 486344 ± 4% numa-vmstat.node1.numa_local
12.88 ± 2% -8.5% 11.79 ± 4% turbostat.RAMWatt
8846 ± 10% +23.9% 10964 ± 9% softirqs.CPU0.SCHED
14442 ± 4% -5.2% 13697 ± 5% softirqs.CPU71.RCU
78696 ± 9% +14.4% 89993 ± 8% sched_debug.cfs_rq:/.min_vruntime.stddev
78411 ± 9% +14.5% 89817 ± 8% sched_debug.cfs_rq:/.spread0.stddev
9.77 ± 4% +15.0% 11.23 ± 3% sched_debug.cpu.clock.stddev
9.77 ± 4% +15.0% 11.23 ± 3% sched_debug.cpu.clock_task.stddev
4.072e+09 -1.9% 3.996e+09 perf-stat.i.branch-instructions
44948352 -1.8% 44159252 perf-stat.i.branch-misses
35.25 +4.3 39.56 perf-stat.i.cache-miss-rate%
12569960 +5.2% 13223444 perf-stat.i.cache-misses
35888855 ± 2% -6.2% 33680305 ± 2% perf-stat.i.cache-references
11.75 +1.8% 11.96 perf-stat.i.cpi
19377 -5.0% 18403 perf-stat.i.cycles-between-cache-misses
27157347 -2.1% 26595986 perf-stat.i.dTLB-load-misses
6.739e+09 -2.0% 6.602e+09 perf-stat.i.dTLB-loads
27809165 -1.9% 27268405 perf-stat.i.dTLB-store-misses
5.461e+09 -1.9% 5.356e+09 perf-stat.i.dTLB-stores
2.072e+10 -1.9% 2.034e+10 perf-stat.i.instructions
0.09 -1.7% 0.08 perf-stat.i.ipc
917994 +2.6% 941599 perf-stat.i.node-load-misses
96.93 -1.1 95.81 perf-stat.i.node-store-miss-rate%
5499191 +5.0% 5774707 perf-stat.i.node-store-misses
169716 ± 8% +45.2% 246479 ± 6% perf-stat.i.node-stores
1.73 ± 2% -4.4% 1.66 ± 2% perf-stat.overall.MPKI
35.03 +4.2 39.27 perf-stat.overall.cache-miss-rate%
11.77 +1.8% 11.98 perf-stat.overall.cpi
19401 -5.0% 18428 perf-stat.overall.cycles-between-cache-misses
0.08 -1.8% 0.08 perf-stat.overall.ipc
97.01 -1.1 95.91 perf-stat.overall.node-store-miss-rate%
4.058e+09 -1.8% 3.983e+09 perf-stat.ps.branch-instructions
44798305 -1.7% 44014351 perf-stat.ps.branch-misses
12526500 +5.2% 13178368 perf-stat.ps.cache-misses
35771706 ± 2% -6.2% 33569906 ± 2% perf-stat.ps.cache-references
27063288 -2.1% 26505363 perf-stat.ps.dTLB-load-misses
6.716e+09 -2.0% 6.58e+09 perf-stat.ps.dTLB-loads
27712662 -1.9% 27175399 perf-stat.ps.dTLB-store-misses
5.442e+09 -1.9% 5.338e+09 perf-stat.ps.dTLB-stores
2.065e+10 -1.9% 2.027e+10 perf-stat.ps.instructions
914841 +2.6% 938399 perf-stat.ps.node-load-misses
5480102 +5.0% 5754996 perf-stat.ps.node-store-misses
169148 ± 8% +45.2% 245649 ± 6% perf-stat.ps.node-stores
6.242e+12 -1.6% 6.142e+12 perf-stat.total.instructions
481.50 ± 26% -41.7% 280.75 ± 28% interrupts.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
772.75 ± 63% -70.0% 231.75 ± 28% interrupts.CPU1.RES:Rescheduling_interrupts
481.50 ± 26% -41.7% 280.75 ± 28% interrupts.CPU16.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
954.25 ± 10% -71.8% 269.50 ± 76% interrupts.CPU19.RES:Rescheduling_interrupts
932.50 ± 48% -68.4% 294.75 ± 72% interrupts.CPU20.RES:Rescheduling_interrupts
583.75 ± 59% -79.5% 119.75 ± 54% interrupts.CPU21.RES:Rescheduling_interrupts
513.00 ± 42% +145.8% 1261 ± 17% interrupts.CPU22.RES:Rescheduling_interrupts
256.25 ± 40% +253.9% 906.75 ± 39% interrupts.CPU24.RES:Rescheduling_interrupts
475.25 ± 19% +133.5% 1109 ± 41% interrupts.CPU26.RES:Rescheduling_interrupts
734.50 ± 36% +99.1% 1462 ± 26% interrupts.CPU27.RES:Rescheduling_interrupts
905.75 ± 48% -64.9% 318.00 ± 85% interrupts.CPU3.RES:Rescheduling_interrupts
363.00 ± 35% +114.3% 777.75 ± 26% interrupts.CPU30.RES:Rescheduling_interrupts
6915 ± 24% -29.1% 4904 ± 34% interrupts.CPU37.NMI:Non-maskable_interrupts
6915 ± 24% -29.1% 4904 ± 34% interrupts.CPU37.PMI:Performance_monitoring_interrupts
436.50 ± 48% +166.7% 1164 ± 41% interrupts.CPU38.RES:Rescheduling_interrupts
6950 ± 24% -29.1% 4926 ± 34% interrupts.CPU39.NMI:Non-maskable_interrupts
6950 ± 24% -29.1% 4926 ± 34% interrupts.CPU39.PMI:Performance_monitoring_interrupts
6906 ± 24% -28.9% 4910 ± 35% interrupts.CPU41.NMI:Non-maskable_interrupts
6906 ± 24% -28.9% 4910 ± 35% interrupts.CPU41.PMI:Performance_monitoring_interrupts
216.00 ± 70% -76.6% 50.50 ± 22% interrupts.CPU46.RES:Rescheduling_interrupts
2607 ± 47% +51.4% 3948 ± 8% interrupts.CPU50.CAL:Function_call_interrupts
3220 ± 10% +22.4% 3940 ± 8% interrupts.CPU51.CAL:Function_call_interrupts
4914 ± 34% +59.9% 7855 interrupts.CPU56.NMI:Non-maskable_interrupts
4914 ± 34% +59.9% 7855 interrupts.CPU56.PMI:Performance_monitoring_interrupts
4937 ± 34% +59.7% 7885 interrupts.CPU58.NMI:Non-maskable_interrupts
4937 ± 34% +59.7% 7885 interrupts.CPU58.PMI:Performance_monitoring_interrupts
4919 ± 34% +59.6% 7849 interrupts.CPU59.NMI:Non-maskable_interrupts
4919 ± 34% +59.6% 7849 interrupts.CPU59.PMI:Performance_monitoring_interrupts
4925 ± 34% +59.9% 7878 interrupts.CPU61.NMI:Non-maskable_interrupts
4925 ± 34% +59.9% 7878 interrupts.CPU61.PMI:Performance_monitoring_interrupts
4906 ± 33% +60.3% 7867 interrupts.CPU63.NMI:Non-maskable_interrupts
4906 ± 33% +60.3% 7867 interrupts.CPU63.PMI:Performance_monitoring_interrupts
890.00 ± 75% -82.0% 160.00 ± 46% interrupts.CPU63.RES:Rescheduling_interrupts
135.00 ± 52% +911.7% 1365 ± 76% interrupts.CPU70.RES:Rescheduling_interrupts
110.25 ± 14% +388.7% 538.75 ± 30% interrupts.CPU71.RES:Rescheduling_interrupts
3285 ± 3% +15.4% 3791 ± 3% interrupts.CPU73.CAL:Function_call_interrupts
186.50 ± 60% +274.4% 698.25 ± 77% interrupts.CPU81.RES:Rescheduling_interrupts
1.22 ± 2% -0.2 1.02 perf-profile.calltrace.cycles-pp.recalc_sigpending.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop
3.95 -0.2 3.79 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
4.07 -0.2 3.92 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
1.93 -0.1 1.79 perf-profile.calltrace.cycles-pp.fpu__clear.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.66 -0.1 0.59 ± 3% perf-profile.calltrace.cycles-pp.__set_task_blocked.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.65 ± 2% -0.1 0.57 ± 2% perf-profile.calltrace.cycles-pp.recalc_sigpending.__set_task_blocked.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64
0.85 -0.1 0.79 ± 2% perf-profile.calltrace.cycles-pp.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.03 -0.0 0.98 perf-profile.calltrace.cycles-pp.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.81 -0.0 0.76 ± 2% perf-profile.calltrace.cycles-pp.fpregs_mark_activate.fpu__clear.do_signal.exit_to_usermode_loop.do_syscall_64
0.98 -0.0 0.94 perf-profile.calltrace.cycles-pp.__set_current_blocked.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64
1.10 -0.0 1.07 perf-profile.calltrace.cycles-pp.copy_fpstate_to_sigframe.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.52 +0.0 0.55 ± 3% perf-profile.calltrace.cycles-pp.fpregs_mark_activate.__fpu__restore_sig.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.87 +0.1 1.95 perf-profile.calltrace.cycles-pp.__fpu__restore_sig.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
23.79 +0.3 24.06 perf-profile.calltrace.cycles-pp.__sigqueue_alloc.__send_signal.do_send_sig_info.do_send_specific.do_tkill
24.02 +0.3 24.29 perf-profile.calltrace.cycles-pp.__send_signal.do_send_sig_info.do_send_specific.do_tkill.__x64_sys_tgkill
89.84 +0.3 90.14 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
89.46 +0.3 89.78 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
36.84 +0.4 37.20 perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
36.43 +0.4 36.80 perf-profile.calltrace.cycles-pp.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.64 +0.4 26.09 perf-profile.calltrace.cycles-pp.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.58 +0.5 26.04 perf-profile.calltrace.cycles-pp.do_tkill.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
25.35 +0.5 25.82 perf-profile.calltrace.cycles-pp.do_send_specific.do_tkill.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
24.66 +0.5 25.18 perf-profile.calltrace.cycles-pp.do_send_sig_info.do_send_specific.do_tkill.__x64_sys_tgkill.do_syscall_64
30.40 +0.6 30.97 perf-profile.calltrace.cycles-pp.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64
31.58 +0.6 32.18 perf-profile.calltrace.cycles-pp.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
29.13 +0.8 29.91 perf-profile.calltrace.cycles-pp.__dequeue_signal.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop
28.90 +0.8 29.68 perf-profile.calltrace.cycles-pp.__sigqueue_free.__dequeue_signal.dequeue_signal.get_signal.do_signal
3.46 -0.2 3.21 ± 2% perf-profile.children.cycles-pp.recalc_sigpending
3.95 -0.2 3.79 perf-profile.children.cycles-pp.entry_SYSCALL_64
4.42 -0.2 4.26 perf-profile.children.cycles-pp.syscall_return_via_sysret
1.93 -0.1 1.80 ± 2% perf-profile.children.cycles-pp.fpu__clear
3.62 -0.1 3.54 perf-profile.children.cycles-pp.__set_current_blocked
0.27 -0.1 0.21 ± 3% perf-profile.children.cycles-pp.fpregs_assert_state_consistent
0.84 -0.0 0.79 ± 2% perf-profile.children.cycles-pp._copy_from_user
1.03 -0.0 0.99 perf-profile.children.cycles-pp.signal_setup_done
0.34 -0.0 0.30 ± 5% perf-profile.children.cycles-pp.restore_altstack
0.73 -0.0 0.70 perf-profile.children.cycles-pp.__might_fault
1.11 -0.0 1.08 perf-profile.children.cycles-pp.copy_fpstate_to_sigframe
0.37 ± 2% -0.0 0.35 perf-profile.children.cycles-pp.___might_sleep
0.27 -0.0 0.26 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
1.89 +0.1 1.96 perf-profile.children.cycles-pp.__fpu__restore_sig
0.29 ± 7% +0.2 0.53 ± 6% perf-profile.children.cycles-pp.__lock_task_sighand
0.29 ± 7% +0.2 0.53 ± 5% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
24.03 +0.3 24.29 perf-profile.children.cycles-pp.__send_signal
23.80 +0.3 24.06 perf-profile.children.cycles-pp.__sigqueue_alloc
90.00 +0.3 90.30 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
89.60 +0.3 89.92 perf-profile.children.cycles-pp.do_syscall_64
36.86 +0.4 37.22 perf-profile.children.cycles-pp.exit_to_usermode_loop
36.45 +0.4 36.81 perf-profile.children.cycles-pp.do_signal
25.65 +0.4 26.10 perf-profile.children.cycles-pp.__x64_sys_tgkill
25.59 +0.5 26.04 perf-profile.children.cycles-pp.do_tkill
25.36 +0.5 25.82 perf-profile.children.cycles-pp.do_send_specific
24.67 +0.5 25.19 perf-profile.children.cycles-pp.do_send_sig_info
30.41 +0.6 30.98 perf-profile.children.cycles-pp.dequeue_signal
31.60 +0.6 32.20 perf-profile.children.cycles-pp.get_signal
29.14 +0.8 29.92 perf-profile.children.cycles-pp.__dequeue_signal
28.90 +0.8 29.69 perf-profile.children.cycles-pp.__sigqueue_free
19.11 -0.4 18.75 perf-profile.self.cycles-pp.do_syscall_64
2.58 -0.2 2.34 perf-profile.self.cycles-pp.recalc_sigpending
3.95 -0.2 3.79 perf-profile.self.cycles-pp.entry_SYSCALL_64
4.41 -0.2 4.25 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.95 -0.1 0.86 ± 2% perf-profile.self.cycles-pp.fpu__clear
0.25 -0.1 0.19 ± 2% perf-profile.self.cycles-pp.fpregs_assert_state_consistent
0.15 ± 2% -0.0 0.12 ± 6% perf-profile.self.cycles-pp._copy_from_user
0.74 -0.0 0.71 perf-profile.self.cycles-pp.copy_fpstate_to_sigframe
0.34 -0.0 0.31 perf-profile.self.cycles-pp.__x64_sys_rt_sigprocmask
0.46 ± 2% -0.0 0.44 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.36 ± 3% -0.0 0.34 perf-profile.self.cycles-pp.___might_sleep
0.26 -0.0 0.24 ± 2% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
1.10 +0.0 1.15 perf-profile.self.cycles-pp.__fpu__restore_sig
0.28 ± 6% +0.2 0.53 ± 5% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
23.71 +0.3 23.96 perf-profile.self.cycles-pp.__sigqueue_alloc
12.65 +0.6 13.24 perf-profile.self.cycles-pp.__sigqueue_free
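
As a rough cross-check of the table above, the %change column can be recomputed from the two value columns. The sketch below is not part of the lkp tooling; the helper names pct_change and miss_rate are made up for illustration, and the numbers are copied from the rows above. It assumes that count metrics are compared as a relative change, and that rate metrics such as cache-miss-rate% are compared as a difference in percentage points, which is how the table appears to report them.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def miss_rate(misses: float, references: float) -> float:
    """Cache misses as a percentage of cache references."""
    return misses / references * 100.0


# Values copied from the table above: v5.4 (base) vs. b77491648e (new).
ops_delta = pct_change(47986, 46989)            # will-it-scale.per_process_ops
workload_delta = pct_change(4222852, 4135110)   # will-it-scale.workload

# perf-stat.ps.cache-misses / perf-stat.ps.cache-references
rate_base = miss_rate(12526500, 35771706)
rate_new = miss_rate(13178368, 33569906)
rate_delta = rate_new - rate_base               # difference in percentage points

print(f"per_process_ops: {ops_delta:+.1f}%")
print(f"workload:        {workload_delta:+.1f}%")
print(f"cache miss rate: {rate_base:.1f}% -> {rate_new:.1f}% ({rate_delta:+.1f} points)")
```

The recomputed miss rate may differ slightly from the perf-stat.overall.cache-miss-rate% row, since the reported values are averaged over several runs.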
will-it-scale.per_process_ops
[ASCII plot not recoverable: y-axis from 44000 to 52000; bisect-good samples lie at roughly 48000 or above, bisect-bad samples at roughly 47500 or below]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-00001-gb77491648e6eb" of type "text/plain" (200664 bytes)
View attachment "job-script" of type "text/plain" (7883 bytes)
View attachment "job.yaml" of type "text/plain" (5468 bytes)
View attachment "reproduce" of type "text/plain" (311 bytes)