Message-ID: <20200205123216.GO12867@shao2-debian>
Date: Wed, 5 Feb 2020 20:32:16 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Vince Weaver <vincent.weaver@...ne.edu>,
Jiri Olsa <jolsa@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>,
Ravi Bangoria <ravi.bangoria@...ux.ibm.com>,
Stephane Eranian <eranian@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf/x86] 81ec3f3c4c: will-it-scale.per_process_ops -5.5% regression
Greetings,
FYI, we noticed a -5.5% regression of will-it-scale.per_process_ops due to commit:
commit: 81ec3f3c4c4d78f2d3b6689c9816bfbdf7417dbb ("perf/x86: Add check_period PMU callback")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:
nr_task: 100%
mode: process
test: signal1
cpufreq_governor: performance
ucode: 0x500002c
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-csl-2ap4/signal1/will-it-scale/0x500002c
commit:
v5.0-rc6
81ec3f3c4c ("perf/x86: Add check_period PMU callback")
        v5.0-rc6       81ec3f3c4c4d78f2d3b6689c981
    ----------------   ---------------------------
         %stddev          %change         %stddev
             \               |                \
17987 -5.5% 16997 will-it-scale.per_process_ops
3453717 -5.5% 3263556 will-it-scale.workload
3.032e+08 ± 22% -56.2% 1.329e+08 ± 71% cpuidle.C6.time
435366 ± 25% -51.4% 211628 ± 35% cpuidle.C6.usage
5620 ± 50% -33.6% 3731 ± 4% softirqs.CPU187.RCU
7972 ± 42% -44.7% 4407 ± 3% softirqs.CPU51.RCU
381824 ± 27% -66.5% 128076 ± 91% turbostat.C6
0.48 ± 24% -0.3 0.18 ± 93% turbostat.C6%
640.83 ± 6% +14.9% 736.62 ± 6% sched_debug.cfs_rq:/.util_avg.min
56437 ± 9% -17.3% 46662 ± 7% sched_debug.cpu.nr_switches.max
5224 ± 5% -7.1% 4852 ± 6% sched_debug.cpu.nr_switches.stddev
54643 ± 9% -17.9% 44853 ± 7% sched_debug.cpu.sched_count.max
26160 ± 10% -17.6% 21557 ± 7% sched_debug.cpu.ttwu_count.max
25875 ± 9% -17.3% 21398 ± 7% sched_debug.cpu.ttwu_local.max
745.75 ± 16% -46.8% 396.75 ± 38% interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
952.50 ±108% -93.2% 65.00 ± 50% interrupts.CPU1.RES:Rescheduling_interrupts
745.75 ± 16% -46.8% 396.75 ± 38% interrupts.CPU11.33:PCI-MSI.524291-edge.eth0-TxRx-2
740.50 ±171% -100.0% 0.25 ±173% interrupts.CPU166.RES:Rescheduling_interrupts
3207 ± 6% +27.7% 4095 ± 5% interrupts.CPU185.CAL:Function_call_interrupts
152.75 ±168% -99.2% 1.25 ± 34% interrupts.CPU50.RES:Rescheduling_interrupts
698.25 ±166% -98.7% 9.25 ±135% interrupts.CPU70.RES:Rescheduling_interrupts
3367 ± 2% +13.5% 3821 ± 6% interrupts.CPU89.CAL:Function_call_interrupts
202.75 ±117% -85.1% 30.25 ± 83% interrupts.CPU96.RES:Rescheduling_interrupts
71307 ± 3% -11.0% 63441 numa-vmstat.node2.nr_file_pages
7081 ± 3% -19.6% 5696 numa-vmstat.node2.nr_kernel_stack
1854 ± 7% -40.2% 1109 numa-vmstat.node2.nr_mapped
2524 ± 6% -18.1% 2068 ± 2% numa-vmstat.node2.nr_page_table_pages
5132 ± 15% -38.5% 3159 ± 10% numa-vmstat.node2.nr_slab_reclaimable
15668 ± 9% -22.2% 12192 ± 5% numa-vmstat.node2.nr_slab_unreclaimable
70254 ± 2% -9.9% 63317 numa-vmstat.node2.nr_unevictable
70254 ± 2% -9.9% 63317 numa-vmstat.node2.nr_zone_unevictable
361503 ± 20% -23.9% 274980 ± 6% numa-vmstat.node2.numa_hit
275672 ± 26% -31.3% 189397 ± 10% numa-vmstat.node2.numa_local
285230 ± 3% -11.0% 253766 numa-meminfo.node2.FilePages
1707 ± 11% -73.2% 457.50 ±118% numa-meminfo.node2.Inactive
20532 ± 15% -38.4% 12638 ± 10% numa-meminfo.node2.KReclaimable
7082 ± 3% -19.6% 5696 numa-meminfo.node2.KernelStack
7112 ± 5% -37.6% 4436 numa-meminfo.node2.Mapped
590073 ± 6% -15.9% 496353 ± 9% numa-meminfo.node2.MemUsed
10101 ± 6% -18.0% 8282 ± 2% numa-meminfo.node2.PageTables
20532 ± 15% -38.4% 12638 ± 10% numa-meminfo.node2.SReclaimable
62680 ± 9% -22.2% 48775 ± 5% numa-meminfo.node2.SUnreclaim
83213 ± 8% -26.2% 61414 ± 6% numa-meminfo.node2.Slab
281021 ± 2% -9.9% 253271 numa-meminfo.node2.Unevictable
3.322e+09 -5.1% 3.152e+09 perf-stat.i.branch-instructions
1.17 +0.0 1.18 perf-stat.i.branch-miss-rate%
38492834 -4.2% 36888923 perf-stat.i.branch-misses
42.83 -0.7 42.11 perf-stat.i.cache-miss-rate%
43547238 -6.0% 40916641 ± 2% perf-stat.i.cache-misses
1.014e+08 -4.5% 96860566 perf-stat.i.cache-references
34.91 +5.3% 36.76 perf-stat.i.cpi
13610 +6.2% 14457 ± 2% perf-stat.i.cycles-between-cache-misses
5.049e+09 -5.4% 4.777e+09 perf-stat.i.dTLB-loads
0.00 ± 8% +0.0 0.00 ± 2% perf-stat.i.dTLB-store-miss-rate%
17697 ± 5% +90.1% 33643 ± 3% perf-stat.i.dTLB-store-misses
3.162e+09 -5.3% 2.994e+09 perf-stat.i.dTLB-stores
26478258 -8.6% 24200892 perf-stat.i.iTLB-load-misses
1.682e+10 -5.1% 1.596e+10 perf-stat.i.instructions
640.74 +3.6% 664.01 ± 2% perf-stat.i.instructions-per-iTLB-miss
3611851 -4.9% 3435033 perf-stat.i.node-load-misses
7022617 -3.0% 6811102 perf-stat.i.node-store-misses
1.16 +0.0 1.17 perf-stat.overall.branch-miss-rate%
42.96 -0.7 42.24 perf-stat.overall.cache-miss-rate%
34.96 +5.3% 36.81 perf-stat.overall.cpi
13501 +6.4% 14364 ± 2% perf-stat.overall.cycles-between-cache-misses
0.00 ± 5% +0.0 0.00 ± 2% perf-stat.overall.dTLB-store-miss-rate%
635.29 +3.8% 659.66 perf-stat.overall.instructions-per-iTLB-miss
0.03 -5.0% 0.03 perf-stat.overall.ipc
3.308e+09 -5.1% 3.139e+09 perf-stat.ps.branch-instructions
38324306 -4.1% 36739072 perf-stat.ps.branch-misses
43365638 -6.0% 40745255 ± 2% perf-stat.ps.cache-misses
1.01e+08 -4.5% 96451003 perf-stat.ps.cache-references
5.028e+09 -5.4% 4.757e+09 perf-stat.ps.dTLB-loads
17634 ± 5% +90.1% 33526 ± 3% perf-stat.ps.dTLB-store-misses
3.149e+09 -5.3% 2.982e+09 perf-stat.ps.dTLB-stores
26369184 -8.6% 24103019 perf-stat.ps.iTLB-load-misses
1.675e+10 -5.1% 1.59e+10 perf-stat.ps.instructions
3597149 -4.9% 3420963 perf-stat.ps.node-load-misses
6994250 -3.0% 6784176 perf-stat.ps.node-store-misses
5.199e+12 -5.2% 4.931e+12 perf-stat.total.instructions
will-it-scale.per_process_ops
19500 +-+-----------------------------------------------------------------+
|..+.+..+. |
19000 +-+ +.. |
| |
18500 +-+ +..+. .+.+..+.+..+..+ |
| +..+. + |
18000 +-+ +..+.+..+..+.+..+..+ |
| |
17500 +-+ O O |
| O |
17000 +-+ O O O O O O O O O
| O O |
16500 O-+O O O O O O O O O |
| O O O |
16000 +-+-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.0.0-rc6-00001-g81ec3f3c4c4d7" of type "text/plain" (187481 bytes)
View attachment "job-script" of type "text/plain" (7614 bytes)
View attachment "job.yaml" of type "text/plain" (5198 bytes)
View attachment "reproduce" of type "text/plain" (312 bytes)