Message-ID: <20200630092313.GD5535@shao2-debian>
Date: Tue, 30 Jun 2020 17:23:13 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Stephane Eranian <eranian@...gle.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Kim Phillips <kim.phillips@....com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf/x86/rapl] 16accae3d9: unixbench.score -4.1% regression
Greetings,
FYI, we noticed a 4.1% regression in unixbench.score due to commit:
commit: 16accae3d97f97d7f61c4ee5d0002bccdef59088 ("perf/x86/rapl: Fix RAPL config variable bug")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: unixbench
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:
runtime: 300s
nr_task: 30%
test: context1
cpufreq_governor: performance
ucode: 0x5002f01
test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-7.6/30%/debian-x86_64-20191114.cgz/300s/lkp-csl-2ap2/context1/unixbench/0x5002f01
commit:
4e909124f8 (" Clean up various aspects of the vDSO code, no change in")
16accae3d9 ("perf/x86/rapl: Fix RAPL config variable bug")
4e909124f8ed54b1 16accae3d97f97d7f61c4ee5d00
---------------- ---------------------------
value [± %stddev] | %change | value [± %stddev] | metric
2667 -4.1% 2558 unixbench.score
102.78 -3.6% 99.07 unixbench.time.user_time
3.1e+08 -4.3% 2.967e+08 unixbench.time.voluntary_context_switches
4.177e+08 -4.1% 4.004e+08 unixbench.workload
0.00 ± 44% -0.0 0.00 ± 89% mpstat.cpu.all.soft%
13706 ±153% -97.3% 374.50 ± 32% softirqs.CPU13.NET_RX
3146596 -4.3% 3012265 vmstat.system.cs
4.33e+08 -12.8% 3.777e+08 cpuidle.C1.usage
2.488e+08 ± 4% +17.9% 2.933e+08 ± 9% cpuidle.C1E.usage
25592 +1.4% 25962 proc-vmstat.nr_slab_reclaimable
73992 +1.7% 75243 proc-vmstat.nr_slab_unreclaimable
1502 ± 8% -8.7% 1371 ± 5% sched_debug.cfs_rq:/.runnable_avg.max
1197650 ± 18% +64.0% 1964026 ± 17% sched_debug.cfs_rq:/.spread0.avg
3114959 ± 13% +36.4% 4247863 ± 7% sched_debug.cfs_rq:/.spread0.max
2131 ± 4% +18.6% 2529 ± 2% slabinfo.UNIX.active_objs
2131 ± 4% +18.6% 2529 ± 2% slabinfo.UNIX.num_objs
3881 ± 4% +13.2% 4394 ± 3% slabinfo.sock_inode_cache.active_objs
3881 ± 4% +13.2% 4394 ± 3% slabinfo.sock_inode_cache.num_objs
1256 ± 7% +14.7% 1441 ± 2% slabinfo.task_group.active_objs
1256 ± 7% +14.7% 1441 ± 2% slabinfo.task_group.num_objs
82214 ± 10% -12.3% 72134 ± 11% numa-vmstat.node0.nr_unevictable
82214 ± 10% -12.3% 72134 ± 11% numa-vmstat.node0.nr_zone_unevictable
841.25 ±100% +295.6% 3327 ± 35% numa-vmstat.node2.nr_inactive_anon
1064 ± 80% +243.0% 3652 ± 30% numa-vmstat.node2.nr_shmem
4706 ± 33% +56.4% 7358 ± 21% numa-vmstat.node2.nr_slab_reclaimable
16156 ± 10% +27.0% 20519 ± 8% numa-vmstat.node2.nr_slab_unreclaimable
841.25 ±100% +295.6% 3327 ± 35% numa-vmstat.node2.nr_zone_inactive_anon
328859 ± 10% -12.3% 288537 ± 11% numa-meminfo.node0.Unevictable
3432 ±101% +287.9% 13313 ± 35% numa-meminfo.node2.Inactive
3366 ±100% +295.4% 13312 ± 35% numa-meminfo.node2.Inactive(anon)
18824 ± 33% +56.4% 29438 ± 21% numa-meminfo.node2.KReclaimable
18824 ± 33% +56.4% 29438 ± 21% numa-meminfo.node2.SReclaimable
64624 ± 10% +27.0% 82084 ± 8% numa-meminfo.node2.SUnreclaim
4259 ± 80% +243.0% 14610 ± 30% numa-meminfo.node2.Shmem
83450 ± 15% +33.6% 111523 ± 7% numa-meminfo.node2.Slab
24989 ±153% -97.6% 596.50 ± 36% interrupts.34:PCI-MSI.524292-edge.eth0-TxRx-3
85471 ± 2% -15.6% 72153 ± 9% interrupts.CPU0.RES:Rescheduling_interrupts
110378 ± 4% -6.9% 102726 ± 6% interrupts.CPU103.RES:Rescheduling_interrupts
53.75 ±117% +514.4% 330.25 ± 59% interrupts.CPU121.TLB:TLB_shootdowns
24989 ±153% -97.6% 596.50 ± 36% interrupts.CPU13.34:PCI-MSI.524292-edge.eth0-TxRx-3
16.25 ± 74% +1889.2% 323.25 ± 88% interrupts.CPU133.TLB:TLB_shootdowns
48.00 ±105% +597.4% 334.75 ±105% interrupts.CPU136.TLB:TLB_shootdowns
104.00 ±110% +127.2% 236.25 ± 38% interrupts.CPU139.TLB:TLB_shootdowns
17.00 ± 34% +525.0% 106.25 ±123% interrupts.CPU143.TLB:TLB_shootdowns
98102 ± 4% +6.1% 104055 ± 3% interrupts.CPU150.RES:Rescheduling_interrupts
90645 ± 4% +9.8% 99492 ± 4% interrupts.CPU158.RES:Rescheduling_interrupts
88524 ± 3% +8.5% 96054 ± 4% interrupts.CPU162.RES:Rescheduling_interrupts
80176 ± 4% +12.5% 90225 ± 6% interrupts.CPU167.RES:Rescheduling_interrupts
125.00 ± 60% +255.2% 444.00 ± 31% interrupts.CPU171.TLB:TLB_shootdowns
2638 ± 21% +39.9% 3692 ± 15% interrupts.CPU172.NMI:Non-maskable_interrupts
2638 ± 21% +39.9% 3692 ± 15% interrupts.CPU172.PMI:Performance_monitoring_interrupts
2689 ± 29% +40.6% 3782 ± 3% interrupts.CPU179.NMI:Non-maskable_interrupts
2689 ± 29% +40.6% 3782 ± 3% interrupts.CPU179.PMI:Performance_monitoring_interrupts
21.75 ± 33% +452.9% 120.25 ± 98% interrupts.CPU179.TLB:TLB_shootdowns
2663 ± 17% +36.9% 3644 ± 4% interrupts.CPU180.NMI:Non-maskable_interrupts
2663 ± 17% +36.9% 3644 ± 4% interrupts.CPU180.PMI:Performance_monitoring_interrupts
3154 ± 6% -32.8% 2120 ± 22% interrupts.CPU27.NMI:Non-maskable_interrupts
3154 ± 6% -32.8% 2120 ± 22% interrupts.CPU27.PMI:Performance_monitoring_interrupts
3229 ± 6% -31.3% 2217 ± 30% interrupts.CPU28.NMI:Non-maskable_interrupts
3229 ± 6% -31.3% 2217 ± 30% interrupts.CPU28.PMI:Performance_monitoring_interrupts
3393 -33.2% 2268 ± 29% interrupts.CPU29.NMI:Non-maskable_interrupts
3393 -33.2% 2268 ± 29% interrupts.CPU29.PMI:Performance_monitoring_interrupts
120.00 ±110% +202.1% 362.50 ± 79% interrupts.CPU30.TLB:TLB_shootdowns
3446 ± 7% -31.4% 2364 ± 32% interrupts.CPU32.NMI:Non-maskable_interrupts
3446 ± 7% -31.4% 2364 ± 32% interrupts.CPU32.PMI:Performance_monitoring_interrupts
36.00 ± 75% +440.3% 194.50 ± 93% interrupts.CPU33.TLB:TLB_shootdowns
219.00 ± 61% +182.4% 618.50 ± 36% interrupts.CPU35.TLB:TLB_shootdowns
3418 ± 6% -39.0% 2084 ± 25% interrupts.CPU39.NMI:Non-maskable_interrupts
3418 ± 6% -39.0% 2084 ± 25% interrupts.CPU39.PMI:Performance_monitoring_interrupts
32.75 ± 83% +399.2% 163.50 ± 75% interrupts.CPU40.TLB:TLB_shootdowns
112.50 ±136% +279.3% 426.75 ± 52% interrupts.CPU41.TLB:TLB_shootdowns
683.75 ± 32% +128.4% 1561 ± 38% interrupts.CPU50.TLB:TLB_shootdowns
534.50 ± 43% +96.5% 1050 ± 30% interrupts.CPU51.TLB:TLB_shootdowns
67.50 ± 82% +265.9% 247.00 ± 46% interrupts.CPU70.TLB:TLB_shootdowns
110.00 ± 74% +326.8% 469.50 ± 31% interrupts.CPU78.TLB:TLB_shootdowns
98633 ± 4% -13.4% 85388 ± 3% interrupts.CPU79.RES:Rescheduling_interrupts
40.87 -0.7 40.20 perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
40.77 -0.7 40.10 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
43.79 -0.7 43.12 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
43.79 -0.7 43.13 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
43.77 -0.7 43.11 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
44.05 -0.6 43.42 perf-profile.calltrace.cycles-pp.secondary_startup_64
1.45 ± 7% -0.2 1.30 ± 4% perf-profile.calltrace.cycles-pp.__irqentry_text_start.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
1.09 ± 2% -0.1 1.04 ± 2% perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair
0.60 ± 2% -0.0 0.56 ± 2% perf-profile.calltrace.cycles-pp.unwind_next_frame.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper
0.50 +0.0 0.54 ± 4% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__sched_text_start.schedule.pipe_read
47.21 +0.6 47.84 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair
46.71 +0.6 47.35 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity
43.79 -0.7 43.13 perf-profile.children.cycles-pp.start_secondary
41.12 -0.6 40.48 perf-profile.children.cycles-pp.cpuidle_enter
41.11 -0.6 40.47 perf-profile.children.cycles-pp.cpuidle_enter_state
44.05 -0.6 43.42 perf-profile.children.cycles-pp.do_idle
44.05 -0.6 43.42 perf-profile.children.cycles-pp.secondary_startup_64
44.05 -0.6 43.42 perf-profile.children.cycles-pp.cpu_startup_entry
0.10 ± 16% -0.1 0.03 ±100% perf-profile.children.cycles-pp.tick_irq_enter
1.36 -0.1 1.30 perf-profile.children.cycles-pp.stack_trace_save_tsk
1.17 -0.1 1.12 perf-profile.children.cycles-pp.arch_stack_walk
0.77 -0.1 0.72 perf-profile.children.cycles-pp.unwind_next_frame
0.12 ± 17% -0.0 0.07 ± 21% perf-profile.children.cycles-pp.irq_enter
0.28 -0.0 0.25 perf-profile.children.cycles-pp.select_task_rq_fair
0.15 ± 4% -0.0 0.13 ± 5% perf-profile.children.cycles-pp.sched_clock_cpu
0.12 -0.0 0.11 ± 4% perf-profile.children.cycles-pp.update_ts_time_stats
0.10 ± 5% +0.0 0.11 perf-profile.children.cycles-pp.stack_trace_consume_entry_nosched
0.21 ± 3% +0.0 0.23 ± 2% perf-profile.children.cycles-pp.__switch_to
0.22 +0.0 0.24 ± 2% perf-profile.children.cycles-pp.reweight_entity
0.03 ±100% +0.0 0.06 perf-profile.children.cycles-pp.kill_fasync
0.07 ± 6% +0.0 0.11 ± 4% perf-profile.children.cycles-pp._raw_spin_trylock
0.10 ± 5% +0.0 0.14 perf-profile.children.cycles-pp.rebalance_domains
0.38 ± 3% +0.1 0.43 ± 3% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
47.61 +0.6 48.23 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
47.40 +0.6 48.05 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.49 -0.0 0.45 ± 3% perf-profile.self.cycles-pp.unwind_next_frame
0.22 ± 4% -0.0 0.19 ± 12% perf-profile.self.cycles-pp.ktime_get
0.35 -0.0 0.34 ± 2% perf-profile.self.cycles-pp.set_next_entity
0.22 +0.0 0.24 perf-profile.self.cycles-pp.reweight_entity
0.04 ± 57% +0.0 0.07 ± 6% perf-profile.self.cycles-pp.stack_trace_consume_entry_nosched
0.07 ± 6% +0.0 0.11 ± 4% perf-profile.self.cycles-pp._raw_spin_trylock
0.01 ±173% +0.0 0.06 perf-profile.self.cycles-pp.kill_fasync
47.40 +0.6 48.05 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
7.523e+09 -2.8% 7.309e+09 perf-stat.i.branch-instructions
6.21 ± 28% +2.1 8.35 ± 20% perf-stat.i.cache-miss-rate%
5.942e+08 ± 2% -6.0% 5.587e+08 ± 2% perf-stat.i.cache-references
3162168 -4.3% 3025917 perf-stat.i.context-switches
1.546e+11 -1.0% 1.531e+11 perf-stat.i.cpu-cycles
984728 ± 44% -54.8% 444929 ± 90% perf-stat.i.dTLB-load-misses
8.643e+09 -3.0% 8.384e+09 perf-stat.i.dTLB-loads
142857 ± 45% -55.8% 63208 ± 98% perf-stat.i.dTLB-store-misses
3.804e+09 -4.1% 3.647e+09 perf-stat.i.dTLB-stores
44849258 +2.8% 46101148 perf-stat.i.iTLB-load-misses
23588895 -4.2% 22608828 perf-stat.i.iTLB-loads
3.325e+10 -3.0% 3.226e+10 perf-stat.i.instructions
785.28 ± 2% -9.5% 710.79 perf-stat.i.instructions-per-iTLB-miss
0.81 -1.0% 0.80 perf-stat.i.metric.GHz
1.09 ± 3% -35.7% 0.70 ± 4% perf-stat.i.metric.K/sec
107.30 -3.2% 103.83 perf-stat.i.metric.M/sec
96.69 +1.4 98.10 perf-stat.i.node-load-miss-rate%
5946250 +10.6% 6577000 perf-stat.i.node-load-misses
114205 ± 2% -54.1% 52372 ± 4% perf-stat.i.node-loads
5431765 -2.9% 5274750 perf-stat.i.node-store-misses
36683 ± 7% -22.3% 28520 ± 13% perf-stat.i.node-stores
17.87 ± 2% -3.1% 17.31 ± 2% perf-stat.overall.MPKI
5.52 ± 2% +0.3 5.84 ± 2% perf-stat.overall.cache-miss-rate%
4.65 +2.0% 4.74 perf-stat.overall.cpi
0.01 ± 44% -0.0 0.01 ± 90% perf-stat.overall.dTLB-load-miss-rate%
65.53 +1.6 67.09 perf-stat.overall.iTLB-load-miss-rate%
741.35 -5.6% 699.89 perf-stat.overall.instructions-per-iTLB-miss
0.22 -2.0% 0.21 perf-stat.overall.ipc
98.11 +1.1 99.21 perf-stat.overall.node-load-miss-rate%
31153 +1.3% 31555 perf-stat.overall.path-length
7.501e+09 -2.8% 7.29e+09 perf-stat.ps.branch-instructions
5.925e+08 ± 2% -6.0% 5.572e+08 ± 2% perf-stat.ps.cache-references
3152559 -4.3% 3018237 perf-stat.ps.context-switches
1.541e+11 -0.9% 1.527e+11 perf-stat.ps.cpu-cycles
984014 ± 44% -54.8% 444656 ± 90% perf-stat.ps.dTLB-load-misses
8.617e+09 -3.0% 8.362e+09 perf-stat.ps.dTLB-loads
142770 ± 45% -55.8% 63169 ± 98% perf-stat.ps.dTLB-store-misses
3.792e+09 -4.1% 3.638e+09 perf-stat.ps.dTLB-stores
44712199 +2.8% 45980702 perf-stat.ps.iTLB-load-misses
23519911 -4.1% 22554062 perf-stat.ps.iTLB-loads
3.315e+10 -2.9% 3.218e+10 perf-stat.ps.instructions
5927959 +10.7% 6559697 perf-stat.ps.node-load-misses
113911 ± 2% -54.1% 52287 ± 4% perf-stat.ps.node-loads
5415049 -2.8% 5260809 perf-stat.ps.node-store-misses
36616 ± 7% -22.3% 28468 ± 13% perf-stat.ps.node-stores
1.301e+13 -2.9% 1.264e+13 perf-stat.total.instructions
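For readers who want to sanity-check the comparison table, the following is a minimal sketch (not part of the lkp tooling) of how the change column appears to be derived. It assumes, based only on the numbers in this report, that ordinary metrics show a relative change in percent, while metrics that are already rates (names ending in "%") show an absolute difference in percentage points. The helper names pct_change and miss_rate are hypothetical; the input values are copied from the rows above.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base commit to the new commit, in percent."""
    return (new - base) / base * 100.0


def miss_rate(misses: float, hits: float) -> float:
    """Miss rate in percent. Assumption: rate = misses / (misses + hits)."""
    return misses / (misses + hits) * 100.0


# (base, new) pairs copied from the table in this report.
rows = {
    "unixbench.score": (2667, 2558),
    "vmstat.system.cs": (3146596, 3012265),
    "cpuidle.C1.usage": (4.33e8, 3.777e8),
}

for name, (base, new) in rows.items():
    # Ordinary metrics: relative change, e.g. "-4.1%".
    print(f"{pct_change(base, new):+.1f}%  {name}")

# Rate metrics: absolute difference in percentage points.
# Inputs are perf-stat.ps.iTLB-load-misses and perf-stat.ps.iTLB-loads.
base_rate = miss_rate(44712199, 23519911)
new_rate = miss_rate(45980702, 22554062)
print(f"{base_rate:.2f} {new_rate - base_rate:+.1f} {new_rate:.2f}  iTLB-load-miss-rate%")
```

With the per-second counters above, the computed iTLB miss rates agree with the perf-stat.overall.iTLB-load-miss-rate% row, which suggests that the "overall" rows are derived from the raw counters rather than measured separately.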
unixbench.score
[ASCII plot of per-sample scores, y-axis from 2400 to 2700: bisect-good samples lie at roughly 2600 to 2700, bisect-bad samples at roughly 2450 to 2575]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.7.0-00916-g16accae3d97f9" of type "text/plain" (202910 bytes)
View attachment "job-script" of type "text/plain" (7468 bytes)
View attachment "job.yaml" of type "text/plain" (5014 bytes)
View attachment "reproduce" of type "text/plain" (294 bytes)