Message-ID: <20200212113514.GT12867@shao2-debian>
Date: Wed, 12 Feb 2020 19:35:14 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Kim Phillips <kim.phillips@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
lkp@...ts.01.org
Subject: [perf/x86/amd] 471af006a7: will-it-scale.per_process_ops -7.6% regression
Greetings,
FYI, we noticed a -7.6% regression of will-it-scale.per_process_ops due to commit:
commit: 471af006a747f1c535c8a8c6c0973c320fe01b22 ("perf/x86/amd: Constrain Large Increment per Cycle events")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:
nr_task: 16
mode: process
test: mmap1
cpufreq_governor: performance
ucode: 0xb000038
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/process/16/debian-x86_64-20191114.cgz/lkp-bdw-ep6/mmap1/will-it-scale/0xb000038
commit:
1e0f17724a ("perf/x86/intel/rapl: Add Comet Lake support")
471af006a7 ("perf/x86/amd: Constrain Large Increment per Cycle events")
1e0f17724a74c8e9 471af006a747f1c535c8a8c6c09
---------------- ---------------------------
value ± %stddev    %change    value ± %stddev    metric
140938 -7.6% 130210 will-it-scale.per_process_ops
2255019 -7.6% 2083371 will-it-scale.workload
837580 ±166% -96.5% 28917 ± 3% cpuidle.C1.usage
7166976 ± 6% -14.5% 6128128 ± 8% meminfo.DirectMap2M
836885 ±166% -96.7% 27303 ± 5% turbostat.C1
14685 ± 16% -45.3% 8033 ± 59% numa-meminfo.node0.Inactive
14479 ± 17% -45.6% 7878 ± 59% numa-meminfo.node0.Inactive(anon)
3619 ± 17% -45.5% 1971 ± 59% numa-vmstat.node0.nr_inactive_anon
3619 ± 17% -45.5% 1971 ± 59% numa-vmstat.node0.nr_zone_inactive_anon
0.92 ± 37% +72.7% 1.58 ± 15% sched_debug.cfs_rq:/.nr_spread_over.max
0.18 ± 31% +80.5% 0.32 ± 22% sched_debug.cfs_rq:/.nr_spread_over.stddev
823.50 ± 11% +21.3% 999.25 ± 6% slabinfo.task_group.active_objs
823.50 ± 11% +21.3% 999.25 ± 6% slabinfo.task_group.num_objs
28461 ± 45% +226.7% 92971 ± 52% softirqs.CPU2.SCHED
140190 ± 16% -22.1% 109224 ± 4% softirqs.CPU26.TIMER
22312 ± 59% +118.2% 48678 ± 35% softirqs.CPU27.SCHED
137328 ± 19% -24.1% 104166 ± 4% softirqs.CPU30.TIMER
136600 ± 19% -24.8% 102717 ± 4% softirqs.CPU31.TIMER
138650 ± 18% -26.7% 101647 ± 7% softirqs.CPU32.TIMER
157659 ± 7% -24.6% 118910 ± 10% softirqs.CPU36.TIMER
11534 ± 93% -58.8% 4756 softirqs.CPU52.SCHED
140233 ± 17% -23.3% 107504 ± 4% softirqs.CPU68.TIMER
140175 ± 17% -21.5% 110106 ± 4% softirqs.CPU69.TIMER
137837 ± 18% -21.7% 107989 ± 3% softirqs.CPU71.TIMER
136668 ± 19% -24.1% 103782 ± 4% softirqs.CPU74.TIMER
135588 ± 19% -24.9% 101806 ± 4% softirqs.CPU75.TIMER
21298 ± 59% +103.1% 43251 softirqs.CPU8.SCHED
6.983e+09 -4.2% 6.688e+09 perf-stat.i.branch-instructions
1.67 +6.0% 1.77 perf-stat.i.cpi
8.465e+09 -5.3% 8.019e+09 perf-stat.i.dTLB-loads
14870770 ± 14% -25.5% 11075131 ± 8% perf-stat.i.iTLB-load-misses
3.001e+10 -4.4% 2.869e+10 perf-stat.i.instructions
2068 ± 13% +26.5% 2616 ± 8% perf-stat.i.instructions-per-iTLB-miss
0.60 -5.7% 0.56 perf-stat.i.ipc
1.67 +6.0% 1.77 perf-stat.overall.cpi
2056 ± 13% +26.9% 2610 ± 8% perf-stat.overall.instructions-per-iTLB-miss
0.60 -5.7% 0.56 perf-stat.overall.ipc
4007121 +3.5% 4145942 perf-stat.overall.path-length
6.96e+09 -4.2% 6.666e+09 perf-stat.ps.branch-instructions
8.436e+09 -5.3% 7.992e+09 perf-stat.ps.dTLB-loads
14820211 ± 14% -25.5% 11037555 ± 8% perf-stat.ps.iTLB-load-misses
2.99e+10 -4.4% 2.859e+10 perf-stat.ps.instructions
9.036e+12 -4.4% 8.638e+12 perf-stat.total.instructions
0.80 ± 16% +0.2 1.04 ± 9% perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
0.61 ± 58% +0.3 0.93 ± 5% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
1.06 ± 18% +0.3 1.38 ± 3% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.09 ± 15% +0.0 0.12 ± 6% perf-profile.children.cycles-pp.find_next_bit
0.06 ± 58% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.__remove_hrtimer
0.10 ± 10% +0.0 0.14 ± 8% perf-profile.children.cycles-pp.native_write_msr
0.10 ± 12% +0.0 0.14 ± 8% perf-profile.children.cycles-pp.lapic_next_deadline
0.06 ± 58% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.__hrtimer_next_event_base
0.01 ±173% +0.0 0.05 ± 9% perf-profile.children.cycles-pp.interrupt_entry
0.15 ± 21% +0.0 0.20 ± 5% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.20 ± 10% +0.0 0.24 ± 2% perf-profile.children.cycles-pp.clockevents_program_event
0.15 ± 15% +0.0 0.20 ± 7% perf-profile.children.cycles-pp.__next_timer_interrupt
0.22 ± 20% +0.1 0.29 ± 5% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.15 ± 22% +0.1 0.22 ± 13% perf-profile.children.cycles-pp.tick_irq_enter
0.32 ± 16% +0.1 0.40 ± 3% perf-profile.children.cycles-pp.tick_nohz_next_event
0.18 ± 20% +0.1 0.27 ± 12% perf-profile.children.cycles-pp.irq_enter
0.41 ± 15% +0.1 0.51 ± 4% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.84 ± 20% +0.2 1.06 ± 4% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.81 ± 16% +0.2 1.05 ± 9% perf-profile.children.cycles-pp.menu_select
1.23 ± 17% +0.3 1.56 ± 3% perf-profile.children.cycles-pp.hrtimer_interrupt
0.11 ± 9% +0.0 0.13 ± 7% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.09 +0.0 0.11 ± 11% perf-profile.self.cycles-pp.read_tsc
0.07 ± 17% +0.0 0.09 ± 5% perf-profile.self.cycles-pp.__next_timer_interrupt
0.08 ± 17% +0.0 0.11 ± 7% perf-profile.self.cycles-pp.find_next_bit
0.04 ± 57% +0.0 0.07 ± 7% perf-profile.self.cycles-pp.__hrtimer_run_queues
0.10 ± 10% +0.0 0.14 ± 8% perf-profile.self.cycles-pp.native_write_msr
0.01 ±173% +0.0 0.05 ± 9% perf-profile.self.cycles-pp.interrupt_entry
0.03 ±100% +0.0 0.07 perf-profile.self.cycles-pp.hrtimer_interrupt
0.00 +0.1 0.05 perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
0.25 ± 14% +0.1 0.32 ± 5% perf-profile.self.cycles-pp.cpuidle_enter_state
0.99 ± 8% +0.2 1.17 ± 8% perf-profile.self.cycles-pp.percpu_counter_add_batch
4969 ± 21% -46.7% 2649 ± 27% interrupts.CPU10.NMI:Non-maskable_interrupts
4969 ± 21% -46.7% 2649 ± 27% interrupts.CPU10.PMI:Performance_monitoring_interrupts
4973 ± 21% -35.1% 3227 ± 48% interrupts.CPU13.NMI:Non-maskable_interrupts
4973 ± 21% -35.1% 3227 ± 48% interrupts.CPU13.PMI:Performance_monitoring_interrupts
412.50 ± 31% +66.5% 687.00 ± 5% interrupts.CPU25.NMI:Non-maskable_interrupts
412.50 ± 31% +66.5% 687.00 ± 5% interrupts.CPU25.PMI:Performance_monitoring_interrupts
396.75 ± 35% +69.0% 670.50 ± 7% interrupts.CPU27.NMI:Non-maskable_interrupts
396.75 ± 35% +69.0% 670.50 ± 7% interrupts.CPU27.PMI:Performance_monitoring_interrupts
487.00 ± 26% +39.6% 680.00 ± 5% interrupts.CPU31.NMI:Non-maskable_interrupts
487.00 ± 26% +39.6% 680.00 ± 5% interrupts.CPU31.PMI:Performance_monitoring_interrupts
495.75 ± 31% +39.2% 690.25 ± 5% interrupts.CPU32.NMI:Non-maskable_interrupts
495.75 ± 31% +39.2% 690.25 ± 5% interrupts.CPU32.PMI:Performance_monitoring_interrupts
502.75 ± 31% +38.9% 698.50 ± 7% interrupts.CPU33.NMI:Non-maskable_interrupts
502.75 ± 31% +38.9% 698.50 ± 7% interrupts.CPU33.PMI:Performance_monitoring_interrupts
534.75 ± 25% +34.5% 719.00 ± 4% interrupts.CPU34.NMI:Non-maskable_interrupts
534.75 ± 25% +34.5% 719.00 ± 4% interrupts.CPU34.PMI:Performance_monitoring_interrupts
157.50 ±110% -99.7% 0.50 ±100% interrupts.CPU36.RES:Rescheduling_interrupts
537.75 ± 27% +35.4% 728.25 ± 9% interrupts.CPU38.NMI:Non-maskable_interrupts
537.75 ± 27% +35.4% 728.25 ± 9% interrupts.CPU38.PMI:Performance_monitoring_interrupts
512.25 ± 26% +37.2% 703.00 ± 5% interrupts.CPU39.NMI:Non-maskable_interrupts
512.25 ± 26% +37.2% 703.00 ± 5% interrupts.CPU39.PMI:Performance_monitoring_interrupts
2977 ± 11% +32.1% 3934 ± 14% interrupts.CPU52.CAL:Function_call_interrupts
3185 ± 13% +32.5% 4221 ± 13% interrupts.CPU53.CAL:Function_call_interrupts
515.25 ± 31% +37.3% 707.50 ± 5% interrupts.CPU66.NMI:Non-maskable_interrupts
515.25 ± 31% +37.3% 707.50 ± 5% interrupts.CPU66.PMI:Performance_monitoring_interrupts
493.75 ± 34% +42.2% 702.25 ± 4% interrupts.CPU67.NMI:Non-maskable_interrupts
493.75 ± 34% +42.2% 702.25 ± 4% interrupts.CPU67.PMI:Performance_monitoring_interrupts
509.00 ± 29% +37.1% 698.00 ± 5% interrupts.CPU68.NMI:Non-maskable_interrupts
509.00 ± 29% +37.1% 698.00 ± 5% interrupts.CPU68.PMI:Performance_monitoring_interrupts
4967 ± 21% -35.0% 3229 ± 48% interrupts.CPU7.NMI:Non-maskable_interrupts
4967 ± 21% -35.0% 3229 ± 48% interrupts.CPU7.PMI:Performance_monitoring_interrupts
478.25 ± 34% +46.8% 702.00 ± 9% interrupts.CPU72.NMI:Non-maskable_interrupts
478.25 ± 34% +46.8% 702.00 ± 9% interrupts.CPU72.PMI:Performance_monitoring_interrupts
465.00 ± 30% +38.4% 643.75 ± 4% interrupts.CPU74.NMI:Non-maskable_interrupts
465.00 ± 30% +38.4% 643.75 ± 4% interrupts.CPU74.PMI:Performance_monitoring_interrupts
4978 ± 21% -42.2% 2877 interrupts.CPU9.NMI:Non-maskable_interrupts
4978 ± 21% -42.2% 2877 interrupts.CPU9.PMI:Performance_monitoring_interrupts
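The comparison rows above share a regular shape: a value for the first commit, an optional "± N%" standard deviation, the change, the value for the second commit, another optional standard deviation, and the metric name. A minimal sketch, assuming that shape holds for every row, that parses a row so high-variance entries can be separated from stable ones. The regular expression and the names `Row`, `parse_row`, and `is_noisy` are illustrative and not part of lkp-tests; the 20% threshold is an arbitrary choice.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    base: float
    base_stddev: Optional[float]     # percent; None when the report omits it
    change: str                      # kept as text, e.g. "-7.6%" or "+0.2"
    patched: float
    patched_stddev: Optional[float]  # percent; None when the report omits it
    metric: str


# Hypothetical pattern inferred from the rows in this report.
ROW_RE = re.compile(
    r"^\s*(?P<base>[\d.]+(?:e[+-]?\d+)?)"
    r"(?:\s*±\s*(?P<bsd>\d+)%)?"
    r"\s+(?P<change>[+-][\d.]+%?)"
    r"\s+(?P<patched>[\d.]+(?:e[+-]?\d+)?)"
    r"(?:\s*±\s*(?P<psd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str) -> Optional[Row]:
    """Return the parsed row, or None if the line is not a comparison row."""
    m = ROW_RE.match(line)
    if m is None:
        return None

    def pct(text: Optional[str]) -> Optional[float]:
        return float(text) if text is not None else None

    return Row(
        base=float(m["base"]),
        base_stddev=pct(m["bsd"]),
        change=m["change"],
        patched=float(m["patched"]),
        patched_stddev=pct(m["psd"]),
        metric=m["metric"],
    )


def is_noisy(row: Row, limit: float = 20.0) -> bool:
    """True when either side reports a standard deviation above `limit` percent."""
    return any(
        sd is not None and sd > limit
        for sd in (row.base_stddev, row.patched_stddev)
    )


if __name__ == "__main__":
    for text in (
        "140938 -7.6% 130210 will-it-scale.per_process_ops",
        "837580 ±166% -96.5% 28917 ± 3% cpuidle.C1.usage",
    ):
        row = parse_row(text)
        print(row.metric, row.change, "noisy" if is_noisy(row) else "stable")
```

With this filter the headline will-it-scale rows count as stable, while rows such as cpuidle.C1.usage (±166% on the first commit) are flagged as noisy and deserve less weight.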
[ASCII plot: will-it-scale.per_process_ops per sample, y-axis 128000 to 142000. Bisect-good samples stay near 141000; bisect-bad samples fall between about 130000 and 138000.]
[*] bisect-good sample
[O] bisect-bad sample
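The headline figures can be cross-checked against the values in the comparison table. A minimal sketch using numbers copied from this report. Reading perf-stat.overall.path-length as total instructions divided by will-it-scale.workload is an assumption inferred from the data, not something the report states; the helper names are illustrative.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def rel_diff(a: float, b: float) -> float:
    """Relative difference between two values."""
    return abs(a - b) / abs(b)


# Values copied from the comparison table (first commit, second commit).
per_process_ops = (140938, 130210)
workload = (2255019, 2083371)
cpi = (1.67, 1.77)
total_instructions = (9.036e12, 8.638e12)
reported_path_length = (4007121, 4145942)

if __name__ == "__main__":
    # Both throughput metrics should show the same regression.
    print("per_process_ops change: %.1f%%" % pct_change(*per_process_ops))
    print("workload change:        %.1f%%" % pct_change(*workload))

    # IPC is the reciprocal of CPI, so the +6.0% CPI row and the
    # -5.7% IPC row describe the same effect.
    print("ipc from cpi:", [round(1.0 / c, 2) for c in cpi])

    # Assumed definition: instructions executed per benchmark operation.
    for instr, ops, reported in zip(total_instructions, workload, reported_path_length):
        estimate = instr / ops
        print("path-length estimate %.0f vs reported %d (rel. diff %.1e)"
              % (estimate, reported, rel_diff(estimate, reported)))
```

Both throughput rows come out at -7.6%, and the reciprocal of each CPI value rounds to the reported IPC (0.60 and 0.56). The path-length estimates land within about 0.01% of the reported values, which is the precision the four-digit instruction totals allow. Under that reading, the second commit executes roughly 3.5% more instructions per operation in addition to the lower IPC.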
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.5.0-rc3-00050-g471af006a747f" of type "text/plain" (202322 bytes)
View attachment "job-script" of type "text/plain" (7891 bytes)
View attachment "job.yaml" of type "text/plain" (5473 bytes)
View attachment "reproduce" of type "text/plain" (309 bytes)