Message-ID: <20191027051812.GJ29418@shao2-debian>
Date: Sun, 27 Oct 2019 13:18:12 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...nel.org, peterz@...radead.org,
linux-kernel@...r.kernel.org, acme@...nel.org,
mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...hat.com, namhyung@...nel.org, andi@...stfloor.org,
kan.liang@...ux.intel.com, lkp@...ts.01.org
Subject: [perf] 06e0dbcfd3: phoronix-test-suite.mbw.0.mib_s 12.6% improvement
Greetings,
FYI, we noticed a 12.6% improvement of phoronix-test-suite.mbw.0.mib_s due to commit:
commit: 06e0dbcfd33c53ac0046e5a1f93f7b8d71c40fc7 ("[PATCH 2/3] perf: Optimize perf_init_event()")
url: https://github.com/0day-ci/linux/commits/Peter-Zijlstra/Various-optimizations-for-event-creation/20191024-170638
in testcase: phoronix-test-suite
on test machine: 16 threads Intel(R) Xeon(R) CPU X5570 @ 2.93GHz with 48G memory
with the following parameters:
test: mbw-1.0.0
cpufreq_governor: performance
test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available, providing an extensible framework to which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.6/debian-x86_64-phoronix/lkp-nhm-2ep1/mbw-1.0.0/phoronix-test-suite
commit:
c204d011d5 ("perf: Optimize perf_install_in_event()")
06e0dbcfd3 ("perf: Optimize perf_init_event()")
c204d011d597993a 06e0dbcfd33c53ac0046e5a1f93
---------------- ---------------------------
fail:runs %reproduction fail:runs
:4 75% 3:4 dmesg.BUG:scheduling_while_atomic
:4 180% 7:4 perf-profile.children.cycles-pp.error_entry
%stddev %change %stddev
4480 ± 2% +12.6% 5044 phoronix-test-suite.mbw.0.mib_s
4905 +6.5% 5224 ± 2% phoronix-test-suite.mbw.1.mib_s
123842 ± 6% -25.8% 91945 ± 24% numa-meminfo.node1.AnonHugePages
21915 ± 24% -85.6% 3149 ± 9% vmstat.system.in
5945 ± 3% +25.9% 7484 ± 13% slabinfo.filp.num_objs
1205 ± 2% -22.8% 930.75 slabinfo.kmalloc-2k.active_objs
1224 ± 3% -21.9% 956.75 slabinfo.kmalloc-2k.num_objs
5763 -0.8% 5718 proc-vmstat.nr_kernel_stack
3199 -1.6% 3147 proc-vmstat.numa_other
2812 ±158% +874.4% 27405 ± 90% proc-vmstat.numa_pages_migrated
2812 ±158% +874.4% 27405 ± 90% proc-vmstat.pgmigrate_success
1152705 ± 86% -92.4% 87218 ± 47% cpuidle.C1.time
58509 ± 90% -91.8% 4816 ± 52% cpuidle.C1.usage
8002146 ± 95% -98.3% 139101 ± 42% cpuidle.C1E.time
132689 ± 99% -98.9% 1406 ± 36% cpuidle.C1E.usage
7.89e+08 ± 13% -91.1% 70529991 ± 26% cpuidle.C3.time
900688 ± 9% -91.9% 72507 ± 29% cpuidle.C3.usage
48753838 ± 82% +2009.0% 1.028e+09 ± 48% cpuidle.C6.time
48056 ± 76% +171.2% 130330 ± 33% cpuidle.C6.usage
9943122 ± 97% -97.9% 206756 ± 46% cpuidle.POLL.time
18354 ± 58% -59.9% 7364 ± 39% cpuidle.POLL.usage
71992 ± 7% -59.0% 29486 ± 35% interrupts.0:IO-APIC.2-edge.timer
59.25 ± 31% +124.9% 133.25 ± 63% interrupts.37:PCI-MSI.524291-edge.eth0-rx-2
71992 ± 7% -59.0% 29486 ± 35% interrupts.CPU0.0:IO-APIC.2-edge.timer
27089 ± 56% -79.1% 5658 ± 66% interrupts.CPU0.LOC:Local_timer_interrupts
13.50 ±164% +1431.5% 206.75 ± 96% interrupts.CPU0.TLB:TLB_shootdowns
79092 ± 13% -88.4% 9162 ± 40% interrupts.CPU1.LOC:Local_timer_interrupts
641.25 ± 34% +154.9% 1634 ± 20% interrupts.CPU1.RES:Rescheduling_interrupts
75334 ± 14% -85.2% 11173 ± 45% interrupts.CPU10.LOC:Local_timer_interrupts
194.25 ± 86% -77.0% 44.75 ± 91% interrupts.CPU10.RES:Rescheduling_interrupts
75022 ± 16% -88.4% 8685 ± 36% interrupts.CPU11.LOC:Local_timer_interrupts
568.75 ±115% -94.2% 33.00 ± 43% interrupts.CPU11.RES:Rescheduling_interrupts
75043 ± 16% -83.8% 12142 ± 61% interrupts.CPU12.LOC:Local_timer_interrupts
74930 ± 16% -85.3% 10999 ± 33% interrupts.CPU13.LOC:Local_timer_interrupts
1.25 ±131% +12840.0% 161.75 ±169% interrupts.CPU13.TLB:TLB_shootdowns
59.25 ± 31% +124.9% 133.25 ± 63% interrupts.CPU14.37:PCI-MSI.524291-edge.eth0-rx-2
75615 ± 14% -84.7% 11539 ± 27% interrupts.CPU14.LOC:Local_timer_interrupts
501.50 ± 56% -94.6% 27.25 ± 25% interrupts.CPU14.RES:Rescheduling_interrupts
78040 ± 11% -88.0% 9388 ± 45% interrupts.CPU15.LOC:Local_timer_interrupts
74913 ± 16% -86.2% 10367 ± 26% interrupts.CPU2.LOC:Local_timer_interrupts
76688 ± 16% -89.6% 7975 ± 29% interrupts.CPU3.LOC:Local_timer_interrupts
75758 ± 14% -85.6% 10937 ± 19% interrupts.CPU4.LOC:Local_timer_interrupts
75875 ± 17% -87.6% 9380 ± 13% interrupts.CPU5.LOC:Local_timer_interrupts
74896 ± 15% -84.4% 11680 ± 34% interrupts.CPU6.LOC:Local_timer_interrupts
76311 ± 13% -81.9% 13845 ± 36% interrupts.CPU7.LOC:Local_timer_interrupts
75970 ± 16% -84.6% 11704 ± 49% interrupts.CPU8.LOC:Local_timer_interrupts
77046 ± 15% -90.8% 7092 ± 16% interrupts.CPU9.LOC:Local_timer_interrupts
1167625 ± 15% -86.1% 161731 ± 28% interrupts.LOC:Local_timer_interrupts
12892 ± 8% -52.1% 6173 ± 4% softirqs.CPU0.SCHED
31018 ± 4% -51.8% 14957 ± 14% softirqs.CPU0.TIMER
10756 ± 6% -43.7% 6051 ± 9% softirqs.CPU1.SCHED
30437 ± 10% -61.5% 11724 ± 18% softirqs.CPU1.TIMER
9770 ± 6% -66.5% 3268 ± 19% softirqs.CPU10.SCHED
27722 ± 7% -60.0% 11091 ± 19% softirqs.CPU10.TIMER
10417 ± 7% -66.0% 3541 ± 10% softirqs.CPU11.SCHED
30529 ± 11% -65.2% 10636 ± 11% softirqs.CPU11.TIMER
10187 ± 5% -66.2% 3440 ± 15% softirqs.CPU12.SCHED
28111 ± 6% -57.9% 11835 ± 32% softirqs.CPU12.TIMER
9790 ± 8% -67.8% 3157 ± 17% softirqs.CPU13.SCHED
29315 ± 7% -53.0% 13765 ± 22% softirqs.CPU13.TIMER
9583 ± 7% -63.6% 3488 ± 25% softirqs.CPU14.SCHED
29318 ± 10% -49.5% 14793 ± 12% softirqs.CPU14.TIMER
10303 ± 9% -69.8% 3107 ± 8% softirqs.CPU15.SCHED
35959 ± 16% -66.2% 12147 ± 15% softirqs.CPU15.TIMER
10358 ± 4% -57.9% 4358 ± 25% softirqs.CPU2.SCHED
28406 ± 10% -58.0% 11921 ± 23% softirqs.CPU2.TIMER
10459 ± 9% -57.7% 4429 ± 7% softirqs.CPU3.SCHED
27497 ± 11% -59.5% 11143 ± 7% softirqs.CPU3.TIMER
9960 ± 12% -65.8% 3409 ± 19% softirqs.CPU4.SCHED
27949 ± 5% -60.0% 11190 ± 6% softirqs.CPU4.TIMER
10009 ± 9% -62.2% 3778 ± 9% softirqs.CPU5.SCHED
27090 ± 6% -56.4% 11808 ± 7% softirqs.CPU5.TIMER
9224 ± 12% -69.8% 2783 ± 22% softirqs.CPU6.SCHED
26628 ± 13% -65.6% 9157 ± 20% softirqs.CPU6.TIMER
10475 ± 11% -68.9% 3254 ± 5% softirqs.CPU7.SCHED
30529 ± 15% -55.8% 13502 ± 20% softirqs.CPU7.TIMER
10705 ± 13% -69.8% 3230 ± 13% softirqs.CPU8.SCHED
27870 ± 9% -62.1% 10558 ± 22% softirqs.CPU8.TIMER
9213 ± 10% -65.0% 3224 ± 9% softirqs.CPU9.SCHED
29536 ± 10% -67.5% 9606 ± 9% softirqs.CPU9.TIMER
66127 ± 12% -37.1% 41572 ± 13% softirqs.RCU
164112 ± 5% -63.0% 60699 ± 8% softirqs.SCHED
467926 ± 6% -59.4% 189843 ± 10% softirqs.TIMER
16.76 ± 23% -100.0% 0.00 perf-stat.i.MPKI
5.023e+08 ± 19% -100.0% 0.00 perf-stat.i.branch-instructions
2.35 ± 6% -2.3 0.00 perf-stat.i.branch-miss-rate%
18797300 ± 12% -100.0% 0.00 perf-stat.i.branch-misses
30.32 ± 4% -30.3 0.00 perf-stat.i.cache-miss-rate%
17828018 ± 15% -100.0% 0.00 perf-stat.i.cache-misses
28884105 ± 9% -100.0% 0.00 perf-stat.i.cache-references
3.14 ± 5% -100.0% 0.00 perf-stat.i.cpi
6.031e+09 ± 15% -100.0% 0.00 perf-stat.i.cpu-cycles
5052 ± 26% -100.0% 0.00 perf-stat.i.cycles-between-cache-misses
0.09 ± 21% -0.1 0.00 perf-stat.i.dTLB-load-miss-rate%
747496 ± 4% -100.0% 0.00 perf-stat.i.dTLB-load-misses
9.926e+08 ± 7% -100.0% 0.00 perf-stat.i.dTLB-loads
0.18 ± 4% -0.2 0.00 perf-stat.i.dTLB-store-miss-rate%
713615 ± 3% -100.0% 0.00 perf-stat.i.dTLB-store-misses
8.47e+08 ± 3% -100.0% 0.00 perf-stat.i.dTLB-stores
0.02 ± 13% -0.0 0.00 perf-stat.i.iTLB-load-miss-rate%
291714 ± 14% -100.0% 0.00 perf-stat.i.iTLB-load-misses
2.416e+09 ± 15% -100.0% 0.00 perf-stat.i.iTLB-loads
2.39e+09 ± 15% -100.0% 0.00 perf-stat.i.instructions
9722 ± 2% -100.0% 0.00 perf-stat.i.instructions-per-iTLB-miss
0.37 ± 4% -100.0% 0.00 perf-stat.i.ipc
12.36 ± 19% -100.0% 0.00 perf-stat.overall.MPKI
3.81 ± 12% -3.8 0.00 perf-stat.overall.branch-miss-rate%
61.30 ± 5% -61.3 0.00 perf-stat.overall.cache-miss-rate%
2.53 -100.0% 0.00 perf-stat.overall.cpi
350.57 ± 22% -100.0% 0.00 perf-stat.overall.cycles-between-cache-misses
0.08 ± 7% -0.1 0.00 perf-stat.overall.dTLB-load-miss-rate%
0.08 ± 5% -0.1 0.00 perf-stat.overall.dTLB-store-miss-rate%
0.01 ± 2% -0.0 0.00 perf-stat.overall.iTLB-load-miss-rate%
8204 ± 2% -100.0% 0.00 perf-stat.overall.instructions-per-iTLB-miss
0.40 -100.0% 0.00 perf-stat.overall.ipc
4.932e+08 ± 19% -100.0% 0.00 perf-stat.ps.branch-instructions
18451216 ± 12% -100.0% 0.00 perf-stat.ps.branch-misses
17425026 ± 15% -100.0% 0.00 perf-stat.ps.cache-misses
28269421 ± 9% -100.0% 0.00 perf-stat.ps.cache-references
5.942e+09 ± 15% -100.0% 0.00 perf-stat.ps.cpu-cycles
732958 ± 3% -100.0% 0.00 perf-stat.ps.dTLB-load-misses
9.783e+08 ± 7% -100.0% 0.00 perf-stat.ps.dTLB-loads
700749 ± 2% -100.0% 0.00 perf-stat.ps.dTLB-store-misses
8.398e+08 ± 3% -100.0% 0.00 perf-stat.ps.dTLB-stores
285931 ± 14% -100.0% 0.00 perf-stat.ps.iTLB-load-misses
2.375e+09 ± 14% -100.0% 0.00 perf-stat.ps.iTLB-loads
2.349e+09 ± 15% -100.0% 0.00 perf-stat.ps.instructions
1.334e+11 ± 7% -100.0% 0.00 perf-stat.total.instructions
17.55 ±116% -13.5 4.06 ±173% perf-profile.calltrace.cycles-pp.smp_call_function_single.event_function_call.perf_remove_from_context.perf_event_release_kernel.perf_release
16.45 ± 63% -11.8 4.62 ±173% perf-profile.calltrace.cycles-pp.task_work_run.do_exit.do_group_exit.get_signal.do_signal
16.45 ± 63% -11.8 4.62 ±173% perf-profile.calltrace.cycles-pp.__fput.task_work_run.do_exit.do_group_exit.get_signal
16.45 ± 63% -11.8 4.62 ±173% perf-profile.calltrace.cycles-pp.perf_release.__fput.task_work_run.do_exit.do_group_exit
16.45 ± 63% -11.8 4.62 ±173% perf-profile.calltrace.cycles-pp.perf_event_release_kernel.perf_release.__fput.task_work_run.do_exit
12.50 ±101% -8.4 4.06 ±173% perf-profile.calltrace.cycles-pp.event_function_call.perf_remove_from_context.perf_event_release_kernel.perf_release.__fput
12.50 ±101% -8.4 4.06 ±173% perf-profile.calltrace.cycles-pp.perf_remove_from_context.perf_event_release_kernel.perf_release.__fput.task_work_run
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.drm_client_buffer_vmap.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.drm_gem_vmap.drm_client_buffer_vmap.drm_fb_helper_dirty_work.process_one_work.worker_thread
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.drm_gem_vram_object_vmap.drm_gem_vmap.drm_client_buffer_vmap.drm_fb_helper_dirty_work.process_one_work
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.drm_gem_vram_kmap.drm_gem_vram_object_vmap.drm_gem_vmap.drm_client_buffer_vmap.drm_fb_helper_dirty_work
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.ttm_bo_kmap.drm_gem_vram_kmap.drm_gem_vram_object_vmap.drm_gem_vmap.drm_client_buffer_vmap
7.63 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.__ioremap_caller.ttm_bo_kmap.drm_gem_vram_kmap.drm_gem_vram_object_vmap.drm_gem_vmap
7.62 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.on_each_cpu.flush_tlb_kernel_range.pmd_free_pte_page.ioremap_page_range.__ioremap_caller
7.62 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.smp_call_function_many.on_each_cpu.flush_tlb_kernel_range.pmd_free_pte_page.ioremap_page_range
7.62 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.ioremap_page_range.__ioremap_caller.ttm_bo_kmap.drm_gem_vram_kmap.drm_gem_vram_object_vmap
7.62 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.pmd_free_pte_page.ioremap_page_range.__ioremap_caller.ttm_bo_kmap.drm_gem_vram_kmap
7.62 ± 73% -7.6 0.00 perf-profile.calltrace.cycles-pp.flush_tlb_kernel_range.pmd_free_pte_page.ioremap_page_range.__ioremap_caller.ttm_bo_kmap
8.46 ± 74% -4.9 3.56 ±173% perf-profile.calltrace.cycles-pp.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread.ret_from_fork
4.87 ± 57% -1.5 3.38 ±173% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__tick_broadcast_oneshot_control.intel_idle.cpuidle_enter_state
4.88 ± 57% -1.5 3.40 ±173% perf-profile.calltrace.cycles-pp._raw_spin_lock.__tick_broadcast_oneshot_control.intel_idle.cpuidle_enter_state.cpuidle_enter
4.97 ± 57% -1.5 3.49 ±173% perf-profile.calltrace.cycles-pp.__tick_broadcast_oneshot_control.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle
26.94 ± 51% -14.7 12.21 ±110% perf-profile.children.cycles-pp.smp_call_function_single
16.45 ± 63% -11.8 4.62 ±173% perf-profile.children.cycles-pp.perf_release
16.45 ± 63% -11.8 4.62 ±173% perf-profile.children.cycles-pp.perf_event_release_kernel
16.46 ± 63% -11.8 4.66 ±172% perf-profile.children.cycles-pp.__fput
16.46 ± 63% -11.8 4.68 ±171% perf-profile.children.cycles-pp.task_work_run
9.22 ± 41% -8.6 0.60 ±161% perf-profile.children.cycles-pp.on_each_cpu
9.22 ± 41% -8.6 0.60 ±160% perf-profile.children.cycles-pp.smp_call_function_many
12.50 ±101% -8.4 4.06 ±173% perf-profile.children.cycles-pp.perf_remove_from_context
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.drm_client_buffer_vmap
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.drm_gem_vmap
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.drm_gem_vram_object_vmap
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.drm_gem_vram_kmap
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.ttm_bo_kmap
7.63 ± 73% -7.6 0.04 ±173% perf-profile.children.cycles-pp.__ioremap_caller
7.62 ± 73% -7.6 0.03 ±173% perf-profile.children.cycles-pp.pmd_free_pte_page
7.62 ± 73% -7.6 0.03 ±173% perf-profile.children.cycles-pp.flush_tlb_kernel_range
7.62 ± 73% -7.6 0.03 ±173% perf-profile.children.cycles-pp.ioremap_page_range
8.46 ± 74% -4.9 3.56 ±173% perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
3.50 ± 71% -2.8 0.71 ±173% perf-profile.children.cycles-pp.irq_work_run
3.50 ± 71% -2.8 0.71 ±173% perf-profile.children.cycles-pp.printk
3.50 ± 71% -2.8 0.74 ±173% perf-profile.children.cycles-pp.irq_work_run_list
3.39 ± 75% -2.7 0.71 ±173% perf-profile.children.cycles-pp.irq_work_interrupt
3.39 ± 75% -2.7 0.71 ±173% perf-profile.children.cycles-pp.smp_irq_work_interrupt
4.90 ± 57% -1.5 3.39 ±173% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
4.99 ± 57% -1.5 3.52 ±173% perf-profile.children.cycles-pp.__tick_broadcast_oneshot_control
0.02 ±173% +7.0 6.99 ±114% perf-profile.children.cycles-pp.do_filp_open
0.02 ±173% +7.0 6.99 ±114% perf-profile.children.cycles-pp.path_openat
0.00 +7.0 6.98 ±114% perf-profile.children.cycles-pp.do_sys_open
9.18 ± 41% -8.6 0.60 ±161% perf-profile.self.cycles-pp.smp_call_function_many
3.50 ± 71% -3.5 0.00 perf-profile.self.cycles-pp.vprintk_emit
4.90 ± 57% -1.5 3.39 ±173% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
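For anyone checking the %change column by hand, here is a minimal sketch of how the figures above appear to be derived. It assumes the left and right values are the per-commit averages, that ordinary metrics report a relative change, and that metrics which are already percentages (names ending in `%`) report a plain difference. The function names `pct_change` and `point_change` are illustrative only and are not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Plain difference, for metrics that are already percentages."""
    return new - base


# Values copied from the comparison table above.
print(f"mbw.0.mib_s:      {pct_change(4480, 5044):+.1f}%")
print(f"mbw.1.mib_s:      {pct_change(4905, 5224):+.1f}%")
print(f"vmstat.system.in: {pct_change(21915, 3149):+.1f}%")
print(f"cache-miss-rate%: {point_change(30.32, 0.00):+.1f}")
```

The printed values can be compared against the corresponding rows; small mismatches in the last digit are possible because the table shows rounded averages.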
[Plot: phoronix-test-suite.mbw.0.mib_s per sample, y-axis from 0 to 6000; the bisect-bad samples (O) lie at about 5000]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.4.0-rc3-00149-g06e0dbcfd33c5" of type "text/plain" (200562 bytes)
View attachment "job-script" of type "text/plain" (6984 bytes)
View attachment "job.yaml" of type "text/plain" (4577 bytes)
View attachment "reproduce" of type "text/plain" (254 bytes)