Message-ID: <20200305062138.GI5972@shao2-debian>
Date: Thu, 5 Mar 2020 14:21:38 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Deepa Dinamani <deepa.kernel@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [y2038] 412c53a680: will-it-scale.per_process_ops 11.7% improvement
Greetings,
FYI, we noticed an 11.7% improvement in will-it-scale.per_process_ops due to commit:
commit: 412c53a680a97cb1ae2c0ab60230e193bee86387 ("y2038: remove unused time32 interfaces")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 72 threads Intel(R) Xeon(R) Gold 6139 CPU @ 2.30GHz with 128G memory
with the following parameters:
nr_task: 100%
mode: process
test: mmap2
cpufreq_governor: performance
ucode: 0x2000065
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of the test in order to show any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
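For orientation, the kind of loop that a test such as mmap2 times can be pictured roughly as below. This is only an illustrative Python sketch: the real testcase is C code in the will-it-scale repository, and the region size, the use of an anonymous mapping, and the function name `map_unmap_loop` are assumptions made here rather than details taken from that source. The perf profile in this report shows nearly all cycles under `mmap64` and `munmap`, which is the pattern the sketch mimics.

```python
import mmap

# Assumed region size for illustration; the real test may map a different amount.
REGION_BYTES = 1 << 20


def map_unmap_loop(iterations: int) -> int:
    """Map and immediately unmap a region `iterations` times; return the op count."""
    ops = 0
    for _ in range(iterations):
        region = mmap.mmap(-1, REGION_BYTES)  # anonymous mapping (an mmap call)
        region.close()                        # releases the mapping (a munmap call)
        ops += 1
    return ops


if __name__ == "__main__":
    # A benchmark harness would run one such loop per process and count ops per unit time.
    print(map_unmap_loop(1000))
```

In a scaling run, one copy of such a loop would execute in each of the parallel processes, and the per-process operation counts would be compared as the number of copies grows.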
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------+
| testcase: change | unixbench: unixbench.score 3.8% improvement |
| test machine | 104 threads Skylake with 192G memory |
| test parameters | cpufreq_governor=performance |
| | nr_task=30% |
| | runtime=300s |
| | test=context1 |
| | ucode=0x2000065 |
+------------------+---------------------------------------------+
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached to this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-skl-2sp7/mmap2/will-it-scale/0x2000065
commit:
595abbaff5 ("y2038: remove ktime to/from timespec/timeval conversion")
412c53a680 ("y2038: remove unused time32 interfaces")
595abbaff5db1214 412c53a680a97cb1ae2c0ab6023
---------------- ---------------------------
value ± %stddev    %change    value ± %stddev    metric
16079 +11.7% 17954 will-it-scale.per_process_ops
1157721 +11.7% 1292722 will-it-scale.workload
535.67 ±141% +201.4% 1614 meminfo.Mlocked
1.01 +0.1 1.12 mpstat.cpu.all.usr%
61269544 ± 54% +137.2% 1.453e+08 ± 48% cpuidle.C1E.time
143915 ± 36% +133.1% 335431 ± 57% cpuidle.C1E.usage
2261 ± 2% -15.1% 1920 ± 5% slabinfo.fsnotify_mark_connector.active_objs
2261 ± 2% -15.1% 1920 ± 5% slabinfo.fsnotify_mark_connector.num_objs
1649 ± 8% -12.5% 1443 ± 2% slabinfo.skbuff_ext_cache.active_objs
1649 ± 8% -12.5% 1443 ± 2% slabinfo.skbuff_ext_cache.num_objs
14258 ± 12% +25.5% 17895 ± 13% numa-meminfo.node0.Mapped
4132 ± 15% +20.3% 4970 numa-meminfo.node0.PageTables
5989 ± 10% -10.3% 5374 ± 4% numa-meminfo.node1.KernelStack
17133 ± 11% -18.9% 13895 ± 17% numa-meminfo.node1.Mapped
4799 ± 12% -18.1% 3928 numa-meminfo.node1.PageTables
7936 +1.4% 8050 proc-vmstat.nr_mapped
133.67 ±141% +201.5% 403.00 proc-vmstat.nr_mlock
18121 ± 2% +5.7% 19153 ± 3% proc-vmstat.nr_shmem
743293 +1.2% 752255 proc-vmstat.pgalloc_normal
700933 +1.4% 710819 proc-vmstat.pgfree
3601 ± 12% +27.9% 4605 ± 12% numa-vmstat.node0.nr_mapped
1033 ± 15% +20.3% 1243 numa-vmstat.node0.nr_page_table_pages
156534 ± 6% -33.2% 104509 ± 64% numa-vmstat.node0.numa_other
5989 ± 10% -10.3% 5375 ± 4% numa-vmstat.node1.nr_kernel_stack
56.00 ±141% +278.0% 211.67 ± 13% numa-vmstat.node1.nr_mlock
1199 ± 12% -18.1% 982.33 numa-vmstat.node1.nr_page_table_pages
5818 ± 35% +43.9% 8370 ± 6% interrupts.CPU26.NMI:Non-maskable_interrupts
5818 ± 35% +43.9% 8370 ± 6% interrupts.CPU26.PMI:Performance_monitoring_interrupts
414.67 ± 23% -24.8% 311.67 interrupts.CPU52.RES:Rescheduling_interrupts
5819 ± 35% +43.8% 8366 ± 6% interrupts.CPU56.NMI:Non-maskable_interrupts
5819 ± 35% +43.8% 8366 ± 6% interrupts.CPU56.PMI:Performance_monitoring_interrupts
5818 ± 35% +43.8% 8364 ± 6% interrupts.CPU59.NMI:Non-maskable_interrupts
5818 ± 35% +43.8% 8364 ± 6% interrupts.CPU59.PMI:Performance_monitoring_interrupts
471.33 ± 16% -26.0% 349.00 ± 6% interrupts.CPU9.RES:Rescheduling_interrupts
127.33 ± 3% +11.8% 142.33 ± 2% interrupts.IWI:IRQ_work_interrupts
10312 ± 92% -67.7% 3333 ± 6% sched_debug.cfs_rq:/.load.stddev
27.16 ± 7% -16.2% 22.76 ± 8% sched_debug.cfs_rq:/.load_avg.avg
47.12 ± 6% -17.5% 38.89 ± 10% sched_debug.cfs_rq:/.load_avg.stddev
8.99 ± 22% -46.6% 4.80 ± 41% sched_debug.cfs_rq:/.removed.load_avg.avg
37.49 ± 10% -27.4% 27.22 ± 21% sched_debug.cfs_rq:/.removed.load_avg.stddev
413.16 ± 23% -46.4% 221.38 ± 42% sched_debug.cfs_rq:/.removed.runnable_sum.avg
1722 ± 10% -27.2% 1253 ± 22% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
2.71 ± 22% -41.0% 1.60 ± 16% sched_debug.cfs_rq:/.removed.util_avg.avg
1480 ± 4% +11.4% 1648 sched_debug.cpu.curr->pid.min
0.19 ± 2% -13.4% 0.17 ± 7% sched_debug.cpu.nr_running.stddev
32567 ± 5% +13.4% 36931 ± 4% sched_debug.cpu.nr_switches.max
62048 ± 5% +21.0% 75097 ± 10% softirqs.CPU10.RCU
60821 +9.1% 66347 softirqs.CPU17.RCU
59451 ± 3% +19.2% 70891 ± 2% softirqs.CPU22.RCU
59548 +7.6% 64046 ± 2% softirqs.CPU34.RCU
59706 +16.4% 69478 ± 6% softirqs.CPU41.RCU
61173 ± 4% +22.1% 74662 ± 12% softirqs.CPU46.RCU
59827 +21.6% 72779 ± 14% softirqs.CPU5.RCU
60645 +11.0% 67300 ± 5% softirqs.CPU53.RCU
58533 ± 2% +9.0% 63779 ± 2% softirqs.CPU57.RCU
60026 ± 2% +15.7% 69444 ± 5% softirqs.CPU58.RCU
61127 ± 2% +11.4% 68125 ± 4% softirqs.CPU63.RCU
4.413e+09 +7.3% 4.733e+09 perf-stat.i.branch-instructions
19144349 +5.4% 20170738 perf-stat.i.branch-misses
41.28 -0.5 40.80 perf-stat.i.cache-miss-rate%
22256771 +3.9% 23133318 ± 2% perf-stat.i.cache-misses
53886473 +5.1% 56658720 ± 2% perf-stat.i.cache-references
11.77 -7.4% 10.90 perf-stat.i.cpi
9929 -4.3% 9499 ± 2% perf-stat.i.cycles-between-cache-misses
0.05 +0.0 0.05 ± 3% perf-stat.i.dTLB-load-miss-rate%
2370068 +11.2% 2635764 perf-stat.i.dTLB-load-misses
5.229e+09 +7.8% 5.638e+09 perf-stat.i.dTLB-loads
9837 ± 3% +11.8% 10995 ± 6% perf-stat.i.dTLB-store-misses
1.718e+09 +10.8% 1.903e+09 perf-stat.i.dTLB-stores
94.23 -9.1 85.12 perf-stat.i.iTLB-load-miss-rate%
2419173 -33.5% 1607709 perf-stat.i.iTLB-load-misses
146272 ± 3% +94.5% 284553 ± 8% perf-stat.i.iTLB-loads
1.88e+10 +7.4% 2.019e+10 perf-stat.i.instructions
7824 +61.4% 12627 perf-stat.i.instructions-per-iTLB-miss
0.09 +8.0% 0.09 perf-stat.i.ipc
5549135 +6.7% 5919869 perf-stat.i.node-load-misses
4803604 +6.0% 5091055 perf-stat.i.node-store-misses
0.43 -0.0 0.43 perf-stat.overall.branch-miss-rate%
41.30 -0.5 40.82 perf-stat.overall.cache-miss-rate%
11.76 -7.2% 10.91 perf-stat.overall.cpi
9937 -4.1% 9530 ± 2% perf-stat.overall.cycles-between-cache-misses
0.05 +0.0 0.05 perf-stat.overall.dTLB-load-miss-rate%
94.30 -9.3 84.96 perf-stat.overall.iTLB-load-miss-rate%
7770 +61.7% 12560 perf-stat.overall.instructions-per-iTLB-miss
0.09 +7.8% 0.09 perf-stat.overall.ipc
4887274 -3.3% 4723803 perf-stat.overall.path-length
4.398e+09 +7.3% 4.717e+09 perf-stat.ps.branch-instructions
19086955 +5.4% 20108226 perf-stat.ps.branch-misses
22182689 +3.9% 23057159 ± 2% perf-stat.ps.cache-misses
53712718 +5.2% 56486162 ± 2% perf-stat.ps.cache-references
2362866 +11.3% 2630144 perf-stat.ps.dTLB-load-misses
5.211e+09 +7.8% 5.62e+09 perf-stat.ps.dTLB-loads
9852 ± 3% +12.8% 11114 ± 6% perf-stat.ps.dTLB-store-misses
1.712e+09 +10.8% 1.897e+09 perf-stat.ps.dTLB-stores
2411038 -33.5% 1602293 perf-stat.ps.iTLB-load-misses
145843 ± 3% +94.6% 283843 ± 8% perf-stat.ps.iTLB-loads
1.873e+10 +7.4% 2.013e+10 perf-stat.ps.instructions
5530540 +6.7% 5900087 perf-stat.ps.node-load-misses
4787463 +6.0% 5073936 perf-stat.ps.node-store-misses
5.658e+12 +7.9% 6.107e+12 perf-stat.total.instructions
47.65 -0.8 46.85 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.__vm_enough_memory.mmap_region.do_mmap.vm_mmap_pgoff
47.66 -0.8 46.85 perf-profile.calltrace.cycles-pp.__vm_enough_memory.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
49.09 -0.8 48.31 perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
48.84 -0.8 48.06 perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
47.18 -0.8 46.40 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.percpu_counter_add_batch.__vm_enough_memory.mmap_region.do_mmap
49.24 -0.8 48.47 perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.mmap64
49.31 -0.8 48.55 perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.mmap64
46.98 -0.8 46.23 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.percpu_counter_add_batch.__vm_enough_memory.mmap_region
49.62 -0.7 48.89 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.mmap64
49.63 -0.7 48.90 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.mmap64
50.01 -0.7 49.31 perf-profile.calltrace.cycles-pp.mmap64
1.17 +0.1 1.28 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
1.27 +0.1 1.40 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
1.69 +0.2 1.84 perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
46.89 +0.5 47.36 perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
46.41 +0.5 46.89 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.percpu_counter_add_batch.__do_munmap.__vm_munmap.__x64_sys_munmap
46.21 +0.5 46.70 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.percpu_counter_add_batch.__do_munmap.__vm_munmap
0.00 +0.5 0.53 perf-profile.calltrace.cycles-pp.___might_sleep.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
49.26 +0.6 49.89 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
49.29 +0.6 49.92 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
48.83 +0.6 49.48 perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
49.59 +0.7 50.26 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
49.60 +0.7 50.27 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
49.94 +0.7 50.64 perf-profile.calltrace.cycles-pp.munmap
47.66 -0.8 46.85 perf-profile.children.cycles-pp.__vm_enough_memory
49.10 -0.8 48.32 perf-profile.children.cycles-pp.do_mmap
48.84 -0.8 48.07 perf-profile.children.cycles-pp.mmap_region
49.24 -0.8 48.48 perf-profile.children.cycles-pp.vm_mmap_pgoff
49.31 -0.8 48.55 perf-profile.children.cycles-pp.ksys_mmap_pgoff
50.03 -0.7 49.33 perf-profile.children.cycles-pp.mmap64
94.55 -0.3 94.21 perf-profile.children.cycles-pp.percpu_counter_add_batch
93.61 -0.3 93.31 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
93.19 -0.3 92.92 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
99.23 -0.1 99.18 perf-profile.children.cycles-pp.do_syscall_64
99.26 -0.0 99.21 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.46 ± 2% -0.0 0.43 perf-profile.children.cycles-pp.vm_area_alloc
0.43 ± 3% -0.0 0.40 perf-profile.children.cycles-pp.kmem_cache_alloc
0.31 ± 2% -0.0 0.29 perf-profile.children.cycles-pp.apic_timer_interrupt
0.06 +0.0 0.07 perf-profile.children.cycles-pp.down_write_killable
0.08 ± 6% +0.0 0.09 perf-profile.children.cycles-pp.prepend_path
0.06 +0.0 0.07 ± 6% perf-profile.children.cycles-pp.shmem_mmap
0.10 ± 4% +0.0 0.12 ± 3% perf-profile.children.cycles-pp.perf_iterate_sb
0.05 +0.0 0.07 perf-profile.children.cycles-pp.unlink_file_vma
0.33 +0.0 0.35 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.40 ± 4% +0.0 0.43 perf-profile.children.cycles-pp.perf_event_mmap
0.08 ± 5% +0.0 0.11 perf-profile.children.cycles-pp.free_pgtables
0.34 +0.0 0.38 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.00 +0.1 0.05 perf-profile.children.cycles-pp.up_write
0.00 +0.1 0.05 perf-profile.children.cycles-pp.touch_atime
0.53 +0.1 0.60 perf-profile.children.cycles-pp.___might_sleep
1.25 +0.1 1.35 perf-profile.children.cycles-pp.unmap_page_range
1.28 +0.1 1.40 perf-profile.children.cycles-pp.unmap_vmas
1.69 +0.2 1.85 perf-profile.children.cycles-pp.unmap_region
49.29 +0.6 49.93 perf-profile.children.cycles-pp.__x64_sys_munmap
49.26 +0.6 49.90 perf-profile.children.cycles-pp.__vm_munmap
48.84 +0.6 49.48 perf-profile.children.cycles-pp.__do_munmap
49.97 +0.7 50.67 perf-profile.children.cycles-pp.munmap
93.19 -0.3 92.92 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.42 ± 2% -0.0 0.38 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.40 ± 3% -0.0 0.38 perf-profile.self.cycles-pp.kmem_cache_alloc
0.12 -0.0 0.11 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.33 +0.0 0.35 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.30 +0.0 0.33 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.61 ± 2% +0.0 0.66 perf-profile.self.cycles-pp.unmap_page_range
0.58 +0.1 0.64 perf-profile.self.cycles-pp.do_syscall_64
0.50 +0.1 0.56 perf-profile.self.cycles-pp.___might_sleep
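The %change column in the table above can be checked by hand. The sketch below assumes, based on the numbers themselves, that ordinary metrics use a relative change against the base commit while metrics that are already percentages (names ending in `%`, and the perf-profile rows) use a plain difference in percentage points. The helper names are hypothetical, and 72 is the assumed process count (72 hardware threads with nr_task at 100%).

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference, used for metrics that are already percentages."""
    return new - base


# Values copied from the comparison table above.
ops = pct_change(16079, 17954)            # will-it-scale.per_process_ops
workload = pct_change(1157721, 1292722)   # will-it-scale.workload
cpi = pct_change(11.77, 10.90)            # perf-stat.i.cpi
miss_rate = point_change(41.28, 40.80)    # perf-stat.i.cache-miss-rate%

# Assumption: per_process_ops is the workload divided by the process count.
NR_TASKS = 72
per_process_base = round(1157721 / NR_TASKS)
per_process_new = round(1292722 / NR_TASKS)

if __name__ == "__main__":
    print(f"{ops:+.1f}% {workload:+.1f}% {cpi:+.1f}% {miss_rate:+.1f}")
    print(per_process_base, per_process_new)
```

Under these assumptions the recomputed values agree with the reported +11.7%, -7.4%, and -0.5 entries, and the workload divided by 72 reproduces both per_process_ops figures.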
[Plot: will-it-scale.per_process_ops, y-axis 0 to 20000; one series near 16000, [O] samples near 18000]
[Plot: will-it-scale.workload, y-axis 0 to 1.4e+06; one series near 1.2e+06, [O] samples between 1.2e+06 and 1.4e+06]
[*] bisect-good sample
[O] bisect-bad sample
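Several perf-stat rows in the will-it-scale table appear to be derived from other rows. The sketch below recomputes three of them from the reported `perf-stat.ps.*` and `perf-stat.overall.*` values. The formulas are inferred from how well the numbers line up, not taken from the lkp-tests source, so treat them as assumptions; small mismatches are expected because the inputs are already rounded.

```python
def itlb_miss_rate_pct(misses: float, loads: float) -> float:
    """Assumed definition: misses / (misses + loads), in percent."""
    return misses / (misses + loads) * 100.0


def instructions_per_miss(instructions: float, misses: float) -> float:
    """Instructions retired per iTLB load miss."""
    return instructions / misses


def ipc_from_cpi(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


# (base commit, patched commit) pairs copied from the table.
rate_base = itlb_miss_rate_pct(2411038, 145843)      # reported 94.30
rate_new = itlb_miss_rate_pct(1602293, 283843)       # reported 84.96
ipm_base = instructions_per_miss(1.873e10, 2411038)  # reported 7770
ipm_new = instructions_per_miss(2.013e10, 1602293)   # reported 12560
ipc_gain = (ipc_from_cpi(10.91) / ipc_from_cpi(11.76) - 1.0) * 100.0  # reported +7.8%

if __name__ == "__main__":
    print(f"{rate_base:.2f} {rate_new:.2f}")
    print(f"{ipm_base:.0f} {ipm_new:.0f}")
    print(f"{ipc_gain:+.1f}%")
```

Read this way, the drop in iTLB load misses (-33.5%) combined with slightly more instructions (+7.4%) accounts for the large jump in instructions-per-iTLB-miss, and the IPC gain is simply the CPI reduction seen from the other side.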
***************************************************************************************************
lkp-skl-fpga01: 104 threads Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/30%/debian-x86_64-20191114.cgz/300s/lkp-skl-fpga01/context1/unixbench/0x2000065
commit:
595abbaff5 ("y2038: remove ktime to/from timespec/timeval conversion")
412c53a680 ("y2038: remove unused time32 interfaces")
595abbaff5db1214 412c53a680a97cb1ae2c0ab6023
---------------- ---------------------------
fail:runs    %reproduction    fail:runs    event
1:4 -25% :4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
value ± %stddev    %change    value ± %stddev    metric
3093 +3.8% 3209 unixbench.score
471.47 +36.6% 644.15 ± 40% unixbench.time.user_time
3.659e+08 +3.9% 3.801e+08 unixbench.time.voluntary_context_switches
4.838e+08 +3.8% 5.02e+08 unixbench.workload
3715604 +3.9% 3859617 vmstat.system.cs
1493573 -1.1% 1477832 proc-vmstat.numa_local
1620946 -1.3% 1599963 proc-vmstat.pgalloc_normal
1586478 -1.6% 1561368 proc-vmstat.pgfree
4505528 ± 8% -11.7% 3977138 ± 9% sched_debug.cfs_rq:/.MIN_vruntime.max
47683 ± 7% -11.2% 42356 ± 7% sched_debug.cfs_rq:/.exec_clock.min
4505528 ± 8% -11.7% 3977138 ± 9% sched_debug.cfs_rq:/.max_vruntime.max
2750853 ± 7% -11.4% 2435892 ± 7% sched_debug.cfs_rq:/.min_vruntime.min
0.31 ± 13% -33.1% 0.21 ± 25% sched_debug.cfs_rq:/.nr_spread_over.avg
3.18 ± 18% -28.1% 2.29 ± 22% sched_debug.cfs_rq:/.nr_spread_over.max
0.67 ± 11% -27.6% 0.48 ± 22% sched_debug.cfs_rq:/.nr_spread_over.stddev
133566 ± 48% +65.9% 221588 ± 31% sched_debug.cfs_rq:/.runnable_weight.max
899.95 ±173% +824.1% 8316 ±110% sched_debug.cpu.max_idle_balance_cost.stddev
317.01 ± 5% -9.2% 287.99 ± 7% sched_debug.cpu.ttwu_local.min
6.175e+09 +3.2% 6.375e+09 perf-stat.i.branch-instructions
6.846e+08 +2.4% 7.011e+08 perf-stat.i.cache-references
3734905 +3.9% 3879483 perf-stat.i.context-switches
7.582e+09 +3.4% 7.839e+09 perf-stat.i.dTLB-loads
4.468e+09 +3.6% 4.63e+09 perf-stat.i.dTLB-stores
4308069 +3.9% 4477501 perf-stat.i.iTLB-load-misses
28346961 +4.7% 29683898 perf-stat.i.iTLB-loads
2.823e+10 +3.3% 2.916e+10 perf-stat.i.instructions
3745153 +21.4% 4545801 perf-stat.i.node-load-misses
33138 +33.5% 44236 perf-stat.i.node-loads
3740245 +3.3% 3862099 perf-stat.i.node-store-misses
3.02 -2.9% 2.94 perf-stat.overall.cpi
0.33 +3.0% 0.34 perf-stat.overall.ipc
6.157e+09 +3.3% 6.357e+09 perf-stat.ps.branch-instructions
6.826e+08 +2.4% 6.991e+08 perf-stat.ps.cache-references
3723897 +3.9% 3868660 perf-stat.ps.context-switches
7.56e+09 +3.4% 7.817e+09 perf-stat.ps.dTLB-loads
4.454e+09 +3.7% 4.617e+09 perf-stat.ps.dTLB-stores
4295642 +3.9% 4465151 perf-stat.ps.iTLB-load-misses
28265235 +4.7% 29603109 perf-stat.ps.iTLB-loads
2.814e+10 +3.3% 2.908e+10 perf-stat.ps.instructions
3734083 +21.4% 4532943 perf-stat.ps.node-load-misses
33087 +33.4% 44149 perf-stat.ps.node-loads
3729103 +3.3% 3851108 perf-stat.ps.node-store-misses
1.103e+13 +3.3% 1.14e+13 perf-stat.total.instructions
372.25 ± 48% +114.4% 798.00 ± 28% interrupts.41:PCI-MSI.67633156-edge.eth0-TxRx-3
2920 ± 6% -12.3% 2561 ± 10% interrupts.CPU1.NMI:Non-maskable_interrupts
2920 ± 6% -12.3% 2561 ± 10% interrupts.CPU1.PMI:Performance_monitoring_interrupts
2691 ± 6% +18.2% 3179 ± 5% interrupts.CPU103.NMI:Non-maskable_interrupts
2691 ± 6% +18.2% 3179 ± 5% interrupts.CPU103.PMI:Performance_monitoring_interrupts
33.25 ± 37% +542.9% 213.75 ± 64% interrupts.CPU14.TLB:TLB_shootdowns
71.50 ±115% +202.8% 216.50 ± 83% interrupts.CPU16.TLB:TLB_shootdowns
3048 ± 13% -26.0% 2256 ± 14% interrupts.CPU18.NMI:Non-maskable_interrupts
3048 ± 13% -26.0% 2256 ± 14% interrupts.CPU18.PMI:Performance_monitoring_interrupts
2743 ± 4% -14.8% 2337 ± 18% interrupts.CPU19.NMI:Non-maskable_interrupts
2743 ± 4% -14.8% 2337 ± 18% interrupts.CPU19.PMI:Performance_monitoring_interrupts
26.00 ± 61% +1373.1% 383.00 ± 50% interrupts.CPU23.TLB:TLB_shootdowns
229.00 ± 45% +87.3% 429.00 ± 29% interrupts.CPU29.TLB:TLB_shootdowns
372.25 ± 48% +114.4% 798.00 ± 28% interrupts.CPU33.41:PCI-MSI.67633156-edge.eth0-TxRx-3
28.75 ± 50% +360.9% 132.50 ± 94% interrupts.CPU39.TLB:TLB_shootdowns
41.00 ±101% +600.0% 287.00 ± 37% interrupts.CPU49.TLB:TLB_shootdowns
39.75 ± 79% +140.9% 95.75 ± 13% interrupts.CPU50.TLB:TLB_shootdowns
3103 ± 9% -16.6% 2589 ± 10% interrupts.CPU53.NMI:Non-maskable_interrupts
3103 ± 9% -16.6% 2589 ± 10% interrupts.CPU53.PMI:Performance_monitoring_interrupts
163.75 ± 58% -58.2% 68.50 ±100% interrupts.CPU59.TLB:TLB_shootdowns
2469 ± 20% +25.1% 3089 ± 6% interrupts.CPU63.NMI:Non-maskable_interrupts
2469 ± 20% +25.1% 3089 ± 6% interrupts.CPU63.PMI:Performance_monitoring_interrupts
69.25 ±107% +210.1% 214.75 ± 57% interrupts.CPU7.TLB:TLB_shootdowns
49.75 ± 33% +165.3% 132.00 ± 74% interrupts.CPU76.TLB:TLB_shootdowns
27.75 ± 82% +272.1% 103.25 ± 66% interrupts.CPU78.TLB:TLB_shootdowns
2754 ± 23% +23.0% 3387 ± 7% interrupts.CPU84.NMI:Non-maskable_interrupts
2754 ± 23% +23.0% 3387 ± 7% interrupts.CPU84.PMI:Performance_monitoring_interrupts
22.50 ± 42% +687.8% 177.25 ±132% interrupts.CPU84.TLB:TLB_shootdowns
19.75 ± 35% +812.7% 180.25 ± 74% interrupts.CPU85.TLB:TLB_shootdowns
15.75 ± 18% +303.2% 63.50 ± 81% interrupts.CPU99.TLB:TLB_shootdowns
31.35 -0.6 30.72 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task
30.60 -0.6 29.98 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
34.93 -0.5 34.42 perf-profile.calltrace.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
36.88 -0.5 36.42 perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
37.21 -0.4 36.77 perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
37.18 -0.4 36.75 perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
37.22 -0.4 36.79 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
35.18 -0.4 34.79 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
35.30 -0.4 34.93 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
39.42 -0.4 39.06 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write
39.26 -0.4 38.91 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
39.64 -0.3 39.30 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write
39.95 -0.3 39.62 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write.ksys_write
41.14 -0.3 40.83 perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
41.08 -0.3 40.77 perf-profile.calltrace.cycles-pp.pipe_write.new_sync_write.vfs_write.ksys_write.do_syscall_64
0.79 ± 7% -0.1 0.69 ± 9% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
0.73 ± 8% -0.1 0.64 ± 9% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
0.69 +0.0 0.72 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.70 ± 2% +0.0 0.73 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
0.75 +0.0 0.78 perf-profile.calltrace.cycles-pp.tick_nohz_next_event.tick_nohz_get_sleep_length.menu_select.do_idle.cpu_startup_entry
0.98 ± 2% +0.0 1.02 perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__sched_text_start.schedule_idle.do_idle
1.26 +0.0 1.31 perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__sched_text_start.schedule.pipe_read
1.47 +0.1 1.52 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__sched_text_start.schedule.pipe_read.new_sync_read
2.38 +0.1 2.50 perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_secondary
2.43 +0.1 2.56 perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
5.47 +0.1 5.61 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.32 +0.1 3.47 perf-profile.calltrace.cycles-pp.schedule.pipe_read.new_sync_read.vfs_read.ksys_read
5.29 +0.2 5.46 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.94 ± 2% +0.2 6.14 perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.83 ± 2% +0.2 6.04 perf-profile.calltrace.cycles-pp.pipe_read.new_sync_read.vfs_read.ksys_read.do_syscall_64
3.11 ± 8% +0.3 3.38 perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.pipe_read.new_sync_read.vfs_read
0.13 ±173% +0.4 0.53 perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__sched_text_start.schedule
2.47 ± 9% +0.4 2.89 perf-profile.calltrace.cycles-pp.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
30.79 -0.6 30.14 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
32.21 -0.6 31.62 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
34.94 -0.5 34.43 perf-profile.children.cycles-pp.__account_scheduler_latency
36.94 -0.5 36.48 perf-profile.children.cycles-pp.enqueue_entity
37.19 -0.4 36.75 perf-profile.children.cycles-pp.enqueue_task_fair
37.22 -0.4 36.78 perf-profile.children.cycles-pp.activate_task
37.23 -0.4 36.79 perf-profile.children.cycles-pp.ttwu_do_activate
39.27 -0.4 38.91 perf-profile.children.cycles-pp.try_to_wake_up
39.65 -0.4 39.29 perf-profile.children.cycles-pp.__wake_up_common
39.42 -0.4 39.06 perf-profile.children.cycles-pp.autoremove_wake_function
39.95 -0.3 39.62 perf-profile.children.cycles-pp.__wake_up_common_lock
41.16 -0.3 40.85 perf-profile.children.cycles-pp.new_sync_write
41.08 -0.3 40.77 perf-profile.children.cycles-pp.pipe_write
41.45 -0.3 41.16 perf-profile.children.cycles-pp.vfs_write
41.59 -0.3 41.32 perf-profile.children.cycles-pp.ksys_write
0.05 +0.0 0.06 perf-profile.children.cycles-pp.apparmor_file_permission
0.17 ± 4% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.33 +0.0 0.35 ± 3% perf-profile.children.cycles-pp.select_idle_sibling
0.41 +0.0 0.43 ± 2% perf-profile.children.cycles-pp.__next_timer_interrupt
0.25 ± 3% +0.0 0.27 ± 3% perf-profile.children.cycles-pp.rcu_idle_exit
0.38 +0.0 0.40 ± 2% perf-profile.children.cycles-pp.__switch_to_asm
0.49 ± 2% +0.0 0.51 perf-profile.children.cycles-pp.update_rq_clock
0.23 ± 3% +0.0 0.26 ± 3% perf-profile.children.cycles-pp.common_file_perm
0.57 +0.0 0.60 ± 2% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.45 ± 2% +0.0 0.48 ± 2% perf-profile.children.cycles-pp.stack_trace_consume_entry_nosched
0.14 ± 13% +0.0 0.17 ± 7% perf-profile.children.cycles-pp.clockevents_program_event
0.70 ± 2% +0.0 0.73 perf-profile.children.cycles-pp.update_curr
0.44 ± 4% +0.0 0.48 ± 2% perf-profile.children.cycles-pp.copy_page_to_iter
0.76 +0.0 0.80 perf-profile.children.cycles-pp.tick_nohz_next_event
1.00 ± 2% +0.0 1.04 perf-profile.children.cycles-pp.set_next_entity
0.86 +0.0 0.90 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.94 +0.0 0.98 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.33 ± 2% +0.0 0.38 ± 3% perf-profile.children.cycles-pp.security_file_permission
0.00 +0.1 0.05 perf-profile.children.cycles-pp.tick_nohz_tick_stopped
1.45 ± 2% +0.1 1.51 perf-profile.children.cycles-pp.pick_next_task_fair
0.36 ± 8% +0.1 0.42 ± 6% perf-profile.children.cycles-pp.ktime_get
0.00 +0.1 0.06 perf-profile.children.cycles-pp.__x64_sys_read
1.25 ± 3% +0.1 1.31 perf-profile.children.cycles-pp.update_load_avg
1.56 ± 2% +0.1 1.63 perf-profile.children.cycles-pp.dequeue_entity
2.83 +0.1 2.91 perf-profile.children.cycles-pp.arch_stack_walk
1.78 ± 2% +0.1 1.88 perf-profile.children.cycles-pp.dequeue_task_fair
2.45 +0.1 2.57 perf-profile.children.cycles-pp.schedule_idle
3.33 +0.1 3.47 perf-profile.children.cycles-pp.schedule
5.94 ± 2% +0.2 6.14 perf-profile.children.cycles-pp.new_sync_read
5.85 +0.2 6.06 perf-profile.children.cycles-pp.pipe_read
6.58 ± 2% +0.2 6.80 perf-profile.children.cycles-pp.ksys_read
6.39 ± 2% +0.2 6.63 perf-profile.children.cycles-pp.vfs_read
5.70 +0.2 5.95 perf-profile.children.cycles-pp.__sched_text_start
30.79 -0.6 30.14 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.09 -0.0 0.08 ± 5% perf-profile.self.cycles-pp.__fsnotify_parent
0.09 -0.0 0.08 perf-profile.self.cycles-pp.ksys_read
0.08 ± 5% +0.0 0.09 perf-profile.self.cycles-pp.in_sched_functions
0.15 ± 2% +0.0 0.17 ± 7% perf-profile.self.cycles-pp.pipe_write
0.21 ± 2% +0.0 0.23 ± 3% perf-profile.self.cycles-pp._find_next_bit
0.20 ± 2% +0.0 0.22 ± 3% perf-profile.self.cycles-pp.common_file_perm
0.38 +0.0 0.40 ± 2% perf-profile.self.cycles-pp.__switch_to_asm
0.04 ± 57% +0.0 0.06 perf-profile.self.cycles-pp.apparmor_file_permission
0.30 ± 2% +0.0 0.32 ± 2% perf-profile.self.cycles-pp.stack_trace_consume_entry_nosched
0.14 ± 3% +0.0 0.16 ± 7% perf-profile.self.cycles-pp.vfs_read
0.18 ± 2% +0.0 0.20 ± 4% perf-profile.self.cycles-pp.__account_scheduler_latency
1.57 +0.0 1.61 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
1.15 +0.0 1.18 perf-profile.self.cycles-pp.__sched_text_start
0.84 +0.0 0.88 perf-profile.self.cycles-pp.entry_SYSCALL_64
0.94 +0.0 0.98 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.00 +0.1 0.05 perf-profile.self.cycles-pp.ksys_write
0.00 +0.1 0.05 perf-profile.self.cycles-pp.rcu_idle_exit
0.18 ± 13% +0.1 0.23 ± 10% perf-profile.self.cycles-pp.ktime_get
0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.__x64_sys_read
1.59 ± 2% +0.1 1.66 perf-profile.self.cycles-pp.do_syscall_64
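Two notations in the unixbench table are easy to misread: the `fail:runs` / `%reproduction` row and very large `± N%` figures such as ±173%. The sketch below shows one reading that is consistent with the numbers. It assumes that `%reproduction` is the difference in failure rate between the two commits, and that `%stddev` is a population standard deviation relative to the mean; both are inferences rather than documented behaviour. The sample lists are hypothetical values chosen only to illustrate the arithmetic, not measurements from this report.

```python
import statistics


def reproduction_change(base_fail: int, base_runs: int,
                        new_fail: int, new_runs: int) -> float:
    """Difference in failure rate between two commits, in percentage points."""
    return (new_fail / new_runs - base_fail / base_runs) * 100.0


def rel_stddev_pct(samples: list) -> float:
    """Population standard deviation relative to the mean, in percent (assumed formula)."""
    return statistics.pstdev(samples) / statistics.fmean(samples) * 100.0


# "1:4  -25%  :4" read as 1 failure in 4 runs before, 0 failures in 4 runs after.
dmesg_change = reproduction_change(1, 4, 0, 4)

# Hypothetical samples: one nonzero run out of four, and one out of three.
one_of_four = rel_stddev_pct([0.0, 0.0, 0.0, 4.0])
one_of_three = rel_stddev_pct([0.0, 0.0, 3.0])

if __name__ == "__main__":
    print(f"{dmesg_change:+.0f}%")
    print(f"±{one_of_four:.0f}% ±{one_of_three:.0f}%")
```

If this reading is right, a ±173% entry indicates a metric that was nonzero in only one of four runs, and ±141% the same for one of three runs, so rows carrying those figures say little about the commit itself.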
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.6.0-rc2-00057-g412c53a680a97" of type "text/plain" (203572 bytes)
View attachment "job-script" of type "text/plain" (7578 bytes)
View attachment "job.yaml" of type "text/plain" (5246 bytes)
View attachment "reproduce" of type "text/plain" (309 bytes)