Message-ID: <20210107134723.GA28532@xsang-OptiPlex-9020>
Date: Thu, 7 Jan 2021 21:47:23 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Al Viro <viro@...iv.linux.org.uk>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [x86] d55564cfc2: will-it-scale.per_thread_ops -5.8% regression
Greetings,
FYI, we noticed a -5.8% regression of will-it-scale.per_thread_ops due to commit:
commit: d55564cfc222326e944893eff0c4118353e349ec ("x86: Make __put_user() generate an out-of-line call")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
with the following parameters:

        nr_task: 50%
        mode: thread
        test: poll2
        cpufreq_governor: performance
        ucode: 0x42e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
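
For readers unfamiliar with the workload, here is a rough Python approximation of what a poll2-style loop does: it polls a fixed set of idle descriptors with a zero timeout, over and over, so nearly all the time goes into the poll() system call. This is only a sketch. The real testcase is the C program at the test-url above, and the descriptor count, the use of pipes, and the iteration count below are assumptions, not values taken from the will-it-scale sources.

```python
import os
import select


def make_idle_pipes(n):
    """Create n pipes. Nothing is ever written, so the read ends stay idle."""
    read_fds, write_fds = [], []
    for _ in range(n):
        r, w = os.pipe()
        read_fds.append(r)
        write_fds.append(w)
    return read_fds, write_fds


def poll_loop(read_fds, iterations):
    """Poll all descriptors with a zero timeout; return total ready events."""
    poller = select.poll()
    for fd in read_fds:
        poller.register(fd, select.POLLIN)
    ready_total = 0
    for _ in range(iterations):
        # Each call is one poll() system call covering every registered fd.
        ready_total += len(poller.poll(0))
    return ready_total


def run(n_fds=128, iterations=1000):
    """n_fds and iterations are illustrative assumptions."""
    read_fds, write_fds = make_idle_pipes(n_fds)
    try:
        return poll_loop(read_fds, iterations)
    finally:
        for fd in read_fds + write_fds:
            os.close(fd)


if __name__ == "__main__":
    print("ready events seen:", run())
```

Since no descriptor ever becomes readable, every call returns immediately with nothing ready, which is why per-call overhead inside the kernel's poll path dominates a benchmark of this shape.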
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -6.2% regression             |
| test machine     | 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory     |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x42e                                                               |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -6.8% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory               |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -7.3% regression             |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=poll2                                                                |
|                  | ucode=0x16                                                                |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -3.6% regression              |
| test machine     | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory    |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=thread                                                               |
|                  | nr_task=16                                                                |
|                  | test=poll2                                                                |
|                  | ucode=0x16                                                                |
+------------------+---------------------------------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e
commit:
ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")
ea6f043fc9847e67 d55564cfc222326e944893eff0c
---------------- ---------------------------
value ± %stddev   %change   value ± %stddev   metric
6600273 -5.8% 6218737 will-it-scale.24.threads
275010 -5.8% 259113 will-it-scale.per_thread_ops
6600273 -5.8% 6218737 will-it-scale.workload
11069 ±105% +196.1% 32775 ± 35% numa-numastat.node1.other_node
0.01 ± 8% +21.4% 0.01 ± 6% perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
0.00 ± 23% +50.0% 0.00 ± 11% perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
24562 ± 4% +10.3% 27098 ± 2% slabinfo.filp.active_objs
25333 ± 4% +10.0% 27863 slabinfo.filp.num_objs
16632 ± 2% -2.9% 16151 proc-vmstat.nr_active_anon
19941 -2.4% 19466 proc-vmstat.nr_shmem
16632 ± 2% -2.9% 16151 proc-vmstat.nr_zone_active_anon
7246 ± 87% +333.9% 31446 ± 49% softirqs.CPU25.SCHED
19452 ± 6% -28.5% 13915 ± 17% softirqs.CPU40.RCU
4067 ± 14% +257.3% 14533 ± 99% softirqs.CPU44.SCHED
19591 ± 7% -21.7% 15339 ± 25% softirqs.CPU46.RCU
0.00 +1.0 0.98 ± 3% perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.07 ± 5% +0.0 0.09 ± 9% perf-profile.children.cycles-pp.vprintk_emit
0.07 ± 5% +0.0 0.09 ± 9% perf-profile.children.cycles-pp.console_unlock
0.07 +0.0 0.08 ± 5% perf-profile.children.cycles-pp.serial8250_console_write
0.07 ± 6% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.uart_console_write
0.53 ± 5% +0.1 0.59 ± 2% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
0.00 +1.8 1.77 ± 3% perf-profile.children.cycles-pp.__put_user_nocheck_2
0.00 +1.6 1.64 ± 3% perf-profile.self.cycles-pp.__put_user_nocheck_2
11.79 ± 8% +2.4 14.22 ± 2% perf-profile.self.cycles-pp.do_sys_poll
2.349e+10 +4.2% 2.449e+10 perf-stat.i.branch-instructions
0.21 -0.0 0.19 perf-stat.i.branch-miss-rate%
45979592 -6.1% 43181339 perf-stat.i.branch-misses
2.36e+10 -2.4% 2.304e+10 perf-stat.i.dTLB-loads
0.10 ± 4% -0.0 0.09 perf-stat.i.dTLB-store-miss-rate%
14580547 ± 4% -8.3% 13364460 perf-stat.i.dTLB-store-misses
7364953 -5.1% 6985875 perf-stat.i.iTLB-load-misses
346056 ± 3% -8.7% 315837 perf-stat.i.iTLB-loads
9.903e+10 -1.1% 9.791e+10 perf-stat.i.instructions
13434 +4.3% 14007 perf-stat.i.instructions-per-iTLB-miss
0.20 -0.0 0.18 perf-stat.overall.branch-miss-rate%
0.10 ± 4% -0.0 0.09 perf-stat.overall.dTLB-store-miss-rate%
13447 +4.2% 14016 perf-stat.overall.instructions-per-iTLB-miss
4517015 +5.0% 4744020 perf-stat.overall.path-length
2.341e+10 +4.2% 2.44e+10 perf-stat.ps.branch-instructions
45857713 -6.1% 43060109 perf-stat.ps.branch-misses
2.352e+10 -2.4% 2.296e+10 perf-stat.ps.dTLB-loads
14530174 ± 4% -8.3% 13319056 perf-stat.ps.dTLB-store-misses
7339560 -5.1% 6961988 perf-stat.ps.iTLB-load-misses
344856 ± 3% -8.7% 314759 perf-stat.ps.iTLB-loads
9.869e+10 -1.1% 9.758e+10 perf-stat.ps.instructions
1830 ± 19% -36.4% 1163 ± 35% interrupts.CPU0.CAL:Function_call_interrupts
131.00 ±172% +331.9% 565.75 ± 57% interrupts.CPU1.TLB:TLB_shootdowns
3444 ± 82% +72.3% 5935 ± 41% interrupts.CPU10.NMI:Non-maskable_interrupts
3444 ± 82% +72.3% 5935 ± 41% interrupts.CPU10.PMI:Performance_monitoring_interrupts
6463 ± 29% -40.4% 3850 ± 14% interrupts.CPU17.NMI:Non-maskable_interrupts
6463 ± 29% -40.4% 3850 ± 14% interrupts.CPU17.PMI:Performance_monitoring_interrupts
1268 ± 20% +53.2% 1942 ± 22% interrupts.CPU2.CAL:Function_call_interrupts
1242 ± 51% +90.4% 2365 ± 52% interrupts.CPU22.CAL:Function_call_interrupts
27.50 ± 37% +206.4% 84.25 ± 73% interrupts.CPU22.RES:Rescheduling_interrupts
1439 ± 14% -29.1% 1019 ± 26% interrupts.CPU25.CAL:Function_call_interrupts
6907 ± 32% -53.8% 3194 ± 17% interrupts.CPU25.NMI:Non-maskable_interrupts
6907 ± 32% -53.8% 3194 ± 17% interrupts.CPU25.PMI:Performance_monitoring_interrupts
170.50 ± 51% -56.7% 73.75 ± 90% interrupts.CPU25.RES:Rescheduling_interrupts
596.50 ± 39% -71.8% 168.00 ±171% interrupts.CPU25.TLB:TLB_shootdowns
3916 ± 30% -45.6% 2130 ± 32% interrupts.CPU3.NMI:Non-maskable_interrupts
3916 ± 30% -45.6% 2130 ± 32% interrupts.CPU3.PMI:Performance_monitoring_interrupts
5969 ± 25% -58.5% 2477 ± 46% interrupts.CPU34.NMI:Non-maskable_interrupts
5969 ± 25% -58.5% 2477 ± 46% interrupts.CPU34.PMI:Performance_monitoring_interrupts
1345 ± 78% -86.7% 179.50 ±172% interrupts.CPU34.TLB:TLB_shootdowns
6131 ± 31% -49.0% 3129 ± 36% interrupts.CPU4.NMI:Non-maskable_interrupts
6131 ± 31% -49.0% 3129 ± 36% interrupts.CPU4.PMI:Performance_monitoring_interrupts
722.50 ± 4% -52.0% 346.50 ±100% interrupts.CPU4.TLB:TLB_shootdowns
1526 ± 5% -27.1% 1112 ± 23% interrupts.CPU40.CAL:Function_call_interrupts
7314 ± 24% -56.7% 3166 ± 35% interrupts.CPU40.NMI:Non-maskable_interrupts
7314 ± 24% -56.7% 3166 ± 35% interrupts.CPU40.PMI:Performance_monitoring_interrupts
5411 ± 31% -28.8% 3853 ± 14% interrupts.CPU46.NMI:Non-maskable_interrupts
5411 ± 31% -28.8% 3853 ± 14% interrupts.CPU46.PMI:Performance_monitoring_interrupts
[ASCII plot: will-it-scale.24.threads, y-axis 0 to 7e+06]
[ASCII plot: will-it-scale.per_thread_ops, y-axis 0 to 300000]
[ASCII plot: will-it-scale.workload, y-axis 0 to 7e+06]

[*] bisect-good sample
[O] bisect-bad sample
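
The headline rows of the first comparison table can be cross-checked by recomputing the %change column from the two absolute values. The sketch below assumes that %change is simply (patched - base) / base, rounded to one decimal. The regular expression is a hypothetical helper that handles only rows without a "± n%" stddev column; it is not part of lkp-tests.

```python
import re

# Headline rows copied from the first comparison table in this report.
ROWS = """\
6600273 -5.8% 6218737 will-it-scale.24.threads
275010 -5.8% 259113 will-it-scale.per_thread_ops
6600273 -5.8% 6218737 will-it-scale.workload
"""

# Hypothetical pattern: base value, signed percentage, patched value, metric.
ROW_RE = re.compile(r"^(\S+)\s+([+-]\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)$")


def parse_row(line):
    """Split one comparison row into (metric, base, reported_change, patched)."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised row: %r" % line)
    base, change, patched, metric = match.groups()
    return metric, float(base), float(change), float(patched)


def percent_change(base, patched):
    """Relative change of patched versus base, in percent."""
    return (patched - base) / base * 100.0


def check_rows(text):
    """Map each metric to (reported change, recomputed change)."""
    results = {}
    for line in text.splitlines():
        metric, base, reported, patched = parse_row(line)
        results[metric] = (reported, round(percent_change(base, patched), 1))
    return results


if __name__ == "__main__":
    for metric, (reported, recomputed) in check_rows(ROWS).items():
        print(metric, "reported", reported, "recomputed", recomputed)
```

For these three rows the recomputed value agrees with the reported -5.8%.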
***************************************************************************************************
lkp-ivb-2ep1: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e
commit:
ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")
ea6f043fc9847e67 d55564cfc222326e944893eff0c
---------------- ---------------------------
value ± %stddev   %change   value ± %stddev   metric
14927808 -6.2% 14002190 will-it-scale.48.processes
310995 -6.2% 291711 will-it-scale.per_process_ops
14927808 -6.2% 14002190 will-it-scale.workload
873.22 ± 2% -4.2% 836.55 boot-time.idle
28240 ± 2% +3.7% 29282 proc-vmstat.nr_slab_unreclaimable
6829 ± 3% -12.7% 5965 ± 4% numa-meminfo.node0.KernelStack
5160 ± 5% +17.4% 6057 ± 4% numa-meminfo.node1.KernelStack
29987 ± 12% -16.0% 25186 ± 9% softirqs.CPU46.RCU
28923 ± 5% -11.9% 25496 ± 6% softirqs.CPU9.RCU
6829 ± 3% -12.6% 5965 ± 4% numa-vmstat.node0.nr_kernel_stack
5160 ± 5% +17.4% 6058 ± 4% numa-vmstat.node1.nr_kernel_stack
476376 ± 20% +30.7% 622825 ± 11% numa-vmstat.node1.numa_local
1135 ± 7% +22.6% 1391 ± 3% slabinfo.dmaengine-unmap-16.active_objs
1135 ± 7% +22.6% 1391 ± 3% slabinfo.dmaengine-unmap-16.num_objs
857.50 ± 5% +15.0% 986.50 ± 2% slabinfo.task_group.active_objs
857.50 ± 5% +15.0% 986.50 ± 2% slabinfo.task_group.num_objs
98.79 ± 10% +16.9% 115.50 ± 6% sched_debug.cfs_rq:/.runnable_avg.stddev
63.89 ± 14% +23.3% 78.81 ± 16% sched_debug.cfs_rq:/.util_avg.stddev
745060 ± 7% -14.8% 634464 ± 8% sched_debug.cpu.avg_idle.avg
1273832 ± 17% -18.5% 1038314 ± 6% sched_debug.cpu.avg_idle.max
2154 ± 10% +188.1% 6207 ±101% sched_debug.cpu.avg_idle.min
0.09 ± 29% +57.1% 0.14 ± 13% perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
0.77 ± 16% -50.5% 0.38 ± 25% perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.stop_one_cpu
6.77 ± 6% +19.6% 8.09 ± 6% perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
7.24 ± 6% +15.0% 8.33 ± 5% perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
118.91 ± 15% -55.8% 52.50 ± 15% perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
5138 ± 2% +22.2% 6278 ± 13% perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.03 ± 57% +228.3% 0.11 ± 42% perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_irq_work
717.87 ±173% +106.8% 1484 ± 99% perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
0.48 ± 25% -50.8% 0.23 ± 39% perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.wait_for_partner.fifo_open.do_dentry_open
0.06 ± 48% +290.5% 0.22 ± 30% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_irq_work
118.91 ± 15% -55.8% 52.50 ± 15% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
1397 ±173% +112.0% 2962 ± 99% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
5138 ± 2% +22.2% 6278 ± 13% perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
73378 +3.5% 75925 interrupts.CAL:Function_call_interrupts
6339 ± 30% -34.7% 4142 interrupts.CPU1.NMI:Non-maskable_interrupts
6339 ± 30% -34.7% 4142 interrupts.CPU1.PMI:Performance_monitoring_interrupts
1109 ± 39% -33.4% 739.00 ± 5% interrupts.CPU1.RES:Rescheduling_interrupts
596.75 ± 66% -42.5% 343.00 ± 2% interrupts.CPU10.RES:Rescheduling_interrupts
4903 ± 26% +55.2% 7610 ± 14% interrupts.CPU12.NMI:Non-maskable_interrupts
4903 ± 26% +55.2% 7610 ± 14% interrupts.CPU12.PMI:Performance_monitoring_interrupts
1485 ± 46% -36.3% 946.00 ± 12% interrupts.CPU13.RES:Rescheduling_interrupts
900.50 ± 16% +99.1% 1792 ± 10% interrupts.CPU2.RES:Rescheduling_interrupts
396.50 ± 7% -13.6% 342.75 ± 3% interrupts.CPU33.RES:Rescheduling_interrupts
7258 ± 24% -28.8% 5171 ± 34% interrupts.CPU34.NMI:Non-maskable_interrupts
7258 ± 24% -28.8% 5171 ± 34% interrupts.CPU34.PMI:Performance_monitoring_interrupts
860.25 +7.4% 923.75 ± 4% interrupts.CPU44.CAL:Function_call_interrupts
327.00 ± 3% +22.7% 401.25 ± 13% interrupts.CPU45.RES:Rescheduling_interrupts
1708 ± 32% -34.8% 1114 ± 20% interrupts.CPU5.CAL:Function_call_interrupts
3.377e+10 +9.8% 3.708e+10 perf-stat.i.branch-instructions
0.29 -0.0 0.25 perf-stat.i.branch-miss-rate%
94797779 -8.5% 86775592 perf-stat.i.branch-misses
3.762e+10 -1.3% 3.714e+10 perf-stat.i.dTLB-loads
2.076e+10 +2.5% 2.127e+10 perf-stat.i.dTLB-stores
13777539 -13.2% 11957147 ± 3% perf-stat.i.iTLB-load-misses
12274 +15.7% 14203 ± 3% perf-stat.i.instructions-per-iTLB-miss
1920 +3.6% 1990 perf-stat.i.metric.M/sec
0.28 -0.0 0.23 perf-stat.overall.branch-miss-rate%
12281 +15.6% 14199 ± 3% perf-stat.overall.instructions-per-iTLB-miss
3412651 +6.8% 3645734 perf-stat.overall.path-length
3.365e+10 +9.8% 3.695e+10 perf-stat.ps.branch-instructions
94514447 -8.5% 86507575 perf-stat.ps.branch-misses
3.749e+10 -1.3% 3.701e+10 perf-stat.ps.dTLB-loads
2.069e+10 +2.5% 2.119e+10 perf-stat.ps.dTLB-stores
13728170 -13.2% 11914029 ± 3% perf-stat.ps.iTLB-load-misses
33.31 -1.8 31.51 perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
73.13 -0.8 72.34 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
8.05 -0.4 7.66 perf-profile.calltrace.cycles-pp.testcase
4.26 -0.3 3.92 perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.85 -0.3 5.60 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__poll
2.68 ± 5% -0.2 2.45 ± 3% perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.96 ± 2% -0.2 2.77 ± 2% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
1.51 -0.2 1.35 perf-profile.calltrace.cycles-pp.__kmalloc.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.80 ± 4% +0.0 0.84 perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
3.99 +0.1 4.13 perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
83.08 +0.2 83.24 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
90.96 +0.4 91.34 perf-profile.calltrace.cycles-pp.__poll
76.51 +0.6 77.14 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
75.59 +0.8 76.37 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
0.00 +7.0 6.97 perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
31.93 -1.9 30.00 perf-profile.children.cycles-pp.__fget_light
8.13 -0.4 7.73 perf-profile.children.cycles-pp.testcase
4.21 -0.3 3.89 perf-profile.children.cycles-pp.__fdget
2.80 ± 4% -0.3 2.54 ± 3% perf-profile.children.cycles-pp.__check_object_size
5.89 -0.3 5.64 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
3.00 ± 2% -0.2 2.80 ± 2% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
1.60 -0.2 1.43 perf-profile.children.cycles-pp.__kmalloc
0.16 ± 2% -0.1 0.03 ±100% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.36 ± 5% -0.1 0.29 ± 13% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.52 -0.1 0.46 ± 3% perf-profile.children.cycles-pp.__check_heap_object
0.18 ± 4% -0.0 0.13 ± 3% perf-profile.children.cycles-pp._cond_resched
0.10 ± 5% -0.0 0.08 perf-profile.children.cycles-pp.__x86_retpoline_rax
0.19 ± 3% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.poll_freewait
0.86 ± 3% +0.0 0.90 perf-profile.children.cycles-pp.__might_fault
4.00 +0.1 4.15 perf-profile.children.cycles-pp.__entry_text_start
83.21 +0.2 83.40 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
91.58 +0.3 91.90 perf-profile.children.cycles-pp.__poll
76.77 +0.5 77.31 perf-profile.children.cycles-pp.do_syscall_64
75.63 +0.8 76.41 perf-profile.children.cycles-pp.__x64_sys_poll
74.72 +0.9 75.61 perf-profile.children.cycles-pp.do_sys_poll
0.00 +5.8 5.77 perf-profile.children.cycles-pp.__put_user_nocheck_2
29.62 -1.8 27.84 perf-profile.self.cycles-pp.__fget_light
7.89 -0.4 7.51 perf-profile.self.cycles-pp.testcase
5.78 -0.2 5.53 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
2.96 ± 2% -0.2 2.75 ± 2% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
2.15 -0.2 1.96 ± 2% perf-profile.self.cycles-pp.__fdget
0.61 ± 2% -0.2 0.44 perf-profile.self.cycles-pp.do_syscall_64
1.09 ± 7% -0.1 0.98 ± 4% perf-profile.self.cycles-pp.__check_object_size
0.73 ± 2% -0.1 0.63 ± 3% perf-profile.self.cycles-pp.__x64_sys_poll
0.57 ± 3% -0.1 0.47 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.85 ± 2% -0.1 0.77 ± 2% perf-profile.self.cycles-pp.__kmalloc
0.30 ± 4% -0.1 0.24 ± 12% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.49 -0.0 0.45 ± 2% perf-profile.self.cycles-pp.__check_heap_object
0.09 -0.0 0.06 perf-profile.self.cycles-pp._cond_resched
0.15 ± 3% -0.0 0.14 ± 5% perf-profile.self.cycles-pp.poll_select_set_timeout
3.55 +0.2 3.74 perf-profile.self.cycles-pp.__entry_text_start
0.00 +3.6 3.58 perf-profile.self.cycles-pp.__put_user_nocheck_2
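
The metric names and headline numbers in the two lkp-ivb-2ep1 tables above suggest how the figures relate: nr_task seems to be resolved against the machine's 48 hardware threads (50% gives the 24 in will-it-scale.24.threads, 100% gives the 48 in will-it-scale.48.processes), and the per-task ops figure is close to workload divided by that task count. The sketch below checks this reading. The resolution rule is inferred from the report, not taken from the lkp-tests sources, and the small residual presumably comes from averaging over runs.

```python
def task_count(nr_task, nr_cpus):
    """Resolve an nr_task setting ("50%", "100%", or a plain count).

    Assumed rule, inferred from the metric names in this report.
    """
    if nr_task.endswith("%"):
        return nr_cpus * int(nr_task[:-1]) // 100
    return int(nr_task)


def per_task_ops(workload, tasks):
    """Average operations per task."""
    return workload / tasks


# (hardware threads, nr_task, base workload, reported base per-task ops),
# copied from the two lkp-ivb-2ep1 tables above.
RUNS = [
    (48, "50%", 6600273, 275010),
    (48, "100%", 14927808, 310995),
]

if __name__ == "__main__":
    for cpus, nr_task, workload, reported in RUNS:
        tasks = task_count(nr_task, cpus)
        derived = per_task_ops(workload, tasks)
        print(tasks, "tasks: derived", round(derived), "reported", reported)
```

Both rows agree with the reported per-task figure to within two operations.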
***************************************************************************************************
lkp-csl-2ap1: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
4k/gcc-9/performance/1SSD/btrfs/sync/x86_64-rhel-8.3/8/debian-10.4-x86_64-20200603.cgz/300s/randwrite/lkp-csl-2ap1/256g/fio-basic/0x4003003
commit:
ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")
ea6f043fc9847e67 d55564cfc222326e944893eff0c
---------------- ---------------------------
fail:runs   %reproduction   fail:runs   metric
1:2 -50% :2 kmsg.ACPI_Error
0:2 -1% 0:2 perf-profile.children.cycles-pp.error_entry
***************************************************************************************************
lkp-csl-2ap3: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap3/poll2/will-it-scale/0x5002f01
commit:
ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")
ea6f043fc9847e67 d55564cfc222326e944893eff0c
---------------- ---------------------------
value ± %stddev   %change   value ± %stddev   metric
49799766 -6.8% 46397591 will-it-scale.192.processes
259373 -6.8% 241653 will-it-scale.per_process_ops
49799766 -6.8% 46397591 will-it-scale.workload
5355 ± 3% -2.8% 5203 boot-time.idle
219459 ± 5% -10.0% 197460 ± 2% numa-numastat.node2.local_node
20202 ± 33% +53.8% 31071 numa-numastat.node2.other_node
5399 ± 13% +25.8% 6794 slabinfo.khugepaged_mm_slot.active_objs
5399 ± 13% +25.8% 6794 slabinfo.khugepaged_mm_slot.num_objs
27584 ± 3% +4.4% 28788 proc-vmstat.nr_active_anon
31838 ± 3% +3.9% 33095 proc-vmstat.nr_shmem
27584 ± 3% +4.4% 28788 proc-vmstat.nr_zone_active_anon
4438 ± 96% -97.2% 123.12 ± 23% sched_debug.cfs_rq:/.load_avg.max
322.01 ± 95% -96.2% 12.19 ± 24% sched_debug.cfs_rq:/.load_avg.stddev
161.08 ± 3% -11.3% 142.88 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.stddev
2008 ± 52% -89.0% 221.50 ± 70% numa-meminfo.node2.Active
2008 ± 52% -89.0% 221.50 ± 70% numa-meminfo.node2.Active(anon)
9747 ± 10% -21.8% 7622 ± 11% numa-meminfo.node2.PageTables
79271 ± 36% +77.4% 140623 ± 25% numa-meminfo.node3.AnonPages
87506 ± 36% +68.2% 147211 ± 24% numa-meminfo.node3.Inactive
87506 ± 36% +68.2% 147211 ± 24% numa-meminfo.node3.Inactive(anon)
278145 +6.8% 297050 ± 6% numa-meminfo.node3.Unevictable
501.75 ± 52% -89.0% 55.00 ± 71% numa-vmstat.node2.nr_active_anon
2434 ± 10% -21.7% 1905 ± 11% numa-vmstat.node2.nr_page_table_pages
501.75 ± 52% -89.0% 55.00 ± 71% numa-vmstat.node2.nr_zone_active_anon
638194 ± 13% -22.7% 493421 ± 8% numa-vmstat.node2.numa_hit
525818 ± 16% -29.6% 369990 ± 10% numa-vmstat.node2.numa_local
112375 ± 5% +9.8% 123431 numa-vmstat.node2.numa_other
19778 ± 36% +78.0% 35206 ± 25% numa-vmstat.node3.nr_anon_pages
21798 ± 36% +69.4% 36921 ± 24% numa-vmstat.node3.nr_inactive_anon
69536 +6.8% 74262 ± 6% numa-vmstat.node3.nr_unevictable
21798 ± 36% +69.4% 36921 ± 24% numa-vmstat.node3.nr_zone_inactive_anon
69536 +6.8% 74262 ± 6% numa-vmstat.node3.nr_zone_unevictable
307.75 +31.2% 403.75 ± 31% interrupts.CPU105.RES:Rescheduling_interrupts
305.75 +46.0% 446.25 ± 45% interrupts.CPU114.RES:Rescheduling_interrupts
318.00 ± 4% +82.6% 580.75 ± 69% interrupts.CPU12.RES:Rescheduling_interrupts
2428 ± 15% +41.3% 3433 ± 18% interrupts.CPU122.CAL:Function_call_interrupts
434.75 ± 34% -29.3% 307.25 interrupts.CPU136.RES:Rescheduling_interrupts
363.00 ± 5% +32.0% 479.25 ± 33% interrupts.CPU191.RES:Rescheduling_interrupts
6365 ± 33% -17.3% 5263 ± 34% interrupts.CPU23.NMI:Non-maskable_interrupts
6365 ± 33% -17.3% 5263 ± 34% interrupts.CPU23.PMI:Performance_monitoring_interrupts
324.25 ± 3% +18.7% 384.75 ± 18% interrupts.CPU3.RES:Rescheduling_interrupts
427.25 ± 26% -26.9% 312.50 ± 3% interrupts.CPU39.RES:Rescheduling_interrupts
6491 ± 33% -17.6% 5347 ± 34% interrupts.CPU78.NMI:Non-maskable_interrupts
6491 ± 33% -17.6% 5347 ± 34% interrupts.CPU78.PMI:Performance_monitoring_interrupts
326.25 ± 4% -4.8% 310.75 ± 4% interrupts.CPU83.RES:Rescheduling_interrupts
362.50 ± 13% -13.4% 314.00 ± 4% interrupts.CPU88.RES:Rescheduling_interrupts
8654 -38.2% 5350 ± 34% interrupts.CPU93.NMI:Non-maskable_interrupts
8654 -38.2% 5350 ± 34% interrupts.CPU93.PMI:Performance_monitoring_interrupts
411.00 ± 16% -19.8% 329.50 ± 4% interrupts.CPU95.RES:Rescheduling_interrupts
165.25 ± 6% +32.8% 219.50 ± 3% interrupts.IWI:IRQ_work_interrupts
0.08 ± 9% -46.2% 0.04 ± 13% perf-stat.i.MPKI
1.124e+11 +9.0% 1.226e+11 perf-stat.i.branch-instructions
0.28 -0.1 0.23 perf-stat.i.branch-miss-rate%
3.02e+08 -13.0% 2.626e+08 perf-stat.i.branch-misses
11.73 -1.9 9.87 perf-stat.i.cache-miss-rate%
4146579 ± 2% -65.7% 1420631 ± 3% perf-stat.i.cache-misses
35384957 ± 2% -60.1% 14124161 ± 2% perf-stat.i.cache-references
141836 ± 2% +219.1% 452664 ± 3% perf-stat.i.cycles-between-cache-misses
628080 ± 2% -29.6% 441936 ± 5% perf-stat.i.dTLB-load-misses
1.284e+11 -2.2% 1.255e+11 perf-stat.i.dTLB-loads
5.923e+10 +3.1% 6.108e+10 perf-stat.i.dTLB-stores
22557203 -12.5% 19727021 perf-stat.i.iTLB-load-misses
25065 +12.9% 28294 perf-stat.i.instructions-per-iTLB-miss
1563 +3.0% 1610 perf-stat.i.metric.M/sec
1187563 ± 3% -77.5% 266628 perf-stat.i.node-load-misses
136499 ± 7% -70.9% 39734 ± 2% perf-stat.i.node-loads
98.41 -3.3 95.10 perf-stat.i.node-store-miss-rate%
387351 ± 3% -73.3% 103454 perf-stat.i.node-store-misses
9110 ± 7% +10.6% 10079 ± 7% perf-stat.i.node-stores
0.06 ± 2% -59.7% 0.03 ± 2% perf-stat.overall.MPKI
0.27 -0.1 0.21 perf-stat.overall.branch-miss-rate%
11.71 -1.7 10.03 perf-stat.overall.cache-miss-rate%
138134 ± 2% +189.6% 400066 ± 3% perf-stat.overall.cycles-between-cache-misses
0.00 ± 2% -0.0 0.00 perf-stat.overall.dTLB-load-miss-rate%
24933 +13.7% 28356 perf-stat.overall.instructions-per-iTLB-miss
89.60 -3.3 86.26 perf-stat.overall.node-load-miss-rate%
97.67 -6.6 91.03 perf-stat.overall.node-store-miss-rate%
3404327 +6.7% 3632027 perf-stat.overall.path-length
1.121e+11 +9.0% 1.222e+11 perf-stat.ps.branch-instructions
3.01e+08 -13.0% 2.618e+08 perf-stat.ps.branch-misses
4136846 ± 2% -65.6% 1421212 ± 3% perf-stat.ps.cache-misses
35332816 ± 2% -59.9% 14168003 ± 2% perf-stat.ps.cache-references
632868 ± 2% -28.1% 454876 perf-stat.ps.dTLB-load-misses
1.28e+11 -2.2% 1.251e+11 perf-stat.ps.dTLB-loads
5.902e+10 +3.1% 6.087e+10 perf-stat.ps.dTLB-stores
22483939 -12.6% 19660586 perf-stat.ps.iTLB-load-misses
1183807 ± 3% -77.5% 265769 perf-stat.ps.node-load-misses
137518 ± 7% -69.2% 42343 ± 2% perf-stat.ps.node-loads
386094 ± 3% -73.3% 103120 perf-stat.ps.node-store-misses
9169 ± 7% +10.9% 10169 ± 6% perf-stat.ps.node-stores
95.69 -0.2 95.47 perf-profile.calltrace.cycles-pp.__poll
2.68 ± 2% -0.1 2.58 perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.76 -0.0 0.75 perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
0.53 +0.1 0.59 perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
1.17 +0.1 1.23 perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.62 +0.1 2.70 perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.74 +0.1 0.85 perf-profile.calltrace.cycles-pp.__kmalloc.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.80 ± 2% +0.2 4.04 ± 3% perf-profile.calltrace.cycles-pp.testcase
20.09 ± 2% +2.8 22.93 perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
91.94 +5.2 97.17 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
0.00 +38.3 38.28 perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
96.15 -0.2 95.91 perf-profile.children.cycles-pp.__poll
2.68 -0.1 2.56 perf-profile.children.cycles-pp.__fdget
0.22 -0.0 0.18 ± 2% perf-profile.children.cycles-pp.__might_sleep
0.15 ± 3% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.poll_freewait
0.09 -0.0 0.07 perf-profile.children.cycles-pp.poll_select_set_timeout
0.18 -0.0 0.17 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.29 ± 4% +0.0 0.32 ± 3% perf-profile.children.cycles-pp.__check_heap_object
0.10 +0.0 0.13 ± 3% perf-profile.children.cycles-pp.check_stack_object
0.57 +0.1 0.62 perf-profile.children.cycles-pp.__might_fault
0.32 ± 3% +0.1 0.39 perf-profile.children.cycles-pp.___might_sleep
1.20 +0.1 1.28 perf-profile.children.cycles-pp.__check_object_size
2.65 +0.1 2.74 perf-profile.children.cycles-pp._copy_from_user
0.79 +0.1 0.90 perf-profile.children.cycles-pp.__kmalloc
3.85 ± 2% +0.2 4.09 ± 3% perf-profile.children.cycles-pp.testcase
18.83 ± 2% +2.9 21.70 perf-profile.children.cycles-pp.__fget_light
0.00 +43.6 43.61 perf-profile.children.cycles-pp.__put_user_nocheck_2
68.36 -45.9 22.50 perf-profile.self.cycles-pp.do_sys_poll
1.34 ± 2% -0.1 1.26 perf-profile.self.cycles-pp.__fdget
0.20 -0.0 0.17 ± 3% perf-profile.self.cycles-pp.__might_sleep
0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.09 ± 4% -0.0 0.07 perf-profile.self.cycles-pp.poll_select_set_timeout
0.11 ± 3% -0.0 0.10 perf-profile.self.cycles-pp.poll_freewait
0.09 +0.0 0.11 ± 3% perf-profile.self.cycles-pp.check_stack_object
0.30 +0.0 0.33 perf-profile.self.cycles-pp.__check_object_size
0.18 ± 2% +0.0 0.21 ± 2% perf-profile.self.cycles-pp._copy_from_user
0.18 ± 3% +0.0 0.22 perf-profile.self.cycles-pp.__might_fault
0.32 ± 3% +0.1 0.38 perf-profile.self.cycles-pp.___might_sleep
0.43 +0.1 0.52 perf-profile.self.cycles-pp.__kmalloc
3.79 ± 2% +0.2 4.02 ± 3% perf-profile.self.cycles-pp.testcase
17.33 ± 2% +2.9 20.27 perf-profile.self.cycles-pp.__fget_light
0.00 +42.6 42.61 perf-profile.self.cycles-pp.__put_user_nocheck_2
***************************************************************************************************
lkp-hsw-4ex1: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/poll2/will-it-scale/0x16
commit:
ea6f043fc9 ("x86: Make __get_user() generate an out-of-line call")
d55564cfc2 ("x86: Make __put_user() generate an out-of-line call")
ea6f043fc9847e67 d55564cfc222326e944893eff0c
---------------- ---------------------------
value ± %stddev   %change   value ± %stddev   metric
42577786 -7.3% 39477406 will-it-scale.144.processes
295678 -7.3% 274148 will-it-scale.per_process_ops
42577786 -7.3% 39477406 will-it-scale.workload
57721 -1.2% 57029 proc-vmstat.nr_slab_unreclaimable
19.00 -5.3% 18.00 vmstat.cpu.us
90088 ± 79% -70.3% 26733 ± 15% numa-meminfo.node1.AnonPages
94290 ± 72% -63.4% 34523 ± 23% numa-meminfo.node1.Inactive
94178 ± 72% -63.3% 34518 ± 23% numa-meminfo.node1.Inactive(anon)
3764 ± 11% -19.7% 3023 ± 2% numa-meminfo.node1.PageTables
20104 ± 13% -15.5% 16993 ± 9% softirqs.CPU0.RCU
18905 ± 6% -8.6% 17277 ± 5% softirqs.CPU136.RCU
16811 ± 4% -12.0% 14790 ± 4% softirqs.CPU71.RCU
19562 ± 3% -9.7% 17666 ± 7% softirqs.CPU97.RCU
22522 ± 79% -70.2% 6705 ± 15% numa-vmstat.node1.nr_anon_pages
23544 ± 72% -63.3% 8649 ± 23% numa-vmstat.node1.nr_inactive_anon
941.00 ± 11% -19.7% 756.00 ± 2% numa-vmstat.node1.nr_page_table_pages
23544 ± 72% -63.3% 8649 ± 23% numa-vmstat.node1.nr_zone_inactive_anon
419078 ± 10% -17.1% 347285 ± 3% numa-vmstat.node1.numa_local
0.05 ± 4% +12.6% 0.06 ± 5% sched_debug.cfs_rq:/.nr_running.stddev
39.42 ±100% +122.5% 87.71 ± 2% sched_debug.cfs_rq:/.removed.runnable_avg.max
39.38 ±100% +122.8% 87.71 ± 2% sched_debug.cfs_rq:/.removed.util_avg.max
92.35 ± 6% +10.9% 102.44 ± 5% sched_debug.cfs_rq:/.runnable_avg.stddev
732.50 ± 7% +22.2% 895.42 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.max
89.39 ± 8% +50.5% 134.49 ± 14% sched_debug.cfs_rq:/.util_est_enqueued.stddev
2369 ± 3% -13.0% 2062 ± 5% slabinfo.PING.active_objs
2369 ± 3% -13.0% 2062 ± 5% slabinfo.PING.num_objs
1124 ± 7% -11.3% 997.75 ± 5% slabinfo.file_lock_cache.active_objs
1124 ± 7% -11.3% 997.75 ± 5% slabinfo.file_lock_cache.num_objs
2775 ± 5% -20.4% 2208 ± 7% slabinfo.fsnotify_mark_connector.active_objs
2775 ± 5% -20.4% 2208 ± 7% slabinfo.fsnotify_mark_connector.num_objs
11030 ± 6% -8.7% 10069 ± 5% slabinfo.pde_opener.active_objs
11030 ± 6% -8.7% 10069 ± 5% slabinfo.pde_opener.num_objs
425.00 ±100% +116.8% 921.25 ± 3% syscalls.sys_close.med
4507 +13.2% 5102 syscalls.sys_poll.min
17548777 ± 2% +2.3e+06 19877523 ± 4% syscalls.sys_poll.noise.100%
22979833 ± 3% +4.7e+06 27645541 ± 4% syscalls.sys_poll.noise.2%
17799035 ± 2% +2.4e+06 20156928 ± 4% syscalls.sys_poll.noise.25%
20161873 ± 3% +3.1e+06 23286940 ± 4% syscalls.sys_poll.noise.5%
17648058 ± 2% +2.4e+06 20015410 ± 4% syscalls.sys_poll.noise.50%
17585605 ± 2% +2.4e+06 19958729 ± 4% syscalls.sys_poll.noise.75%
0.11 ± 19% +35.8% 0.15 ± 16% perf-stat.i.MPKI
9.917e+10 +8.0% 1.071e+11 perf-stat.i.branch-instructions
0.29 -0.0 0.25 perf-stat.i.branch-miss-rate%
2.791e+08 -8.5% 2.554e+08 perf-stat.i.branch-misses
4.037e+11 -1.7% 3.968e+11 perf-stat.i.cpu-cycles
1336350 -8.7% 1220687 ± 13% perf-stat.i.cycles-between-cache-misses
1.114e+11 -2.9% 1.082e+11 perf-stat.i.dTLB-loads
0.09 ± 15% -0.0 0.07 ± 2% perf-stat.i.dTLB-store-miss-rate%
50453634 ± 15% -19.9% 40407884 ± 2% perf-stat.i.dTLB-store-misses
5.934e+10 +1.2% 6.004e+10 perf-stat.i.dTLB-stores
45611355 +6.4% 48521131 ± 2% perf-stat.i.iTLB-load-misses
4.969e+11 -1.2% 4.908e+11 perf-stat.i.instructions
10882 -7.0% 10125 ± 2% perf-stat.i.instructions-per-iTLB-miss
2.80 -1.7% 2.75 perf-stat.i.metric.GHz
1873 +2.0% 1911 perf-stat.i.metric.M/sec
0.28 -0.0 0.24 perf-stat.overall.branch-miss-rate%
0.08 ± 15% -0.0 0.07 ± 2% perf-stat.overall.dTLB-store-miss-rate%
10891 -7.0% 10124 perf-stat.overall.instructions-per-iTLB-miss
3511981 +6.8% 3750127 perf-stat.overall.path-length
9.878e+10 +8.1% 1.067e+11 perf-stat.ps.branch-instructions
2.781e+08 -8.5% 2.545e+08 perf-stat.ps.branch-misses
4.021e+11 -1.7% 3.953e+11 perf-stat.ps.cpu-cycles
1.109e+11 -2.8% 1.078e+11 perf-stat.ps.dTLB-loads
50236046 ± 15% -19.9% 40254632 ± 2% perf-stat.ps.dTLB-store-misses
5.911e+10 +1.2% 5.982e+10 perf-stat.ps.dTLB-stores
45438239 +6.3% 48313052 ± 2% perf-stat.ps.iTLB-load-misses
4.949e+11 -1.2% 4.89e+11 perf-stat.ps.instructions
1.495e+14 -1.0% 1.48e+14 perf-stat.total.instructions
6804 ± 24% -30.0% 4763 ± 35% interrupts.CPU103.NMI:Non-maskable_interrupts
6804 ± 24% -30.0% 4763 ± 35% interrupts.CPU103.PMI:Performance_monitoring_interrupts
349.75 ± 14% -13.9% 301.25 interrupts.CPU104.RES:Rescheduling_interrupts
6859 ± 24% -34.0% 4528 ± 24% interrupts.CPU108.NMI:Non-maskable_interrupts
6859 ± 24% -34.0% 4528 ± 24% interrupts.CPU108.PMI:Performance_monitoring_interrupts
7894 -38.3% 4868 ± 35% interrupts.CPU114.NMI:Non-maskable_interrupts
7894 -38.3% 4868 ± 35% interrupts.CPU114.PMI:Performance_monitoring_interrupts
5905 ± 33% -17.8% 4855 ± 34% interrupts.CPU119.NMI:Non-maskable_interrupts
5905 ± 33% -17.8% 4855 ± 34% interrupts.CPU119.PMI:Performance_monitoring_interrupts
6873 ± 24% -29.6% 4841 ± 34% interrupts.CPU121.NMI:Non-maskable_interrupts
6873 ± 24% -29.6% 4841 ± 34% interrupts.CPU121.PMI:Performance_monitoring_interrupts
5938 ± 33% -34.0% 3920 interrupts.CPU129.NMI:Non-maskable_interrupts
5938 ± 33% -34.0% 3920 interrupts.CPU129.PMI:Performance_monitoring_interrupts
6981 ± 24% -44.0% 3909 interrupts.CPU131.NMI:Non-maskable_interrupts
6981 ± 24% -44.0% 3909 interrupts.CPU131.PMI:Performance_monitoring_interrupts
7944 -41.9% 4612 ± 26% interrupts.CPU135.NMI:Non-maskable_interrupts
7944 -41.9% 4612 ± 26% interrupts.CPU135.PMI:Performance_monitoring_interrupts
5952 ± 33% -17.8% 4894 ± 34% interrupts.CPU136.NMI:Non-maskable_interrupts
5952 ± 33% -17.8% 4894 ± 34% interrupts.CPU136.PMI:Performance_monitoring_interrupts
5939 ± 33% -33.9% 3923 interrupts.CPU137.NMI:Non-maskable_interrupts
5939 ± 33% -33.9% 3923 interrupts.CPU137.PMI:Performance_monitoring_interrupts
6978 ± 24% -29.6% 4913 ± 34% interrupts.CPU138.NMI:Non-maskable_interrupts
6978 ± 24% -29.6% 4913 ± 34% interrupts.CPU138.PMI:Performance_monitoring_interrupts
6946 ± 25% -43.9% 3898 interrupts.CPU142.NMI:Non-maskable_interrupts
6946 ± 25% -43.9% 3898 interrupts.CPU142.PMI:Performance_monitoring_interrupts
7284 ± 12% -24.8% 5474 ± 25% interrupts.CPU21.NMI:Non-maskable_interrupts
7284 ± 12% -24.8% 5474 ± 25% interrupts.CPU21.PMI:Performance_monitoring_interrupts
836.25 ± 28% -36.9% 528.00 ± 45% interrupts.CPU29.CAL:Function_call_interrupts
5876 ± 33% -18.6% 4785 ± 34% interrupts.CPU29.NMI:Non-maskable_interrupts
5876 ± 33% -18.6% 4785 ± 34% interrupts.CPU29.PMI:Performance_monitoring_interrupts
6560 ± 24% -27.1% 4783 ± 34% interrupts.CPU33.NMI:Non-maskable_interrupts
6560 ± 24% -27.1% 4783 ± 34% interrupts.CPU33.PMI:Performance_monitoring_interrupts
6840 ± 24% -39.2% 4158 ± 14% interrupts.CPU35.NMI:Non-maskable_interrupts
6840 ± 24% -39.2% 4158 ± 14% interrupts.CPU35.PMI:Performance_monitoring_interrupts
309.50 ± 2% +24.2% 384.50 ± 12% interrupts.CPU37.RES:Rescheduling_interrupts
331.00 ± 5% +38.7% 459.00 ± 28% interrupts.CPU38.RES:Rescheduling_interrupts
5946 ± 32% -17.8% 4890 ± 34% interrupts.CPU41.NMI:Non-maskable_interrupts
5946 ± 32% -17.8% 4890 ± 34% interrupts.CPU41.PMI:Performance_monitoring_interrupts
1730 ± 11% -27.1% 1261 ± 25% interrupts.CPU54.CAL:Function_call_interrupts
523.75 ± 9% -23.2% 402.25 ± 13% interrupts.CPU54.RES:Rescheduling_interrupts
4207 ± 9% +86.7% 7854 interrupts.CPU56.NMI:Non-maskable_interrupts
4207 ± 9% +86.7% 7854 interrupts.CPU56.PMI:Performance_monitoring_interrupts
305.75 +91.4% 585.25 ± 64% interrupts.CPU57.RES:Rescheduling_interrupts
4590 ± 22% +70.9% 7844 interrupts.CPU59.NMI:Non-maskable_interrupts
4590 ± 22% +70.9% 7844 interrupts.CPU59.PMI:Performance_monitoring_interrupts
6013 ± 30% -20.3% 4793 ± 34% interrupts.CPU7.NMI:Non-maskable_interrupts
6013 ± 30% -20.3% 4793 ± 34% interrupts.CPU7.PMI:Performance_monitoring_interrupts
437.00 ± 7% -14.4% 374.25 ± 3% interrupts.CPU73.RES:Rescheduling_interrupts
5893 ± 33% -22.0% 4596 ± 28% interrupts.CPU80.NMI:Non-maskable_interrupts
5893 ± 33% -22.0% 4596 ± 28% interrupts.CPU80.PMI:Performance_monitoring_interrupts
853.75 ± 23% -38.2% 527.75 ± 46% interrupts.CPU98.CAL:Function_call_interrupts
26.40 -1.0 25.38 perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.18 -0.7 11.44 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.__poll
7.20 ± 2% -0.5 6.69 ± 2% perf-profile.calltrace.cycles-pp.testcase
5.35 -0.3 5.05 ± 2% perf-profile.calltrace.cycles-pp.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
4.72 -0.3 4.45 ± 2% perf-profile.calltrace.cycles-pp.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
5.00 -0.2 4.78 perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.55 -0.2 0.39 ± 57% perf-profile.calltrace.cycles-pp.ring_buffer_unlock_commit.trace_buffer_unlock_commit_regs.ftrace_syscall_exit.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
1.92 -0.1 1.78 perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.31 ± 2% -0.1 2.19 ± 3% perf-profile.calltrace.cycles-pp.trace_buffer_lock_reserve.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.71 ± 4% -0.1 0.60 ± 5% perf-profile.calltrace.cycles-pp.__virt_addr_valid.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
0.81 ± 2% -0.1 0.72 ± 2% perf-profile.calltrace.cycles-pp.trace_buffer_unlock_commit_regs.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.60 ± 2% -0.1 0.53 ± 2% perf-profile.calltrace.cycles-pp.ring_buffer_unlock_commit.trace_buffer_unlock_commit_regs.ftrace_syscall_enter.syscall_trace_enter.do_syscall_64
1.13 ± 3% -0.0 1.09 perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
65.27 +0.4 65.69 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
92.14 +0.6 92.70 perf-profile.calltrace.cycles-pp.__poll
85.67 +0.6 86.29 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
73.01 +1.4 74.39 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
67.10 +1.7 68.77 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
0.00 +6.2 6.22 perf-profile.calltrace.cycles-pp.__put_user_nocheck_2.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
26.26 -1.4 24.82 perf-profile.children.cycles-pp.__fget_light
12.24 -0.8 11.49 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
7.27 ± 2% -0.5 6.76 ± 2% perf-profile.children.cycles-pp.testcase
5.37 -0.3 5.07 ± 2% perf-profile.children.cycles-pp.syscall_trace_enter
4.78 -0.3 4.50 ± 2% perf-profile.children.cycles-pp.ftrace_syscall_enter
4.32 -0.2 4.07 perf-profile.children.cycles-pp.__fdget
2.03 -0.1 1.89 perf-profile.children.cycles-pp.__check_object_size
1.59 -0.1 1.48 ± 2% perf-profile.children.cycles-pp.trace_buffer_unlock_commit_regs
1.17 -0.1 1.05 ± 2% perf-profile.children.cycles-pp.ring_buffer_unlock_commit
0.71 ± 4% -0.1 0.60 ± 5% perf-profile.children.cycles-pp.__virt_addr_valid
0.68 ± 2% -0.1 0.60 perf-profile.children.cycles-pp.rb_commit
1.54 -0.1 1.48 ± 2% perf-profile.children.cycles-pp.__kmalloc
0.56 ± 2% -0.1 0.51 ± 4% perf-profile.children.cycles-pp.memcpy_erms
0.30 ± 5% -0.0 0.27 ± 4% perf-profile.children.cycles-pp.ring_buffer_event_data
0.10 ± 4% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.33 -0.0 0.30 perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.06 -0.0 0.05 perf-profile.children.cycles-pp.should_failslab
0.28 +0.0 0.31 ± 2% perf-profile.children.cycles-pp.syscall_enter_from_user_mode
0.18 ± 2% +0.0 0.22 ± 3% perf-profile.children.cycles-pp.poll_freewait
92.75 +0.5 93.27 perf-profile.children.cycles-pp.__poll
85.74 +0.6 86.34 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
73.09 +1.4 74.45 perf-profile.children.cycles-pp.do_syscall_64
67.14 +1.7 68.81 perf-profile.children.cycles-pp.__x64_sys_poll
66.28 +1.7 68.01 perf-profile.children.cycles-pp.do_sys_poll
0.00 +5.3 5.34 perf-profile.children.cycles-pp.__put_user_nocheck_2
24.35 -1.3 23.02 perf-profile.self.cycles-pp.__fget_light
8.21 -0.7 7.56 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
7.08 ± 3% -0.5 6.57 ± 2% perf-profile.self.cycles-pp.testcase
1.87 -0.2 1.71 perf-profile.self.cycles-pp.__fdget
0.69 ± 5% -0.1 0.58 ± 6% perf-profile.self.cycles-pp.__virt_addr_valid
0.67 ± 2% -0.1 0.59 ± 2% perf-profile.self.cycles-pp.rb_commit
0.69 -0.1 0.63 perf-profile.self.cycles-pp.__x64_sys_poll
0.54 ± 3% -0.1 0.49 ± 3% perf-profile.self.cycles-pp.memcpy_erms
0.47 -0.0 0.43 ± 4% perf-profile.self.cycles-pp.ring_buffer_unlock_commit
0.65 -0.0 0.60 perf-profile.self.cycles-pp.ftrace_syscall_exit
0.81 -0.0 0.77 ± 2% perf-profile.self.cycles-pp.__kmalloc
0.24 ± 3% -0.0 0.21 ± 5% perf-profile.self.cycles-pp.ring_buffer_event_data
0.27 -0.0 0.25 perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.22 -0.0 0.20 ± 2% perf-profile.self.cycles-pp.do_syscall_64
0.24 +0.0 0.27 ± 4% perf-profile.self.cycles-pp.syscall_enter_from_user_mode
0.16 ± 2% +0.0 0.19 ± 4% perf-profile.self.cycles-pp.poll_freewait
0.00 +3.6 3.64 perf-profile.self.cycles-pp.__put_user_nocheck_2
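For readers checking the comparison tables by hand, the following is a minimal sketch of how the middle column appears to be derived. It assumes that rows showing a percent sign report the relative change between the base commit (left) and the tested commit (right), and that rows without one (the perf-profile and miss-rate rows) report the absolute difference in percentage points. The helper names `pct_change` and `point_delta` are illustrative and are not part of the lkp tooling.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference, used here for values that are already percentages."""
    return new - base


# Values copied from the lkp-hsw-4ex1 table above.
workload = pct_change(42577786, 39477406)      # will-it-scale.workload
per_process = pct_change(295678, 274148)       # will-it-scale.per_process_ops
path_length = pct_change(3511981, 3750127)     # perf-stat.overall.path-length
fget_light = point_delta(26.40, 25.38)         # calltrace cycles-pp, __fget_light
put_user = point_delta(0.00, 6.22)             # calltrace cycles-pp, __put_user_nocheck_2

print(f"workload        {workload:+.1f}%")
print(f"per_process_ops {per_process:+.1f}%")
print(f"path-length     {path_length:+.1f}%")
print(f"__fget_light    {fget_light:+.1f}")
print(f"__put_user      {put_user:+.1f}")
```

Under these assumptions the computed values round to the -7.3%, +6.8%, -1.0 and +6.2 figures shown in the corresponding rows.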
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.9.0-00857-gd55564cfc22232" of type "text/plain" (169998 bytes)
View attachment "job-script" of type "text/plain" (7803 bytes)
View attachment "job.yaml" of type "text/plain" (5430 bytes)
View attachment "reproduce" of type "text/plain" (336 bytes)