Message-ID: <20210113025549.GC7528@xsang-OptiPlex-9020>
Date: Wed, 13 Jan 2021 10:55:49 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: kernel test robot <oliver.sang@...el.com>,
Al Viro <viro@...iv.linux.org.uk>,
David Laight <David.Laight@...lab.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [poll] ef0ba05538: will-it-scale.per_thread_ops 8.9% improvement
Greetings,
FYI, we noticed an 8.9% improvement in will-it-scale.per_thread_ops due to commit:
commit: ef0ba05538299f1391cbe097de36895bb36ecfe6 ("poll: fix performance regression due to out-of-line __put_user()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:
nr_task: 50%
mode: thread
test: poll2
cpufreq_governor: performance
ucode: 0x16
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
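For readers unfamiliar with this kind of workload, here is a minimal sketch of a poll-heavy loop. It is not taken from the will-it-scale sources (the real testcase is written in C); the descriptor count, the use of a single temporary file, the POLLOUT event mask, and the function name `poll_loop` are all assumptions made for illustration. It only shows the general shape: poll a fixed set of descriptors with a zero timeout and count completed calls.

```python
import os
import select
import tempfile


def poll_loop(nr_files=128, iterations=1000):
    """Poll nr_files descriptors of one temp file with a zero timeout.

    Returns the number of completed poll() calls. All parameters are
    illustrative assumptions, not values from the will-it-scale sources.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        # Open the same file nr_files times so poll() has many fds to scan.
        fds = [os.open(tmp.name, os.O_RDWR) for _ in range(nr_files)]
        try:
            poller = select.poll()
            for fd in fds:
                poller.register(fd, select.POLLOUT)
            ops = 0
            for _ in range(iterations):
                poller.poll(0)  # zero timeout: return immediately
                ops += 1
            return ops
        finally:
            for fd in fds:
                os.close(fd)


if __name__ == "__main__":
    print(poll_loop(nr_files=16, iterations=100))
```

In a loop like this, most of the kernel time goes to looking up each descriptor and writing the per-descriptor result back to user space, which is the path the commit title refers to.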
In addition, the commit also has a significant impact on the following tests:
+------------------+-----------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 7.1% improvement |
| test machine | 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=16 |
| | test=poll2 |
| | ucode=0x42e |
+------------------+-----------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/poll2/will-it-scale/0x16
commit:
a91bd6223e ("Revert "init/console: Use ttynull as a fallback when there is no console"")
ef0ba05538 ("poll: fix performance regression due to out-of-line __put_user()")
a91bd6223ecd46ad ef0ba05538299f1391cbe097de3
---------------- ---------------------------
%stddev %change %stddev
\ | \
18011165 +8.9% 19618836 will-it-scale.72.threads
250154 +8.9% 272483 will-it-scale.per_thread_ops
18011165 +8.9% 19618836 will-it-scale.workload
46702 +1.7% 47497 proc-vmstat.nr_slab_unreclaimable
238955 ± 31% +42.3% 340009 ± 19% numa-numastat.node1.local_node
291569 ± 30% +42.5% 415478 ± 15% numa-numastat.node1.numa_hit
33349633 ± 8% -25.8% 24744913 ± 18% cpuidle.C1E.usage
7.521e+09 ± 32% +54.0% 1.158e+10 ± 26% cpuidle.C6.time
10594513 ± 25% +81.2% 19201840 ± 23% cpuidle.C6.usage
102872 ± 10% -16.0% 86427 ± 11% syscalls.sys_openat.max
8422 -9.8% 7597 syscalls.sys_poll.med
8268 -9.8% 7462 syscalls.sys_poll.min
16.50 ± 13% -23.0% 12.71 ± 10% sched_debug.cfs_rq:/.load_avg.avg
255.52 ± 22% -30.4% 177.92 ± 2% sched_debug.cfs_rq:/.load_avg.max
43.12 ± 12% -29.5% 30.38 ± 10% sched_debug.cfs_rq:/.load_avg.stddev
186.58 ± 8% -55.9% 82.38 ±100% sched_debug.cfs_rq:/.removed.load_avg.max
27.55 ± 30% -62.8% 10.26 ±103% sched_debug.cfs_rq:/.removed.load_avg.stddev
1.60 ± 37% -60.5% 0.63 ±120% sched_debug.cfs_rq:/.removed.runnable_avg.avg
1.60 ± 37% -60.5% 0.63 ±120% sched_debug.cfs_rq:/.removed.util_avg.avg
1475 ± 53% -88.6% 167.50 ± 22% numa-meminfo.node1.Active
1475 ± 53% -88.6% 167.50 ± 22% numa-meminfo.node1.Active(anon)
17441 ± 19% +37.9% 24052 ± 10% numa-meminfo.node2.KReclaimable
664056 ± 4% +18.3% 785538 ± 7% numa-meminfo.node2.MemUsed
743.00 ± 31% +98.8% 1476 ± 51% numa-meminfo.node2.PageTables
17441 ± 19% +37.9% 24052 ± 10% numa-meminfo.node2.SReclaimable
35651 ± 4% +26.6% 45148 ± 7% numa-meminfo.node2.SUnreclaim
53093 ± 9% +30.3% 69201 ± 6% numa-meminfo.node2.Slab
47310 ± 9% -13.2% 41067 ± 9% numa-meminfo.node3.SUnreclaim
368.50 ± 53% -88.8% 41.25 ± 22% numa-vmstat.node1.nr_active_anon
368.50 ± 53% -88.8% 41.25 ± 22% numa-vmstat.node1.nr_zone_active_anon
183.25 ± 32% +101.0% 368.25 ± 51% numa-vmstat.node2.nr_page_table_pages
4360 ± 19% +37.9% 6012 ± 10% numa-vmstat.node2.nr_slab_reclaimable
8912 ± 4% +26.6% 11286 ± 7% numa-vmstat.node2.nr_slab_unreclaimable
460320 ± 12% +31.8% 606634 ± 8% numa-vmstat.node2.numa_hit
304883 ± 21% +55.8% 475111 ± 10% numa-vmstat.node2.numa_local
11827 ± 9% -13.2% 10266 ± 9% numa-vmstat.node3.nr_slab_unreclaimable
674508 ± 23% -29.0% 478814 ± 11% numa-vmstat.node3.numa_hit
542032 ± 25% -38.2% 334743 ± 16% numa-vmstat.node3.numa_local
2495 ± 7% +12.7% 2812 ± 3% slabinfo.PING.active_objs
2495 ± 7% +12.7% 2812 ± 3% slabinfo.PING.num_objs
2262 ± 12% +19.5% 2703 ± 6% slabinfo.fsnotify_mark_connector.active_objs
2262 ± 12% +19.5% 2703 ± 6% slabinfo.fsnotify_mark_connector.num_objs
901.00 ± 5% -11.5% 797.00 slabinfo.pool_workqueue.active_objs
930.50 ± 5% -11.8% 821.00 ± 2% slabinfo.pool_workqueue.num_objs
3144 ± 5% +9.1% 3430 ± 2% slabinfo.signal_cache.active_objs
3144 ± 5% +9.3% 3437 ± 2% slabinfo.signal_cache.num_objs
4087 ± 3% +10.0% 4496 ± 2% slabinfo.sock_inode_cache.active_objs
4087 ± 3% +10.0% 4496 ± 2% slabinfo.sock_inode_cache.num_objs
0.02 ± 17% -27.8% 0.01 ± 5% perf-sched.sch_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.futex_wait_queue_me.futex_wait.do_futex
250.89 ±173% -100.0% 0.03 ± 8% perf-sched.sch_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
806.12 ± 24% -50.4% 400.04 ± 66% perf-sched.wait_and_delay.avg.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.io_schedule_timeout.wait_for_completion_io
61.00 ± 57% +63.5% 99.75 ± 7% perf-sched.wait_and_delay.count.__traceiter_sched_switch.__traceiter_sched_switch.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
1929 ± 2% -8.6% 1763 ± 4% perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.devkmsg_read.vfs_read.ksys_read
1929 ± 2% -8.6% 1763 ± 4% perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.do_syslog.part.0
1932 ± 2% -8.6% 1767 ± 4% perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
806.11 ± 24% -46.7% 429.95 ± 51% perf-sched.wait_time.avg.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.io_schedule_timeout.wait_for_completion_io
1929 ± 2% -8.6% 1763 ± 4% perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.devkmsg_read.vfs_read.ksys_read
1929 ± 2% -8.6% 1763 ± 4% perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.do_syslog.part.0
1932 ± 2% -8.6% 1767 ± 4% perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
0.07 ±108% -77.8% 0.01 ± 62% perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.wait_for_completion.stop_one_cpu
72.66 -7.7 64.98 ± 10% perf-profile.calltrace.cycles-pp.__poll
60.43 -7.7 52.76 ± 10% perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
69.09 -7.6 61.48 ± 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
63.12 -7.5 55.66 ± 10% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
61.29 -7.4 53.91 ± 10% perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
24.23 -2.8 21.39 ± 10% perf-profile.calltrace.cycles-pp.__fget_files.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64
24.49 ± 2% +7.6 32.10 ± 22% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
24.49 ± 2% +7.6 32.10 ± 22% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
24.49 ± 2% +7.6 32.10 ± 22% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
24.46 ± 2% +7.6 32.09 ± 22% perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
24.46 ± 2% +7.6 32.09 ± 22% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
24.66 ± 2% +7.8 32.41 ± 22% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
24.42 ± 2% +7.8 32.23 ± 22% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
72.88 -7.7 65.22 ± 10% perf-profile.children.cycles-pp.__poll
69.17 -7.6 61.57 ± 10% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
63.19 -7.5 55.73 ± 10% perf-profile.children.cycles-pp.do_syscall_64
61.31 -7.4 53.93 ± 10% perf-profile.children.cycles-pp.__x64_sys_poll
60.96 -7.4 53.58 ± 10% perf-profile.children.cycles-pp.do_sys_poll
25.04 -2.8 22.19 ± 10% perf-profile.children.cycles-pp.__fget_files
0.33 ± 5% -0.1 0.25 ± 20% perf-profile.children.cycles-pp.perf_tp_event
24.49 ± 2% +7.6 32.10 ± 22% perf-profile.children.cycles-pp.start_secondary
24.66 ± 2% +7.8 32.41 ± 22% perf-profile.children.cycles-pp.secondary_startup_64_no_verify
24.66 ± 2% +7.8 32.41 ± 22% perf-profile.children.cycles-pp.cpu_startup_entry
24.66 ± 2% +7.8 32.41 ± 22% perf-profile.children.cycles-pp.do_idle
24.63 ± 2% +7.8 32.41 ± 22% perf-profile.children.cycles-pp.cpuidle_enter
24.63 ± 2% +7.8 32.41 ± 22% perf-profile.children.cycles-pp.cpuidle_enter_state
24.59 ± 2% +7.8 32.40 ± 22% perf-profile.children.cycles-pp.intel_idle
24.00 -2.8 21.20 ± 10% perf-profile.self.cycles-pp.__fget_files
12.90 -2.4 10.53 ± 10% perf-profile.self.cycles-pp.do_sys_poll
24.59 ± 2% +7.8 32.40 ± 22% perf-profile.self.cycles-pp.intel_idle
0.10 ± 10% +20.6% 0.12 ± 4% perf-stat.i.MPKI
0.18 +0.0 0.19 perf-stat.i.branch-miss-rate%
1.258e+08 +4.6% 1.315e+08 perf-stat.i.branch-misses
23216759 ± 5% +11.6% 25900213 ± 7% perf-stat.i.cache-references
0.70 -1.4% 0.69 perf-stat.i.cpi
0.05 -0.0 0.04 ± 3% perf-stat.i.dTLB-load-miss-rate%
36187253 -20.1% 28909422 ± 3% perf-stat.i.dTLB-load-misses
7.244e+10 +2.0% 7.393e+10 perf-stat.i.dTLB-loads
0.04 +0.0 0.04 perf-stat.i.dTLB-store-miss-rate%
18118830 +9.0% 19752182 perf-stat.i.dTLB-store-misses
4.61e+10 +3.5% 4.773e+10 perf-stat.i.dTLB-stores
27239224 +6.0% 28873516 ± 2% perf-stat.i.iTLB-load-misses
3.005e+11 +1.6% 3.054e+11 perf-stat.i.instructions
11048 -4.2% 10587 ± 2% perf-stat.i.instructions-per-iTLB-miss
1.43 +1.5% 1.46 perf-stat.i.ipc
1356 +1.2% 1372 perf-stat.i.metric.M/sec
0.08 ± 5% +9.4% 0.09 ± 7% perf-stat.overall.MPKI
0.16 +0.0 0.17 perf-stat.overall.branch-miss-rate%
0.70 -1.5% 0.69 perf-stat.overall.cpi
0.05 -0.0 0.04 ± 3% perf-stat.overall.dTLB-load-miss-rate%
0.04 +0.0 0.04 perf-stat.overall.dTLB-store-miss-rate%
11040 -4.1% 10589 ± 2% perf-stat.overall.instructions-per-iTLB-miss
1.44 +1.6% 1.46 perf-stat.overall.ipc
5022724 -6.7% 4684438 perf-stat.overall.path-length
1.255e+08 +4.5% 1.311e+08 perf-stat.ps.branch-misses
23272217 ± 5% +11.1% 25848995 ± 7% perf-stat.ps.cache-references
36073415 -20.2% 28792468 ± 3% perf-stat.ps.dTLB-load-misses
7.22e+10 +2.0% 7.361e+10 perf-stat.ps.dTLB-loads
18057011 +8.9% 19667743 perf-stat.ps.dTLB-store-misses
4.595e+10 +3.4% 4.753e+10 perf-stat.ps.dTLB-stores
27136115 +5.9% 28744716 ± 2% perf-stat.ps.iTLB-load-misses
2.995e+11 +1.5% 3.041e+11 perf-stat.ps.instructions
9.046e+13 +1.6% 9.19e+13 perf-stat.total.instructions
13893 ± 5% -20.4% 11054 ± 6% softirqs.CPU101.RCU
11428 ± 46% +119.3% 25066 ± 27% softirqs.CPU101.SCHED
37214 ± 12% -51.1% 18200 ± 63% softirqs.CPU106.SCHED
8255 ± 8% +15.2% 9512 ± 7% softirqs.CPU110.RCU
38004 ± 5% -24.1% 28839 ± 26% softirqs.CPU110.SCHED
10247 ± 9% +17.6% 12053 ± 5% softirqs.CPU113.RCU
33888 ± 11% -41.5% 19830 ± 16% softirqs.CPU113.SCHED
9500 ± 10% +30.2% 12366 ± 4% softirqs.CPU118.RCU
38569 ± 4% -54.4% 17583 ± 54% softirqs.CPU118.SCHED
9284 ± 11% -17.5% 7661 ± 12% softirqs.CPU126.RCU
14321 ± 27% +51.5% 21693 ± 18% softirqs.CPU13.SCHED
13294 ± 30% +108.6% 27734 ± 22% softirqs.CPU130.SCHED
7801 ± 8% +33.9% 10446 ± 10% softirqs.CPU133.RCU
34662 ± 13% -35.4% 22383 ± 17% softirqs.CPU133.SCHED
8945 ± 9% +31.6% 11769 ± 14% softirqs.CPU138.RCU
34958 ± 16% -43.5% 19740 ± 42% softirqs.CPU138.SCHED
9051 ± 4% +23.3% 11164 ± 15% softirqs.CPU141.RCU
30437 ± 15% -40.5% 18124 ± 47% softirqs.CPU15.SCHED
10040 ± 4% -18.4% 8190 ± 14% softirqs.CPU16.RCU
11827 ± 27% +164.1% 31241 ± 23% softirqs.CPU16.SCHED
20594 ± 23% -38.9% 12572 ± 52% softirqs.CPU20.SCHED
14656 ± 5% -25.4% 10931 ± 5% softirqs.CPU21.RCU
9461 ± 59% +192.0% 27630 ± 24% softirqs.CPU21.SCHED
35725 ± 7% -44.2% 19932 ± 35% softirqs.CPU27.SCHED
33917 ± 12% -46.8% 18044 ± 43% softirqs.CPU29.SCHED
12308 ± 10% -23.7% 9386 ± 14% softirqs.CPU34.RCU
8999 ± 65% +179.0% 25110 ± 40% softirqs.CPU34.SCHED
14707 ± 4% -20.2% 11729 ± 4% softirqs.CPU41.RCU
11362 ± 26% +106.9% 23512 ± 19% softirqs.CPU41.SCHED
15054 ± 7% -17.7% 12389 ± 8% softirqs.CPU42.RCU
8835 ± 45% +95.3% 17253 ± 35% softirqs.CPU42.SCHED
15106 ± 6% -25.1% 11310 ± 12% softirqs.CPU46.RCU
6446 ± 28% +298.2% 25667 ± 27% softirqs.CPU46.SCHED
28882 ± 23% -33.1% 19313 ± 23% softirqs.CPU5.SCHED
34743 ± 9% -27.3% 25246 ± 16% softirqs.CPU53.SCHED
9551 ± 2% +23.2% 11769 ± 10% softirqs.CPU56.RCU
27072 ± 13% -45.3% 14802 ± 41% softirqs.CPU56.SCHED
9149 ± 15% +23.3% 11285 ± 12% softirqs.CPU57.RCU
8294 ± 9% +37.7% 11420 ± 9% softirqs.CPU58.RCU
31613 ± 6% -51.5% 15332 ± 49% softirqs.CPU58.SCHED
11790 ± 9% -13.9% 10152 ± 6% softirqs.CPU61.RCU
10040 ± 57% +106.2% 20700 ± 17% softirqs.CPU61.SCHED
10515 ± 59% +126.3% 23797 ± 27% softirqs.CPU66.SCHED
11976 ± 11% -18.6% 9745 ± 5% softirqs.CPU77.RCU
30773 ± 13% -28.9% 21866 ± 7% softirqs.CPU85.SCHED
12405 ± 8% -16.3% 10383 ± 8% softirqs.CPU87.RCU
42921 ± 46% -68.9% 13333 ± 63% softirqs.CPU88.SCHED
10774 ± 10% -17.0% 8942 ± 5% softirqs.CPU92.RCU
35714 ± 14% -56.5% 15552 ± 26% softirqs.CPU93.SCHED
14302 ± 4% -17.9% 11748 ± 10% softirqs.CPU98.RCU
14121 ± 5% -12.5% 12355 ± 10% softirqs.CPU99.RCU
9617 ± 25% +120.3% 21192 ± 41% softirqs.CPU99.SCHED
279880 ± 11% -13.0% 243558 interrupts.CAL:Function_call_interrupts
221.25 ± 25% -59.3% 90.00 ± 67% interrupts.CPU101.RES:Rescheduling_interrupts
1179 ± 20% +42.3% 1678 ± 19% interrupts.CPU102.CAL:Function_call_interrupts
350.50 ± 66% +136.2% 827.75 ± 36% interrupts.CPU102.TLB:TLB_shootdowns
390.25 ± 26% -35.6% 251.25 ± 7% interrupts.CPU104.TLB:TLB_shootdowns
37.25 ±102% +260.4% 134.25 ± 62% interrupts.CPU106.RES:Rescheduling_interrupts
55.25 ± 45% +154.3% 140.50 ± 39% interrupts.CPU113.RES:Rescheduling_interrupts
2207 ± 34% +156.5% 5660 ± 28% interrupts.CPU114.NMI:Non-maskable_interrupts
2207 ± 34% +156.5% 5660 ± 28% interrupts.CPU114.PMI:Performance_monitoring_interrupts
31.50 ± 84% +165.1% 83.50 ± 54% interrupts.CPU114.RES:Rescheduling_interrupts
14.00 ± 74% +910.7% 141.50 ± 53% interrupts.CPU118.RES:Rescheduling_interrupts
90.25 ±108% +897.0% 899.75 ± 48% interrupts.CPU118.TLB:TLB_shootdowns
3236 ± 64% -48.5% 1666 ± 15% interrupts.CPU119.CAL:Function_call_interrupts
195.50 ± 35% -42.2% 113.00 ± 20% interrupts.CPU12.RES:Rescheduling_interrupts
235.75 ± 20% -41.3% 138.50 ± 46% interrupts.CPU125.RES:Rescheduling_interrupts
208.00 ± 36% -54.3% 95.00 ± 69% interrupts.CPU126.RES:Rescheduling_interrupts
167.00 ± 25% -65.6% 57.50 ± 69% interrupts.CPU128.RES:Rescheduling_interrupts
189.25 ± 25% -42.0% 109.75 ± 42% interrupts.CPU13.RES:Rescheduling_interrupts
211.25 ± 23% -75.9% 51.00 ± 62% interrupts.CPU130.RES:Rescheduling_interrupts
109.25 ± 53% -73.0% 29.50 ± 97% interrupts.CPU131.RES:Rescheduling_interrupts
4394 ± 45% -31.4% 3012 ± 8% interrupts.CPU136.NMI:Non-maskable_interrupts
4394 ± 45% -31.4% 3012 ± 8% interrupts.CPU136.PMI:Performance_monitoring_interrupts
107.25 ± 39% -59.2% 43.75 ± 96% interrupts.CPU136.RES:Rescheduling_interrupts
78.00 ± 63% -61.9% 29.75 ±101% interrupts.CPU137.RES:Rescheduling_interrupts
48.25 ± 45% +187.0% 138.50 ± 56% interrupts.CPU138.RES:Rescheduling_interrupts
107.50 ± 41% -78.1% 23.50 ± 77% interrupts.CPU139.RES:Rescheduling_interrupts
216.50 ± 36% -55.8% 95.75 ± 67% interrupts.CPU14.RES:Rescheduling_interrupts
2077 ± 22% -42.8% 1187 ± 31% interrupts.CPU140.CAL:Function_call_interrupts
1080 ± 15% +67.0% 1804 ± 24% interrupts.CPU15.CAL:Function_call_interrupts
1464 ± 2% +226.9% 4785 ± 41% interrupts.CPU15.NMI:Non-maskable_interrupts
1464 ± 2% +226.9% 4785 ± 41% interrupts.CPU15.PMI:Performance_monitoring_interrupts
1988 ± 5% -33.4% 1324 ± 30% interrupts.CPU16.CAL:Function_call_interrupts
5928 ± 31% -39.3% 3597 ± 29% interrupts.CPU16.NMI:Non-maskable_interrupts
5928 ± 31% -39.3% 3597 ± 29% interrupts.CPU16.PMI:Performance_monitoring_interrupts
214.75 ± 18% -74.5% 54.75 ± 80% interrupts.CPU16.RES:Rescheduling_interrupts
1287 ± 9% -62.9% 478.25 ± 84% interrupts.CPU16.TLB:TLB_shootdowns
2130 ± 4% -29.0% 1513 ± 24% interrupts.CPU21.CAL:Function_call_interrupts
236.75 ± 25% -73.8% 62.00 ± 68% interrupts.CPU21.RES:Rescheduling_interrupts
1389 ± 10% -51.2% 678.00 ± 54% interrupts.CPU21.TLB:TLB_shootdowns
2079 ± 24% +217.2% 6595 ± 33% interrupts.CPU25.NMI:Non-maskable_interrupts
2079 ± 24% +217.2% 6595 ± 33% interrupts.CPU25.PMI:Performance_monitoring_interrupts
1143 ± 20% +47.6% 1686 ± 16% interrupts.CPU26.CAL:Function_call_interrupts
37.50 ± 42% +194.7% 110.50 ± 42% interrupts.CPU27.RES:Rescheduling_interrupts
1930 ± 30% +120.7% 4259 ± 54% interrupts.CPU29.NMI:Non-maskable_interrupts
1930 ± 30% +120.7% 4259 ± 54% interrupts.CPU29.PMI:Performance_monitoring_interrupts
40.75 ± 72% +246.0% 141.00 ± 24% interrupts.CPU29.RES:Rescheduling_interrupts
1850 ± 4% +25.1% 2315 ± 6% interrupts.CPU32.CAL:Function_call_interrupts
3210 ± 54% -56.7% 1388 ± 34% interrupts.CPU34.CAL:Function_call_interrupts
233.50 ± 31% -64.2% 83.50 ± 89% interrupts.CPU34.RES:Rescheduling_interrupts
1310 ± 22% -57.1% 562.75 ± 95% interrupts.CPU34.TLB:TLB_shootdowns
213.50 ± 13% -60.3% 84.75 ± 38% interrupts.CPU41.RES:Rescheduling_interrupts
2145 ± 3% -36.5% 1362 ± 34% interrupts.CPU46.CAL:Function_call_interrupts
7988 ± 2% -27.7% 5774 ± 24% interrupts.CPU46.NMI:Non-maskable_interrupts
7988 ± 2% -27.7% 5774 ± 24% interrupts.CPU46.PMI:Performance_monitoring_interrupts
254.25 ± 16% -71.5% 72.50 ± 96% interrupts.CPU46.RES:Rescheduling_interrupts
1375 ± 6% -59.3% 559.50 ± 83% interrupts.CPU46.TLB:TLB_shootdowns
1177 ± 28% +29.1% 1519 ± 22% interrupts.CPU5.CAL:Function_call_interrupts
1798 ± 35% +144.9% 4404 ± 49% interrupts.CPU54.NMI:Non-maskable_interrupts
1798 ± 35% +144.9% 4404 ± 49% interrupts.CPU54.PMI:Performance_monitoring_interrupts
2151 ± 33% +222.7% 6940 ± 15% interrupts.CPU58.NMI:Non-maskable_interrupts
2151 ± 33% +222.7% 6940 ± 15% interrupts.CPU58.PMI:Performance_monitoring_interrupts
226.50 ± 18% -49.2% 115.00 ± 13% interrupts.CPU61.RES:Rescheduling_interrupts
5949 ± 31% -50.6% 2936 ± 33% interrupts.CPU66.NMI:Non-maskable_interrupts
5949 ± 31% -50.6% 2936 ± 33% interrupts.CPU66.PMI:Performance_monitoring_interrupts
229.00 ± 31% -67.5% 74.50 ± 63% interrupts.CPU66.RES:Rescheduling_interrupts
205.50 ± 13% -47.9% 107.00 ± 49% interrupts.CPU72.RES:Rescheduling_interrupts
193.25 ± 46% -39.8% 116.25 ± 53% interrupts.CPU73.RES:Rescheduling_interrupts
2213 ± 29% +123.1% 4938 ± 32% interrupts.CPU83.NMI:Non-maskable_interrupts
2213 ± 29% +123.1% 4938 ± 32% interrupts.CPU83.PMI:Performance_monitoring_interrupts
1185 ± 39% +74.9% 2072 ± 21% interrupts.CPU86.CAL:Function_call_interrupts
8124 -61.1% 3163 ± 57% interrupts.CPU87.NMI:Non-maskable_interrupts
8124 -61.1% 3163 ± 57% interrupts.CPU87.PMI:Performance_monitoring_interrupts
54.00 ± 57% +205.6% 165.00 ± 46% interrupts.CPU88.RES:Rescheduling_interrupts
179.75 ± 37% +433.4% 958.75 ± 42% interrupts.CPU88.TLB:TLB_shootdowns
4996 ± 38% +38.6% 6925 ± 17% interrupts.CPU91.NMI:Non-maskable_interrupts
4996 ± 38% +38.6% 6925 ± 17% interrupts.CPU91.PMI:Performance_monitoring_interrupts
910.25 ± 19% +94.8% 1773 ± 36% interrupts.CPU93.CAL:Function_call_interrupts
28.75 ±102% +433.0% 153.25 ± 34% interrupts.CPU93.RES:Rescheduling_interrupts
111.75 ±153% +646.5% 834.25 ± 47% interrupts.CPU93.TLB:TLB_shootdowns
229.00 ± 24% -58.3% 95.50 ± 71% interrupts.CPU98.RES:Rescheduling_interrupts
will-it-scale.per_thread_ops
[ASCII plot not recoverable here: per-sample will-it-scale.per_thread_ops, y-axis from 245000 to 275000]
[*] bisect-good sample
[O] bisect-bad sample
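The headline figures for lkp-hsw-4ex1 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers copied from the table above. One assumption is made and labeled in the code: that perf-stat.overall.path-length is total instructions divided by will-it-scale.workload. That reading is inferred from the numbers and is not stated in the report.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Values copied from the lkp-hsw-4ex1 table.
workload_before, workload_after = 18011165, 19618836  # will-it-scale.workload
threads = 72  # from the will-it-scale.72.threads row
instr_before, instr_after = 9.046e13, 9.19e13  # perf-stat.total.instructions

throughput_gain = pct_change(workload_before, workload_after)
per_thread_before = workload_before / threads
per_thread_after = workload_after / threads

# Assumption: path-length = total instructions / workload.
path_before = instr_before / workload_before
path_after = instr_after / workload_after
path_change = pct_change(path_before, path_after)

print(f"throughput change:  {throughput_gain:+.1f}%")
print(f"per-thread ops:     {per_thread_before:.0f} -> {per_thread_after:.0f}")
print(f"path-length change: {path_change:+.1f}%")
```

Under that assumption, the results agree with the reported rows to within rounding: roughly 7% fewer instructions per operation, which fits a fix that removes an out-of-line call from the poll path.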
***************************************************************************************************
lkp-ivb-2ep1: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/16/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e
commit:
a91bd6223e ("Revert "init/console: Use ttynull as a fallback when there is no console"")
ef0ba05538 ("poll: fix performance regression due to out-of-line __put_user()")
a91bd6223ecd46ad ef0ba05538299f1391cbe097de3
---------------- ---------------------------
%stddev %change %stddev
\ | \
4103673 +7.1% 4394585 will-it-scale.16.threads
256479 +7.1% 274661 will-it-scale.per_thread_ops
4103673 +7.1% 4394585 will-it-scale.workload
766.75 +2.6% 786.89 ± 2% boot-time.idle
54928 -2.0% 53825 proc-vmstat.pgreuse
68.41 ± 11% +24.0% 84.79 ± 17% sched_debug.cfs_rq:/.load_avg.stddev
34.08 ± 30% +54.1% 52.50 ± 26% sched_debug.cfs_rq:/.removed.load_avg.stddev
72.95 ± 20% +45.8% 106.38 ± 31% sched_debug.cfs_rq:/.removed.runnable_avg.max
11.88 ± 28% +68.7% 20.04 ± 32% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
72.95 ± 20% +45.8% 106.38 ± 31% sched_debug.cfs_rq:/.removed.util_avg.max
11.88 ± 28% +68.8% 20.04 ± 32% sched_debug.cfs_rq:/.removed.util_avg.stddev
0.00 ± 17% +47.4% 0.01 ± 17% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.path_openat
0.02 ± 7% -35.1% 0.02 ± 18% perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
1567 ± 7% -34.7% 1024 ± 35% perf-sched.wait_and_delay.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
6003 ± 14% -31.0% 4141 ± 10% perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork
1567 ± 7% -34.7% 1024 ± 35% perf-sched.wait_time.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
3.16 ± 15% -35.1% 2.05 ± 58% perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork
6002 ± 14% -31.0% 4141 ± 10% perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork
0.69 ± 6% +0.2 0.86 ± 16% perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.01 ±173% +0.0 0.06 ± 14% perf-profile.children.cycles-pp.clockevents_program_event
0.01 ±173% +0.1 0.06 ± 17% perf-profile.children.cycles-pp.poll_select_set_timeout
0.18 ± 15% +0.1 0.27 ± 20% perf-profile.children.cycles-pp.__virt_addr_valid
0.71 ± 6% +0.2 0.89 ± 15% perf-profile.children.cycles-pp.__check_object_size
0.11 ± 7% +0.0 0.15 ± 14% perf-profile.self.cycles-pp.do_syscall_64
0.01 ±173% +0.1 0.06 ± 17% perf-profile.self.cycles-pp.poll_select_set_timeout
0.17 ± 16% +0.1 0.26 ± 21% perf-profile.self.cycles-pp.__virt_addr_valid
0.64 ± 6% +0.1 0.73 ± 11% perf-profile.self.cycles-pp.__fdget
18348 ± 6% +14.1% 20941 ± 5% softirqs.CPU2.RCU
7162 ± 5% +25.8% 9007 ± 8% softirqs.CPU20.RCU
10576 ± 13% +19.6% 12644 ± 11% softirqs.CPU25.RCU
10627 ± 11% +31.5% 13970 ± 18% softirqs.CPU29.RCU
9132 ± 9% +28.2% 11710 ± 10% softirqs.CPU33.RCU
9969 ± 16% +27.4% 12699 ± 10% softirqs.CPU34.RCU
9463 ± 3% +14.6% 10843 ± 5% softirqs.CPU37.RCU
9952 ± 7% +21.0% 12041 ± 6% softirqs.CPU38.RCU
15774 ± 8% +15.8% 18261 ± 3% softirqs.CPU4.RCU
6414 ± 5% +27.1% 8151 ± 18% softirqs.CPU42.RCU
7342 ± 3% +23.3% 9057 ± 10% softirqs.CPU44.RCU
15373 ± 8% +17.8% 18113 ± 5% softirqs.CPU8.RCU
1.729e+10 -2.7% 1.682e+10 perf-stat.i.branch-instructions
0.21 +0.0 0.23 ± 2% perf-stat.i.branch-miss-rate%
34023615 +5.6% 35930017 ± 2% perf-stat.i.branch-misses
0.09 +0.0 0.09 perf-stat.i.dTLB-store-miss-rate%
8987462 +7.2% 9630858 perf-stat.i.dTLB-store-misses
1.053e+10 +2.1% 1.075e+10 perf-stat.i.dTLB-stores
5043576 +7.2% 5405769 ± 2% perf-stat.i.iTLB-load-misses
13393 -7.0% 12449 perf-stat.i.instructions-per-iTLB-miss
0.20 +0.0 0.21 ± 2% perf-stat.overall.branch-miss-rate%
0.09 +0.0 0.09 perf-stat.overall.dTLB-store-miss-rate%
13396 -7.0% 12452 perf-stat.overall.instructions-per-iTLB-miss
4961998 -7.1% 4607701 perf-stat.overall.path-length
1.723e+10 -2.7% 1.677e+10 perf-stat.ps.branch-instructions
33921356 +5.6% 35811411 ± 2% perf-stat.ps.branch-misses
8957329 +7.2% 9598546 perf-stat.ps.dTLB-store-misses
1.05e+10 +2.1% 1.072e+10 perf-stat.ps.dTLB-stores
5026750 +7.2% 5387439 ± 2% perf-stat.ps.iTLB-load-misses
6123 ± 73% -93.8% 378.00 ± 87% interrupts.40:PCI-MSI.2621446-edge.eth0-TxRx-5
6622 ± 27% -47.9% 3449 ± 42% interrupts.CPU10.NMI:Non-maskable_interrupts
6622 ± 27% -47.9% 3449 ± 42% interrupts.CPU10.PMI:Performance_monitoring_interrupts
7050 ± 18% -50.1% 3516 ± 18% interrupts.CPU11.NMI:Non-maskable_interrupts
7050 ± 18% -50.1% 3516 ± 18% interrupts.CPU11.PMI:Performance_monitoring_interrupts
395.50 ± 23% -46.6% 211.25 ± 27% interrupts.CPU12.TLB:TLB_shootdowns
4738 ± 42% -39.4% 2872 ± 28% interrupts.CPU14.NMI:Non-maskable_interrupts
4738 ± 42% -39.4% 2872 ± 28% interrupts.CPU14.PMI:Performance_monitoring_interrupts
7226 ± 24% -47.1% 3826 ± 16% interrupts.CPU2.NMI:Non-maskable_interrupts
7226 ± 24% -47.1% 3826 ± 16% interrupts.CPU2.PMI:Performance_monitoring_interrupts
430.25 ± 68% -55.3% 192.25 ± 23% interrupts.CPU21.NMI:Non-maskable_interrupts
430.25 ± 68% -55.3% 192.25 ± 23% interrupts.CPU21.PMI:Performance_monitoring_interrupts
1706 ± 16% -30.7% 1181 ± 16% interrupts.CPU23.CAL:Function_call_interrupts
1036 ± 11% +14.8% 1189 ± 4% interrupts.CPU27.CAL:Function_call_interrupts
156.50 ± 75% +113.4% 334.00 ± 19% interrupts.CPU27.TLB:TLB_shootdowns
1033 ± 10% +19.6% 1235 ± 11% interrupts.CPU29.CAL:Function_call_interrupts
135.50 ± 91% +171.4% 367.75 ± 40% interrupts.CPU29.TLB:TLB_shootdowns
6123 ± 73% -93.8% 378.00 ± 87% interrupts.CPU31.40:PCI-MSI.2621446-edge.eth0-TxRx-5
1029 ± 3% +16.6% 1200 ± 10% interrupts.CPU37.CAL:Function_call_interrupts
1197 ± 8% -16.1% 1005 ± 10% interrupts.CPU5.CAL:Function_call_interrupts
333.75 ± 30% -62.5% 125.00 ± 92% interrupts.CPU5.TLB:TLB_shootdowns
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.11.0-rc2-00182-gef0ba0553829" of type "text/plain" (172414 bytes)
View attachment "job-script" of type "text/plain" (7795 bytes)
View attachment "job.yaml" of type "text/plain" (5351 bytes)
View attachment "reproduce" of type "text/plain" (336 bytes)