Message-ID: <20201203051918.GC27350@xsang-OptiPlex-9020>
Date: Thu, 3 Dec 2020 13:19:18 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Nadav Amit <nadav.amit@...il.com>
Cc: 0day robot <lkp@...el.com>, Jens Axboe <axboe@...nel.dk>,
Andrea Arcangeli <aarcange@...hat.com>,
Peter Xu <peterx@...hat.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com,
linux-fsdevel@...r.kernel.org, Nadav Amit <namit@...are.com>,
io-uring@...r.kernel.org, linux-mm@...ck.org
Subject: [fs/userfaultfd] fec9227821: will-it-scale.per_process_ops -5.5% regression
Greetings,
FYI, we noticed a -5.5% regression of will-it-scale.per_process_ops due to commit:
commit: fec92278217ba01b4a3b9f9ec0f6a392069cdbd0 ("[RFC PATCH 12/13] fs/userfaultfd: kmem-cache for wait-queue objects")
url: https://github.com/0day-ci/linux/commits/Nadav-Amit/fs-userfaultfd-support-iouring-and-polling/20201129-085119
base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
in testcase: will-it-scale
on test machine: 104 threads Skylake with 192G memory
with following parameters:
nr_task: 50%
mode: process
test: brk1
cpufreq_governor: performance
ucode: 0x2006a08
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process- and thread-based variants of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
In addition, the commit has a significant impact on the following test:
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -11.0% regression |
| test machine | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=16 |
| | test=brk1 |
| | ucode=0x5003003 |
+------------------+---------------------------------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-skl-fpga01/brk1/will-it-scale/0x2006a08
commit:
ddfa740e9c ("fs/userfaultfd: complete write asynchronously")
fec9227821 ("fs/userfaultfd: kmem-cache for wait-queue objects")
ddfa740e9caf7642 fec92278217ba01b4a3b9f9ec0f
---------------- ---------------------------
%stddev %change %stddev
\ | \
65219467 -5.5% 61607693 will-it-scale.52.processes
1254220 -5.5% 1184763 will-it-scale.per_process_ops
65219467 -5.5% 61607693 will-it-scale.workload
20.00 -5.0% 19.00 vmstat.cpu.us
34.22 -4.0% 32.85 ± 2% boot-time.boot
3146 -4.3% 3010 ± 2% boot-time.idle
654.25 ± 20% -39.0% 399.25 ± 25% numa-vmstat.node0.nr_active_anon
654.25 ± 20% -39.0% 399.25 ± 25% numa-vmstat.node0.nr_zone_active_anon
10140 ± 9% +27.1% 12889 ± 10% numa-vmstat.node1.nr_slab_reclaimable
21388 ± 3% +13.2% 24204 ± 7% numa-vmstat.node1.nr_slab_unreclaimable
1096 ± 8% +304.6% 4434 slabinfo.dmaengine-unmap-16.active_objs
1096 ± 8% +304.6% 4434 slabinfo.dmaengine-unmap-16.num_objs
4838 ± 4% -17.0% 4018 ± 3% slabinfo.eventpoll_pwq.active_objs
4838 ± 4% -17.0% 4018 ± 3% slabinfo.eventpoll_pwq.num_objs
2689 ± 18% -37.7% 1675 ± 22% numa-meminfo.node0.Active
2617 ± 20% -38.9% 1599 ± 25% numa-meminfo.node0.Active(anon)
40564 ± 9% +27.1% 51560 ± 10% numa-meminfo.node1.KReclaimable
40564 ± 9% +27.1% 51560 ± 10% numa-meminfo.node1.SReclaimable
85552 ± 3% +13.2% 96818 ± 7% numa-meminfo.node1.SUnreclaim
126118 ± 4% +17.7% 148380 ± 8% numa-meminfo.node1.Slab
7.12 ± 17% -87.9% 0.86 ±100% sched_debug.cfs_rq:/.removed.load_avg.avg
33.50 ± 8% -74.6% 8.50 ±100% sched_debug.cfs_rq:/.removed.load_avg.stddev
2.76 ± 27% -88.1% 0.33 ±102% sched_debug.cfs_rq:/.removed.runnable_avg.avg
80.58 ± 9% -60.1% 32.12 ±102% sched_debug.cfs_rq:/.removed.runnable_avg.max
13.39 ± 18% -75.9% 3.23 ±102% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
2.76 ± 28% -88.1% 0.33 ±102% sched_debug.cfs_rq:/.removed.util_avg.avg
80.58 ± 9% -60.1% 32.12 ±102% sched_debug.cfs_rq:/.removed.util_avg.max
13.39 ± 18% -75.9% 3.23 ±102% sched_debug.cfs_rq:/.removed.util_avg.stddev
1036 ± 8% +14.0% 1181 ± 8% sched_debug.cpu.nr_switches.min
-22.25 -30.3% -15.50 sched_debug.cpu.nr_uninterruptible.min
2.50 ± 91% +7990.0% 202.25 ±166% interrupts.CPU1.TLB:TLB_shootdowns
451.00 +12.8% 508.75 ± 5% interrupts.CPU100.CAL:Function_call_interrupts
457.50 ± 3% +12.3% 514.00 ± 8% interrupts.CPU103.CAL:Function_call_interrupts
48.75 ±130% -89.7% 5.00 ±122% interrupts.CPU15.RES:Rescheduling_interrupts
3195 ± 18% +140.3% 7678 interrupts.CPU24.NMI:Non-maskable_interrupts
3195 ± 18% +140.3% 7678 interrupts.CPU24.PMI:Performance_monitoring_interrupts
8.25 ± 41% +1009.1% 91.50 ± 49% interrupts.CPU24.RES:Rescheduling_interrupts
694.25 ± 28% +89.6% 1316 ± 24% interrupts.CPU3.CAL:Function_call_interrupts
3946 ± 46% +86.3% 7352 ± 12% interrupts.CPU30.NMI:Non-maskable_interrupts
3946 ± 46% +86.3% 7352 ± 12% interrupts.CPU30.PMI:Performance_monitoring_interrupts
30.00 ±115% +200.8% 90.25 ± 51% interrupts.CPU36.RES:Rescheduling_interrupts
7.50 ± 14% +1123.3% 91.75 ± 51% interrupts.CPU40.RES:Rescheduling_interrupts
10.50 ± 38% +590.5% 72.50 ± 60% interrupts.CPU42.RES:Rescheduling_interrupts
449.00 +214.1% 1410 ±107% interrupts.CPU76.CAL:Function_call_interrupts
448.75 +99.8% 896.75 ± 51% interrupts.CPU82.CAL:Function_call_interrupts
453.25 +78.7% 809.75 ± 50% interrupts.CPU86.CAL:Function_call_interrupts
456.00 +145.0% 1117 ± 93% interrupts.CPU90.CAL:Function_call_interrupts
72.75 ± 82% -89.7% 7.50 ± 33% interrupts.CPU92.RES:Rescheduling_interrupts
2.00 ± 79% +1737.5% 36.75 ±146% interrupts.CPU92.TLB:TLB_shootdowns
5545 ± 32% +32.6% 7353 ± 12% interrupts.CPU93.NMI:Non-maskable_interrupts
5545 ± 32% +32.6% 7353 ± 12% interrupts.CPU93.PMI:Performance_monitoring_interrupts
10.50 ± 10% +514.3% 64.50 ± 76% interrupts.CPU93.RES:Rescheduling_interrupts
2.683e+10 +3.7% 2.781e+10 perf-stat.i.branch-instructions
0.68 -0.1 0.63 perf-stat.i.branch-miss-rate%
1.811e+08 -5.2% 1.718e+08 perf-stat.i.branch-misses
1.12 -4.6% 1.07 perf-stat.i.cpi
0.17 -0.0 0.15 perf-stat.i.dTLB-load-miss-rate%
64926279 -5.5% 61335249 perf-stat.i.dTLB-load-misses
3.779e+10 +5.6% 3.99e+10 perf-stat.i.dTLB-loads
2.1e+10 +2.7% 2.157e+10 perf-stat.i.dTLB-stores
1.292e+11 +4.6% 1.352e+11 perf-stat.i.instructions
1957 +3.7% 2029 perf-stat.i.instructions-per-iTLB-miss
0.89 +4.8% 0.94 perf-stat.i.ipc
823.71 +4.3% 858.87 perf-stat.i.metric.M/sec
0.67 -0.1 0.62 perf-stat.overall.branch-miss-rate%
1.12 -4.6% 1.07 perf-stat.overall.cpi
0.17 -0.0 0.15 perf-stat.overall.dTLB-load-miss-rate%
1933 +3.6% 2004 perf-stat.overall.instructions-per-iTLB-miss
0.89 +4.8% 0.94 perf-stat.overall.ipc
82.14 +1.7 83.85 perf-stat.overall.node-store-miss-rate%
597331 +10.8% 662119 perf-stat.overall.path-length
2.674e+10 +3.7% 2.772e+10 perf-stat.ps.branch-instructions
1.804e+08 -5.2% 1.71e+08 perf-stat.ps.branch-misses
64722645 -5.5% 61153001 perf-stat.ps.dTLB-load-misses
3.766e+10 +5.6% 3.976e+10 perf-stat.ps.dTLB-loads
2.093e+10 +2.7% 2.15e+10 perf-stat.ps.dTLB-stores
1.288e+11 +4.6% 1.347e+11 perf-stat.ps.instructions
3.896e+13 +4.7% 4.079e+13 perf-stat.total.instructions
19290 ± 14% -31.0% 13316 ± 5% softirqs.CPU13.RCU
22289 ± 79% -44.0% 12473 ±110% softirqs.CPU18.SCHED
19387 ± 12% -26.7% 14206 ± 6% softirqs.CPU21.RCU
14997 ± 5% +51.6% 22739 ± 2% softirqs.CPU24.RCU
39995 ± 3% -88.9% 4457 softirqs.CPU24.SCHED
22221 ± 79% -73.2% 5963 ± 42% softirqs.CPU28.SCHED
18559 ± 24% -28.7% 13237 ± 7% softirqs.CPU33.RCU
16004 ± 19% +31.9% 21107 ± 4% softirqs.CPU34.RCU
22675 ± 7% -31.0% 15655 ± 18% softirqs.CPU35.RCU
4273 ± 17% +620.7% 30798 ± 48% softirqs.CPU35.SCHED
20207 ± 16% -23.6% 15448 ± 19% softirqs.CPU37.RCU
15311 ± 19% +37.4% 21044 ± 7% softirqs.CPU4.RCU
30669 ± 48% -68.4% 9687 ± 89% softirqs.CPU40.SCHED
20195 ± 15% -23.5% 15442 ± 20% softirqs.CPU41.RCU
22191 ± 25% -37.8% 13806 ± 10% softirqs.CPU43.RCU
16782 ± 14% -21.8% 13122 ± 4% softirqs.CPU47.RCU
22290 ± 8% -22.0% 17381 ± 22% softirqs.CPU49.RCU
22338 ± 79% -79.7% 4526 softirqs.CPU61.SCHED
30860 ± 49% -85.3% 4533 softirqs.CPU65.SCHED
24975 ± 57% -82.2% 4447 softirqs.CPU73.SCHED
20318 ± 6% -39.8% 12236 ± 2% softirqs.CPU76.RCU
4615 ± 5% +761.7% 39773 ± 2% softirqs.CPU76.SCHED
21142 ± 3% -29.2% 14979 ± 9% softirqs.CPU82.RCU
13144 ±113% +199.0% 39305 ± 3% softirqs.CPU86.SCHED
39713 ± 4% -67.4% 12956 ±110% softirqs.CPU87.SCHED
17739 ± 16% -22.2% 13795 ± 4% softirqs.CPU88.RCU
18651 ± 15% -27.5% 13514 ± 11% softirqs.CPU92.RCU
30590 ± 48% -57.5% 12998 ±111% softirqs.CPU93.SCHED
15264 ± 17% +26.7% 19337 ± 5% softirqs.CPU95.RCU
1.33 ± 10% -0.1 1.20 ± 10% perf-profile.calltrace.cycles-pp.find_vma.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.80 ± 11% -0.1 0.69 ± 11% perf-profile.calltrace.cycles-pp.security_mmap_addr.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64
0.00 +0.8 0.76 ± 9% perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
0.00 +0.9 0.94 ± 4% perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
0.00 +2.3 2.29 ± 13% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +2.5 2.51 ± 12% perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
0.55 ± 10% -0.3 0.28 ± 14% perf-profile.children.cycles-pp.vma_merge
1.81 ± 10% -0.2 1.59 ± 10% perf-profile.children.cycles-pp.get_unmapped_area
1.72 ± 10% -0.2 1.54 ± 10% perf-profile.children.cycles-pp.find_vma
0.30 ± 9% -0.1 0.15 ± 11% perf-profile.children.cycles-pp.cap_capable
0.82 ± 11% -0.1 0.70 ± 11% perf-profile.children.cycles-pp.security_mmap_addr
0.57 ± 11% -0.1 0.50 ± 9% perf-profile.children.cycles-pp.obj_cgroup_charge
0.32 ± 10% -0.1 0.25 ± 11% perf-profile.children.cycles-pp.__vm_enough_memory
0.32 ± 12% -0.1 0.26 ± 9% perf-profile.children.cycles-pp.__x86_retpoline_rax
0.46 ± 9% -0.1 0.41 ± 11% perf-profile.children.cycles-pp.vmacache_find
0.22 ± 11% -0.0 0.19 ± 10% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.24 ± 9% -0.0 0.21 ± 11% perf-profile.children.cycles-pp.free_pgd_range
0.00 +0.1 0.08 ± 10% perf-profile.children.cycles-pp.should_failslab
2.83 ± 11% +0.7 3.49 ± 7% perf-profile.children.cycles-pp.kmem_cache_free
0.00 +0.8 0.77 ± 9% perf-profile.children.cycles-pp.memset_erms
4.08 ± 11% +1.9 6.03 ± 11% perf-profile.children.cycles-pp.kmem_cache_alloc
0.21 ± 10% +2.3 2.52 ± 12% perf-profile.children.cycles-pp.userfaultfd_unmap_complete
0.53 ± 9% -0.3 0.27 ± 14% perf-profile.self.cycles-pp.vma_merge
0.28 ± 11% -0.1 0.14 ± 11% perf-profile.self.cycles-pp.cap_capable
0.99 ± 10% -0.1 0.88 ± 11% perf-profile.self.cycles-pp.unmap_page_range
0.78 ± 11% -0.1 0.69 ± 9% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.70 ± 11% -0.1 0.62 ± 10% perf-profile.self.cycles-pp.vm_area_alloc
0.41 ± 11% -0.1 0.34 ± 12% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.55 ± 12% -0.1 0.49 ± 9% perf-profile.self.cycles-pp.obj_cgroup_charge
0.44 ± 9% -0.1 0.39 ± 11% perf-profile.self.cycles-pp.vmacache_find
0.25 ± 12% -0.1 0.20 ± 10% perf-profile.self.cycles-pp.__x86_retpoline_rax
0.36 ± 11% -0.0 0.31 ± 10% perf-profile.self.cycles-pp.security_mmap_addr
0.19 ± 11% -0.0 0.16 ± 10% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.10 ± 12% -0.0 0.08 ± 13% perf-profile.self.cycles-pp.__vm_enough_memory
0.48 ± 10% +0.1 0.61 ± 9% perf-profile.self.cycles-pp.cap_vm_enough_memory
0.00 +0.7 0.73 ± 10% perf-profile.self.cycles-pp.memset_erms
1.86 ± 11% +0.8 2.62 ± 7% perf-profile.self.cycles-pp.kmem_cache_free
1.91 ± 11% +0.8 2.74 ± 12% perf-profile.self.cycles-pp.kmem_cache_alloc
will-it-scale.52.processes
6.6e+07 +----------------------------------------------------------------+
6.55e+07 |.+..+.+.+.. .+..+.+.+..+.+. |
| .+..+.+.+ +..+.+.+ |
6.5e+07 |-+ +.+. .+.+.+..+.+ |
6.45e+07 |-+ +. |
| |
6.4e+07 |-+ |
6.35e+07 |-+ |
6.3e+07 |-+ |
| |
6.25e+07 |-+ |
6.2e+07 |-+ |
| O O O O O O O O O O O O O O O O O |
6.15e+07 |-O O O O O O O O O |
6.1e+07 +----------------------------------------------------------------+
will-it-scale.per_process_ops
1.27e+06 +----------------------------------------------------------------+
1.26e+06 |.+..+.+.+.. .+..+.+.+..+.+. |
| .+..+.+.+ +..+.+.+ |
1.25e+06 |-+ +.+.+..+.+.+..+.+ |
1.24e+06 |-+ |
| |
1.23e+06 |-+ |
1.22e+06 |-+ |
1.21e+06 |-+ |
| |
1.2e+06 |-+ |
1.19e+06 |-+ O O |
| O O O O O O O O O O O O O O O O O O O O O O |
1.18e+06 |-O O O |
1.17e+06 +----------------------------------------------------------------+
will-it-scale.workload
6.6e+07 +----------------------------------------------------------------+
6.55e+07 |.+..+.+.+.. .+..+.+.+..+.+. |
| .+..+.+.+ +..+.+.+ |
6.5e+07 |-+ +.+. .+.+.+..+.+ |
6.45e+07 |-+ +. |
| |
6.4e+07 |-+ |
6.35e+07 |-+ |
6.3e+07 |-+ |
| |
6.25e+07 |-+ |
6.2e+07 |-+ |
| O O O O O O O O O O O O O O O O O |
6.15e+07 |-O O O O O O O O O |
6.1e+07 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-csl-2ap2: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/brk1/will-it-scale/0x5003003
commit:
ddfa740e9c ("fs/userfaultfd: complete write asynchronously")
fec9227821 ("fs/userfaultfd: kmem-cache for wait-queue objects")
ddfa740e9caf7642 fec92278217ba01b4a3b9f9ec0f
---------------- ---------------------------
%stddev %change %stddev
\ | \
46606610 -11.0% 41486565 will-it-scale.16.processes
2912912 -11.0% 2592909 will-it-scale.per_process_ops
46606610 -11.0% 41486565 will-it-scale.workload
0.72 -0.1 0.65 mpstat.cpu.all.usr%
17741 -4.4% 16964 proc-vmstat.nr_shmem
-116535 -515.3% 484006 ± 50% sched_debug.cfs_rq:/.spread0.avg
1380 ± 6% +495.6% 8222 slabinfo.dmaengine-unmap-16.active_objs
32.50 ± 7% +500.0% 195.00 slabinfo.dmaengine-unmap-16.active_slabs
1380 ± 6% +495.6% 8222 slabinfo.dmaengine-unmap-16.num_objs
32.50 ± 7% +500.0% 195.00 slabinfo.dmaengine-unmap-16.num_slabs
11962 ± 7% -17.3% 9891 ± 12% softirqs.CPU10.RCU
10075 ± 23% +28.9% 12985 ± 4% softirqs.CPU110.RCU
42801 ± 4% -5.8% 40327 ± 2% softirqs.CPU136.SCHED
42633 ± 4% -15.2% 36169 ± 18% softirqs.CPU137.SCHED
42786 ± 4% -6.8% 39864 softirqs.CPU156.SCHED
11795 ± 8% -16.6% 9835 ± 11% softirqs.CPU2.RCU
42004 ± 4% -5.9% 39537 ± 3% softirqs.CPU25.SCHED
39956 ± 4% -65.4% 13836 ±110% softirqs.CPU5.SCHED
9734 ± 8% -13.2% 8450 ± 8% softirqs.CPU68.RCU
41424 ± 4% -14.7% 35347 ± 19% softirqs.CPU87.SCHED
1.935e+10 -2.0% 1.895e+10 perf-stat.i.branch-instructions
0.61 +2.3% 0.62 perf-stat.i.cpi
1.494e+10 -2.9% 1.451e+10 perf-stat.i.dTLB-stores
9.271e+10 -1.1% 9.17e+10 perf-stat.i.instructions
1.64 -2.2% 1.61 perf-stat.i.ipc
320.23 -1.4% 315.65 perf-stat.i.metric.M/sec
0.61 +2.3% 0.62 perf-stat.overall.cpi
1.65 -2.2% 1.61 perf-stat.overall.ipc
601140 +10.9% 666775 perf-stat.overall.path-length
1.928e+10 -2.0% 1.889e+10 perf-stat.ps.branch-instructions
1.489e+10 -2.9% 1.446e+10 perf-stat.ps.dTLB-stores
9.24e+10 -1.1% 9.139e+10 perf-stat.ps.instructions
2.802e+13 -1.3% 2.766e+13 perf-stat.total.instructions
0.01 ± 25% +188.2% 0.02 ± 57% perf-sched.sch_delay.avg.ms.do_syslog.part.0.kmsg_read.vfs_read
0.01 ± 15% -46.6% 0.01 ± 42% perf-sched.sch_delay.avg.ms.schedule_timeout.wait_for_completion.__flush_work.lru_add_drain_all
0.01 ± 22% +324.4% 0.05 ± 67% perf-sched.sch_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
0.01 ± 15% -43.1% 0.01 ± 41% perf-sched.sch_delay.max.ms.schedule_timeout.wait_for_completion.__flush_work.lru_add_drain_all
0.03 ± 23% -78.0% 0.01 ±173% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
605.16 ± 7% +13.3% 685.54 ± 5% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
4.35 ± 10% +19.2% 5.19 ± 4% perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
54.50 ± 9% -18.8% 44.25 ± 5% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
2295 ± 10% -17.2% 1900 ± 4% perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
0.43 ±143% -92.6% 0.03 ±173% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
85.77 ± 63% +111.9% 181.78 ± 16% perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
0.03 ± 23% -25.0% 0.02 ± 11% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
605.15 ± 7% +13.3% 685.54 ± 5% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
4.34 ± 10% +19.1% 5.17 ± 4% perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
4.24 ± 10% +64.4% 6.97 ± 49% perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork
85.73 ± 63% +112.0% 181.73 ± 16% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
8753 -54.6% 3974 ± 70% interrupts.CPU101.NMI:Non-maskable_interrupts
8753 -54.6% 3974 ± 70% interrupts.CPU101.PMI:Performance_monitoring_interrupts
1.75 ± 47% +8342.9% 147.75 ±168% interrupts.CPU137.RES:Rescheduling_interrupts
112.75 ± 8% +40.1% 158.00 ± 19% interrupts.CPU145.NMI:Non-maskable_interrupts
112.75 ± 8% +40.1% 158.00 ± 19% interrupts.CPU145.PMI:Performance_monitoring_interrupts
1251 ± 31% +151.4% 3145 ± 43% interrupts.CPU149.CAL:Function_call_interrupts
117.50 ± 7% +27.7% 150.00 ± 9% interrupts.CPU159.NMI:Non-maskable_interrupts
117.50 ± 7% +27.7% 150.00 ± 9% interrupts.CPU159.PMI:Performance_monitoring_interrupts
115.25 ± 9% -26.7% 84.50 ± 20% interrupts.CPU161.NMI:Non-maskable_interrupts
115.25 ± 9% -26.7% 84.50 ± 20% interrupts.CPU161.PMI:Performance_monitoring_interrupts
8756 -50.5% 4334 ± 58% interrupts.CPU2.NMI:Non-maskable_interrupts
8756 -50.5% 4334 ± 58% interrupts.CPU2.PMI:Performance_monitoring_interrupts
113.75 ± 8% +26.6% 144.00 ± 8% interrupts.CPU49.NMI:Non-maskable_interrupts
113.75 ± 8% +26.6% 144.00 ± 8% interrupts.CPU49.PMI:Performance_monitoring_interrupts
98.75 ± 22% +44.3% 142.50 ± 19% interrupts.CPU66.NMI:Non-maskable_interrupts
98.75 ± 22% +44.3% 142.50 ± 19% interrupts.CPU66.PMI:Performance_monitoring_interrupts
1.50 ±110% +4266.7% 65.50 ±129% interrupts.CPU98.RES:Rescheduling_interrupts
228023 ± 7% -16.3% 190922 ± 7% interrupts.NMI:Non-maskable_interrupts
228023 ± 7% -16.3% 190922 ± 7% interrupts.PMI:Performance_monitoring_interrupts
0.66 ± 31% +0.2 0.90 ± 30% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt
0.87 ± 9% +0.3 1.18 ± 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.brk
1.06 ± 16% +0.4 1.42 ± 21% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
1.08 ± 16% +0.4 1.46 ± 22% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
1.09 ± 16% +0.4 1.47 ± 23% perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.00 +0.6 0.58 ± 3% perf-profile.calltrace.cycles-pp.___might_sleep.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
0.00 +1.7 1.67 ± 3% perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64
0.00 +1.8 1.79 ± 4% perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
0.00 +4.5 4.46 ± 5% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +5.0 4.96 ± 4% perf-profile.calltrace.cycles-pp.userfaultfd_unmap_complete.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
47.85 ± 9% +7.2 55.00 ± 3% perf-profile.calltrace.cycles-pp.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
49.25 ± 9% +7.4 56.63 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
51.01 ± 9% +7.5 58.48 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk
0.44 ± 11% -0.2 0.27 ± 4% perf-profile.children.cycles-pp.cap_capable
0.05 ± 8% +0.0 0.07 ± 5% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.08 +0.0 0.10 ± 10% perf-profile.children.cycles-pp.sched_clock
0.08 ± 6% +0.0 0.10 ± 10% perf-profile.children.cycles-pp.native_sched_clock
0.09 ± 4% +0.0 0.11 ± 17% perf-profile.children.cycles-pp.read_tsc
0.10 ± 14% +0.0 0.13 ± 8% perf-profile.children.cycles-pp.lapic_next_deadline
0.09 +0.0 0.12 ± 10% perf-profile.children.cycles-pp.sched_clock_cpu
0.04 ± 57% +0.0 0.07 ± 17% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.00 +0.1 0.05 ± 9% perf-profile.children.cycles-pp.memset
0.04 ±115% +0.1 0.10 ± 31% perf-profile.children.cycles-pp.tick_nohz_irq_exit
0.26 ± 18% +0.1 0.33 ± 11% perf-profile.children.cycles-pp.clockevents_program_event
0.04 ± 58% +0.1 0.16 ± 2% perf-profile.children.cycles-pp.should_failslab
0.14 ± 42% +0.1 0.28 ± 19% perf-profile.children.cycles-pp.tick_nohz_next_event
0.21 ± 31% +0.2 0.36 ± 9% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.54 ± 21% +0.2 0.72 ± 17% perf-profile.children.cycles-pp.update_process_times
0.54 ± 10% +0.2 0.72 ± 4% perf-profile.children.cycles-pp.rcu_all_qs
0.65 ± 20% +0.2 0.84 ± 17% perf-profile.children.cycles-pp.tick_sched_timer
0.56 ± 24% +0.2 0.76 ± 20% perf-profile.children.cycles-pp.tick_sched_handle
0.93 ± 23% +0.3 1.20 ± 22% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.87 ± 10% +0.3 1.15 ± 2% perf-profile.children.cycles-pp.__might_sleep
1.09 ± 11% +0.4 1.46 ± 5% perf-profile.children.cycles-pp._cond_resched
1.39 ± 13% +0.4 1.79 ± 16% perf-profile.children.cycles-pp.hrtimer_interrupt
1.43 ± 13% +0.4 1.83 ± 17% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
1.68 ± 13% +0.5 2.15 ± 19% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
1.94 ± 10% +0.6 2.54 ± 3% perf-profile.children.cycles-pp.___might_sleep
0.00 +1.7 1.67 ± 3% perf-profile.children.cycles-pp.memset_erms
4.88 ± 8% +1.7 6.63 ± 4% perf-profile.children.cycles-pp.kmem_cache_free
6.56 ± 10% +4.6 11.13 ± 3% perf-profile.children.cycles-pp.kmem_cache_alloc
0.37 ± 9% +4.6 4.99 ± 4% perf-profile.children.cycles-pp.userfaultfd_unmap_complete
48.02 ± 9% +7.1 55.14 ± 3% perf-profile.children.cycles-pp.__x64_sys_brk
49.49 ± 9% +7.3 56.81 ± 3% perf-profile.children.cycles-pp.do_syscall_64
51.22 ± 9% +7.4 58.66 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.42 ± 11% -0.2 0.24 ± 6% perf-profile.self.cycles-pp.cap_capable
0.07 ± 5% +0.0 0.09 ± 14% perf-profile.self.cycles-pp.native_sched_clock
0.04 ± 57% +0.0 0.07 ± 7% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.10 ± 14% +0.0 0.13 ± 8% perf-profile.self.cycles-pp.lapic_next_deadline
0.00 +0.1 0.05 ± 9% perf-profile.self.cycles-pp.memset
0.01 ±173% +0.1 0.08 ± 23% perf-profile.self.cycles-pp.tick_nohz_next_event
0.34 ± 10% +0.1 0.41 ± 6% perf-profile.self.cycles-pp.userfaultfd_unmap_complete
0.00 +0.1 0.08 ± 5% perf-profile.self.cycles-pp.should_failslab
0.40 ± 9% +0.1 0.48 ± 8% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.37 ± 12% +0.1 0.49 ± 5% perf-profile.self.cycles-pp.rcu_all_qs
0.53 ± 11% +0.2 0.70 ± 3% perf-profile.self.cycles-pp._cond_resched
0.21 ± 8% +0.2 0.44 ± 6% perf-profile.self.cycles-pp.do_syscall_64
0.46 ± 8% +0.2 0.70 ± 6% perf-profile.self.cycles-pp.cap_vm_enough_memory
0.81 ± 10% +0.3 1.09 ± 2% perf-profile.self.cycles-pp.__might_sleep
1.88 ± 10% +0.6 2.46 ± 3% perf-profile.self.cycles-pp.___might_sleep
0.00 +1.6 1.61 ± 3% perf-profile.self.cycles-pp.memset_erms
3.19 ± 10% +1.7 4.86 ± 6% perf-profile.self.cycles-pp.kmem_cache_alloc
3.31 ± 7% +1.7 5.05 ± 4% perf-profile.self.cycles-pp.kmem_cache_free
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
Attachments:
  config-5.10.0-rc1-00026-gfec92278217b (text/plain, 170398 bytes)
  job-script (text/plain, 7702 bytes)
  job.yaml (text/plain, 5169 bytes)
  reproduce (text/plain, 336 bytes)