Message-ID: <8f10e63f-c95a-4771-b215-12e2b263d083@default>
Date: Wed, 25 Nov 2020 10:56:01 -0800 (PST)
From: Alex Kogan <alex.kogan@...cle.com>
To: <oliver.sang@...el.com>
Cc: <tglx@...utronix.de>, <lkp@...ts.01.org>, <ying.huang@...el.com>,
<lkp@...el.com>, <linux@...linux.org.uk>, <feng.tang@...el.com>,
<hpa@...or.com>, <dave.dice@...cle.com>, <mingo@...hat.com>,
<will.deacon@....com>, <arnd@...db.de>, <jglauber@...vell.com>,
<guohanjun@...wei.com>, <x86@...nel.org>,
<zhengjun.xing@...el.com>, <daniel.m.jordan@...cle.com>,
<steven.sistare@...cle.com>, <bp@...en8.de>,
<linux-arm-kernel@...ts.infradead.org>, <longman@...hat.com>,
<linux-kernel@...r.kernel.org>, <peterz@...radead.org>,
<linux-arch@...r.kernel.org>
Subject: Re: [locking/qspinlock] 6f9a39a437: unixbench.score -17.3%
regression
Oliver, thank you for this report.
All, with nr_task=30%, the benchmark hits a sweet spot on the contention curve
that amplifies the overhead of shuffling threads between waiting queues without
reaping the locality benefit. I was able to reproduce the regression on our
machine, though to a lesser extent, with about a 10% performance drop for the
given test.
Luckily, we have a solution for this exact scenario, which we call the
shuffle reduction optimization, or SRO. It was part of the series until v9,
but was dropped in v10 since it did not provide much benefit in my benchmarks.
With SRO applied, the regression on unixbench shrinks to about 1%, while the
other performance numbers do not change much.
I attach the SRO patch here. IMHO, it is pretty straightforward.
It uses randomization, but only to throttle the creation of a secondary queue.
In particular, it does not introduce any extra delays for threads waiting
in that queue once it is created.
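To give a quick feel for the idea, here is a minimal user-space sketch of the
throttling (only a model of the concept; the names probably() and
SHUFFLE_REDUCTION_PROB_BITS below are illustrative, not copied from the
attached patch):

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative threshold: shuffle roughly once per 2^7 lock handovers. */
#define SHUFFLE_REDUCTION_PROB_BITS 7

static bool probably(unsigned int bits)
{
	/*
	 * True with probability 1 - 2^-bits.  A kernel implementation would
	 * use a cheap per-CPU pseudo-random source instead of rand().
	 */
	return rand() & ((1u << bits) - 1);
}

int main(void)
{
	unsigned long handovers = 1000000, shuffles = 0;

	for (unsigned long i = 0; i < handovers; i++) {
		/*
		 * Most of the time, skip the scan that would move non-local
		 * waiters to the secondary queue; under light to moderate
		 * contention the secondary queue then rarely gets created.
		 */
		if (probably(SHUFFLE_REDUCTION_PROB_BITS))
			continue;

		shuffles++;	/* the real code would order the queue here */
	}

	printf("shuffled on %lu of %lu handovers (~%.3f%%)\n",
	       shuffles, handovers, 100.0 * shuffles / handovers);
	return 0;
}

The pseudo-random check only decides whether to start building the secondary
queue on a given handover; once that queue exists, waiters in it are handled
exactly as before.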
Anyway, any feedback is welcome!
Unless I hear any objections, I plan to post another version of the series
with SRO included.
Thanks,
-- Alex
----- Original Message -----
From: oliver.sang@...el.com
To: alex.kogan@...cle.com
Cc: linux@...linux.org.uk, peterz@...radead.org, mingo@...hat.com, will.deacon@....com, arnd@...db.de, longman@...hat.com, linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org, tglx@...utronix.de, bp@...en8.de, hpa@...or.com, x86@...nel.org, guohanjun@...wei.com, jglauber@...vell.com, steven.sistare@...cle.com, daniel.m.jordan@...cle.com, alex.kogan@...cle.com, dave.dice@...cle.com, lkp@...el.com, lkp@...ts.01.org, ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Sent: Sunday, November 22, 2020 4:33:52 AM GMT -05:00 US/Canada Eastern
Subject: [locking/qspinlock] 6f9a39a437: unixbench.score -17.3% regression
Greetings,
FYI, we noticed a -17.3% regression of unixbench.score due to commit:
commit: 6f9a39a4372e37907ac1fc7ede6c90932a88d174 ("[PATCH v12 5/5] locking/qspinlock: Avoid moving certain threads between waiting queues in CNA")
url: https://github.com/0day-ci/linux/commits/Alex-Kogan/Add-NUMA-awareness-to-qspinlock/20201118-072506
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 932f8c64d38bb08f69c8c26a2216ba0c36c6daa8
in testcase: unixbench
on test machine: 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory
with the following parameters:
runtime: 300s
nr_task: 30%
test: context1
cpufreq_governor: performance
ucode: 0x4003003
test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <oliver.sang@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2sp4/context1/unixbench/0x4003003
commit:
eaf522d564 ("locking/qspinlock: Introduce starvation avoidance into CNA")
6f9a39a437 ("locking/qspinlock: Avoid moving certain threads between waiting queues in CNA")
eaf522d56432e0e5 6f9a39a4372e37907ac1fc7ede6
---------------- ---------------------------
%stddev %change %stddev
\ | \
3715 -17.3% 3070 unixbench.score
11584 +13.2% 13118 unixbench.time.involuntary_context_switches
1830 +4.7% 1916 unixbench.time.percent_of_cpu_this_job_got
7012 +5.1% 7373 unixbench.time.system_time
141.44 -15.6% 119.37 unixbench.time.user_time
4.338e+08 -16.4% 3.627e+08 unixbench.time.voluntary_context_switches
5.807e+08 -17.5% 4.793e+08 unixbench.workload
139.00 ± 67% -71.0% 40.25 numa-vmstat.node1.nr_mlock
1.08 -0.1 0.94 mpstat.cpu.all.irq%
0.48 ± 2% -0.1 0.40 mpstat.cpu.all.usr%
956143 ± 7% +11.0% 1060959 ± 3% numa-meminfo.node0.MemUsed
1185909 ± 5% -8.8% 1081277 ± 3% numa-meminfo.node1.MemUsed
4402315 -16.3% 3682692 vmstat.system.cs
235535 -4.6% 224625 vmstat.system.in
6.42e+09 +16.4% 7.471e+09 cpuidle.C1.time
1.941e+10 ± 7% -20.0% 1.553e+10 ± 21% cpuidle.C1E.time
94497227 ± 5% -63.8% 34185071 ± 15% cpuidle.C1E.usage
2.62e+08 ± 8% -90.1% 26020649 cpuidle.POLL.time
81581001 ± 9% -96.1% 3221876 cpuidle.POLL.usage
84602 ± 3% +12.7% 95329 ± 5% softirqs.CPU65.SCHED
86631 ± 5% +10.9% 96057 ± 6% softirqs.CPU67.SCHED
81448 ± 3% +12.6% 91708 softirqs.CPU70.SCHED
99715 +8.1% 107808 ± 2% softirqs.CPU75.SCHED
91997 ± 4% +15.5% 106236 ± 2% softirqs.CPU81.SCHED
417904 ± 6% +43.6% 600289 ± 16% sched_debug.cfs_rq:/.MIN_vruntime.avg
3142033 +9.7% 3446986 ± 4% sched_debug.cfs_rq:/.MIN_vruntime.max
969106 +20.4% 1166681 ± 8% sched_debug.cfs_rq:/.MIN_vruntime.stddev
44659 ± 12% +21.1% 54091 ± 3% sched_debug.cfs_rq:/.exec_clock.min
12198 ± 12% +24.5% 15181 ± 9% sched_debug.cfs_rq:/.load.avg
417904 ± 6% +43.6% 600289 ± 16% sched_debug.cfs_rq:/.max_vruntime.avg
3142033 +9.7% 3446986 ± 4% sched_debug.cfs_rq:/.max_vruntime.max
969106 +20.4% 1166681 ± 8% sched_debug.cfs_rq:/.max_vruntime.stddev
1926443 ± 12% +25.6% 2419565 ± 3% sched_debug.cfs_rq:/.min_vruntime.min
0.41 ± 2% +16.3% 0.47 ± 3% sched_debug.cfs_rq:/.nr_running.avg
322.15 ± 2% +13.5% 365.49 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.avg
58399 ± 49% -62.5% 21882 ± 74% sched_debug.cpu.avg_idle.min
3.74 ± 14% -20.1% 2.99 ± 3% sched_debug.cpu.clock.stddev
20770 ± 50% -65.0% 7271 ± 39% sched_debug.cpu.max_idle_balance_cost.stddev
8250432 -16.5% 6887763 sched_debug.cpu.nr_switches.avg
11243220 ± 4% -21.5% 8826971 sched_debug.cpu.nr_switches.max
1603956 ± 26% -52.5% 761566 ± 4% sched_debug.cpu.nr_switches.stddev
8248654 -16.5% 6885987 sched_debug.cpu.sched_count.avg
11240496 ± 4% -21.5% 8823964 sched_debug.cpu.sched_count.max
1603802 ± 26% -52.5% 761522 ± 4% sched_debug.cpu.sched_count.stddev
4123397 -16.5% 3441927 sched_debug.cpu.sched_goidle.avg
5619132 ± 4% -21.5% 4410755 sched_debug.cpu.sched_goidle.max
801761 ± 26% -52.5% 380727 ± 4% sched_debug.cpu.sched_goidle.stddev
4124921 -16.5% 3443709 sched_debug.cpu.ttwu_count.avg
5620396 ± 4% -21.5% 4412427 sched_debug.cpu.ttwu_count.max
801796 ± 26% -52.5% 380615 ± 4% sched_debug.cpu.ttwu_count.stddev
7.45e+09 -14.3% 6.382e+09 perf-stat.i.branch-instructions
1.33 -0.1 1.24 perf-stat.i.branch-miss-rate%
91615750 -22.0% 71469356 perf-stat.i.branch-misses
3.80 +2.5 6.31 ± 13% perf-stat.i.cache-miss-rate%
8753636 ± 4% +109.7% 18358392 perf-stat.i.cache-misses
7.691e+08 -14.2% 6.597e+08 perf-stat.i.cache-references
4428060 -16.4% 3704052 perf-stat.i.context-switches
2.87 +11.2% 3.20 perf-stat.i.cpi
8.789e+10 -5.6% 8.294e+10 perf-stat.i.cpu-cycles
16303 ± 7% -74.2% 4204 ± 2% perf-stat.i.cycles-between-cache-misses
8.94e+09 -14.0% 7.685e+09 perf-stat.i.dTLB-loads
4.951e+09 -16.2% 4.149e+09 perf-stat.i.dTLB-stores
57458394 -17.3% 47543962 perf-stat.i.iTLB-load-misses
30827890 -15.9% 25930501 perf-stat.i.iTLB-loads
3.327e+10 -14.6% 2.842e+10 perf-stat.i.instructions
581.15 +3.3% 600.28 perf-stat.i.instructions-per-iTLB-miss
0.36 -9.4% 0.33 perf-stat.i.ipc
0.92 -5.6% 0.86 perf-stat.i.metric.GHz
1.01 ± 4% +17.6% 1.18 ± 4% perf-stat.i.metric.K/sec
230.75 -14.6% 197.02 perf-stat.i.metric.M/sec
87.41 +8.0 95.42 perf-stat.i.node-load-miss-rate%
1718045 ± 3% +125.3% 3871440 perf-stat.i.node-load-misses
227252 ± 3% -71.5% 64814 ± 10% perf-stat.i.node-loads
1686277 ± 4% +120.6% 3720452 perf-stat.i.node-store-misses
1.23 -0.1 1.12 perf-stat.overall.branch-miss-rate%
1.14 ± 5% +1.6 2.78 perf-stat.overall.cache-miss-rate%
2.64 +10.5% 2.92 perf-stat.overall.cpi
10070 ± 4% -55.1% 4519 perf-stat.overall.cycles-between-cache-misses
579.14 +3.2% 597.84 perf-stat.overall.instructions-per-iTLB-miss
0.38 -9.5% 0.34 perf-stat.overall.ipc
88.31 +10.0 98.35 perf-stat.overall.node-load-miss-rate%
97.96 +1.3 99.24 perf-stat.overall.node-store-miss-rate%
22430 +3.3% 23175 perf-stat.overall.path-length
7.434e+09 -14.4% 6.365e+09 perf-stat.ps.branch-instructions
91428244 -22.0% 71275228 perf-stat.ps.branch-misses
8723893 ± 4% +109.8% 18304568 perf-stat.ps.cache-misses
7.674e+08 -14.3% 6.578e+08 perf-stat.ps.cache-references
4418679 -16.4% 3693530 perf-stat.ps.context-switches
8.77e+10 -5.7% 8.271e+10 perf-stat.ps.cpu-cycles
8.921e+09 -14.1% 7.664e+09 perf-stat.ps.dTLB-loads
4.94e+09 -16.3% 4.137e+09 perf-stat.ps.dTLB-stores
57330404 -17.3% 47408036 perf-stat.ps.iTLB-load-misses
30765981 -15.9% 25859786 perf-stat.ps.iTLB-loads
3.32e+10 -14.6% 2.834e+10 perf-stat.ps.instructions
1712299 ± 3% +125.4% 3860240 perf-stat.ps.node-load-misses
226568 ± 3% -71.4% 64722 ± 10% perf-stat.ps.node-loads
1680387 ± 4% +120.8% 3709583 perf-stat.ps.node-store-misses
1.302e+13 -14.7% 1.111e+13 perf-stat.total.instructions
3591158 ± 5% -25.1% 2688593 interrupts.CAL:Function_call_interrupts
2328 ± 19% +42.8% 3323 ± 3% interrupts.CPU0.NMI:Non-maskable_interrupts
2328 ± 19% +42.8% 3323 ± 3% interrupts.CPU0.PMI:Performance_monitoring_interrupts
110354 ± 9% -20.0% 88244 ± 4% interrupts.CPU0.RES:Rescheduling_interrupts
128508 ± 14% -27.1% 93721 ± 3% interrupts.CPU1.RES:Rescheduling_interrupts
2180 ± 30% +47.0% 3205 ± 15% interrupts.CPU10.NMI:Non-maskable_interrupts
2180 ± 30% +47.0% 3205 ± 15% interrupts.CPU10.PMI:Performance_monitoring_interrupts
133107 ± 8% -25.7% 98924 ± 2% interrupts.CPU10.RES:Rescheduling_interrupts
133955 ± 13% -28.9% 95305 ± 6% interrupts.CPU11.RES:Rescheduling_interrupts
129709 ± 10% -24.9% 97452 ± 8% interrupts.CPU12.RES:Rescheduling_interrupts
130073 ± 10% -21.2% 102507 ± 2% interrupts.CPU13.RES:Rescheduling_interrupts
136313 ± 10% -27.4% 99010 ± 3% interrupts.CPU14.RES:Rescheduling_interrupts
139937 ± 7% -29.9% 98077 ± 7% interrupts.CPU15.RES:Rescheduling_interrupts
143424 ± 11% -28.4% 102678 ± 7% interrupts.CPU16.RES:Rescheduling_interrupts
138084 ± 10% -25.7% 102625 ± 5% interrupts.CPU17.RES:Rescheduling_interrupts
136238 ± 6% -26.3% 100366 ± 7% interrupts.CPU18.RES:Rescheduling_interrupts
140011 ± 10% -28.4% 100232 ± 4% interrupts.CPU19.RES:Rescheduling_interrupts
129720 ± 7% -28.8% 92405 ± 7% interrupts.CPU2.RES:Rescheduling_interrupts
43177 ± 33% -34.6% 28234 ± 5% interrupts.CPU20.CAL:Function_call_interrupts
143060 ± 6% -28.5% 102289 ± 7% interrupts.CPU20.RES:Rescheduling_interrupts
39911 ± 20% -30.4% 27788 ± 4% interrupts.CPU21.CAL:Function_call_interrupts
144644 ± 9% -27.6% 104676 ± 6% interrupts.CPU21.RES:Rescheduling_interrupts
38543 ± 21% -35.1% 25019 ± 14% interrupts.CPU22.CAL:Function_call_interrupts
144984 ± 7% -29.9% 101700 ± 2% interrupts.CPU22.RES:Rescheduling_interrupts
37835 ± 15% -22.9% 29155 ± 5% interrupts.CPU23.CAL:Function_call_interrupts
2089 ± 19% +70.6% 3565 ± 20% interrupts.CPU23.NMI:Non-maskable_interrupts
2089 ± 19% +70.6% 3565 ± 20% interrupts.CPU23.PMI:Performance_monitoring_interrupts
130192 ± 7% -22.1% 101416 ± 5% interrupts.CPU23.RES:Rescheduling_interrupts
37142 ± 6% -32.8% 24974 ± 6% interrupts.CPU24.CAL:Function_call_interrupts
142384 ± 5% -31.7% 97277 ± 6% interrupts.CPU24.RES:Rescheduling_interrupts
32664 ± 9% -22.2% 25422 ± 6% interrupts.CPU25.CAL:Function_call_interrupts
141175 ± 5% -31.2% 97084 ± 2% interrupts.CPU25.RES:Rescheduling_interrupts
31023 ± 21% -24.8% 23330 ± 7% interrupts.CPU26.CAL:Function_call_interrupts
131921 ± 4% -28.9% 93831 ± 3% interrupts.CPU26.RES:Rescheduling_interrupts
32946 ± 19% -26.2% 24303 ± 5% interrupts.CPU27.CAL:Function_call_interrupts
144853 ± 4% -35.7% 93190 ± 2% interrupts.CPU27.RES:Rescheduling_interrupts
136419 ± 4% -31.3% 93690 interrupts.CPU28.RES:Rescheduling_interrupts
36609 ± 20% -35.3% 23696 ± 5% interrupts.CPU29.CAL:Function_call_interrupts
145284 ± 10% -36.1% 92871 interrupts.CPU29.RES:Rescheduling_interrupts
122699 ± 7% -23.8% 93459 ± 10% interrupts.CPU3.RES:Rescheduling_interrupts
250.50 ± 40% -79.9% 50.25 ± 99% interrupts.CPU3.TLB:TLB_shootdowns
35689 ± 19% -36.1% 22793 ± 11% interrupts.CPU30.CAL:Function_call_interrupts
152345 ± 4% -40.3% 90991 ± 3% interrupts.CPU30.RES:Rescheduling_interrupts
33895 ± 10% -15.1% 28774 ± 8% interrupts.CPU31.CAL:Function_call_interrupts
150590 ± 5% -35.5% 97092 ± 7% interrupts.CPU31.RES:Rescheduling_interrupts
50156 ± 28% -45.8% 27170 ± 7% interrupts.CPU32.CAL:Function_call_interrupts
3757 ± 7% -43.6% 2120 ± 32% interrupts.CPU32.NMI:Non-maskable_interrupts
3757 ± 7% -43.6% 2120 ± 32% interrupts.CPU32.PMI:Performance_monitoring_interrupts
150142 ± 3% -36.3% 95673 interrupts.CPU32.RES:Rescheduling_interrupts
39957 ± 25% -34.5% 26158 ± 4% interrupts.CPU33.CAL:Function_call_interrupts
147066 ± 8% -34.4% 96521 ± 2% interrupts.CPU33.RES:Rescheduling_interrupts
168.25 ±137% -86.9% 22.00 ± 59% interrupts.CPU33.TLB:TLB_shootdowns
38357 ± 13% -29.9% 26881 ± 5% interrupts.CPU34.CAL:Function_call_interrupts
3757 ± 5% -28.5% 2686 ± 19% interrupts.CPU34.NMI:Non-maskable_interrupts
3757 ± 5% -28.5% 2686 ± 19% interrupts.CPU34.PMI:Performance_monitoring_interrupts
140734 ± 2% -33.3% 93841 ± 3% interrupts.CPU34.RES:Rescheduling_interrupts
37965 ± 17% -25.8% 28175 ± 4% interrupts.CPU35.CAL:Function_call_interrupts
3934 ± 8% -39.3% 2389 ± 13% interrupts.CPU35.NMI:Non-maskable_interrupts
3934 ± 8% -39.3% 2389 ± 13% interrupts.CPU35.PMI:Performance_monitoring_interrupts
146074 ± 10% -33.2% 97630 ± 2% interrupts.CPU35.RES:Rescheduling_interrupts
34131 ± 8% -18.8% 27704 ± 9% interrupts.CPU36.CAL:Function_call_interrupts
149093 ± 3% -35.0% 96945 ± 4% interrupts.CPU36.RES:Rescheduling_interrupts
44333 ± 47% -39.7% 26745 ± 7% interrupts.CPU37.CAL:Function_call_interrupts
149936 ± 4% -34.3% 98542 ± 3% interrupts.CPU37.RES:Rescheduling_interrupts
41199 ± 28% -30.2% 28741 ± 6% interrupts.CPU38.CAL:Function_call_interrupts
154224 ± 3% -31.6% 105443 ± 7% interrupts.CPU38.RES:Rescheduling_interrupts
36925 ± 8% -24.3% 27942 ± 5% interrupts.CPU39.CAL:Function_call_interrupts
150490 ± 2% -32.5% 101625 ± 4% interrupts.CPU39.RES:Rescheduling_interrupts
122742 ± 15% -25.4% 91596 ± 5% interrupts.CPU4.RES:Rescheduling_interrupts
143639 ± 9% -29.4% 101407 ± 2% interrupts.CPU40.RES:Rescheduling_interrupts
43235 ± 10% -30.9% 29877 ± 4% interrupts.CPU41.CAL:Function_call_interrupts
158981 ± 5% -32.8% 106760 ± 4% interrupts.CPU41.RES:Rescheduling_interrupts
47792 ± 33% -37.7% 29769 ± 5% interrupts.CPU42.CAL:Function_call_interrupts
3455 ± 11% -32.2% 2343 ± 36% interrupts.CPU42.NMI:Non-maskable_interrupts
3455 ± 11% -32.2% 2343 ± 36% interrupts.CPU42.PMI:Performance_monitoring_interrupts
160241 ± 5% -34.0% 105793 ± 4% interrupts.CPU42.RES:Rescheduling_interrupts
54419 ± 52% -44.1% 30408 ± 2% interrupts.CPU43.CAL:Function_call_interrupts
3726 ± 11% -38.7% 2285 ± 39% interrupts.CPU43.NMI:Non-maskable_interrupts
3726 ± 11% -38.7% 2285 ± 39% interrupts.CPU43.PMI:Performance_monitoring_interrupts
156010 -32.4% 105516 ± 2% interrupts.CPU43.RES:Rescheduling_interrupts
69033 ± 79% -56.0% 30393 ± 7% interrupts.CPU44.CAL:Function_call_interrupts
152478 ± 6% -30.4% 106187 ± 4% interrupts.CPU44.RES:Rescheduling_interrupts
49434 ± 49% -38.5% 30404 ± 9% interrupts.CPU45.CAL:Function_call_interrupts
153770 ± 7% -32.2% 104200 ± 3% interrupts.CPU45.RES:Rescheduling_interrupts
56303 ± 52% -50.4% 27914 ± 4% interrupts.CPU46.CAL:Function_call_interrupts
3924 ± 20% -48.7% 2012 ± 50% interrupts.CPU46.NMI:Non-maskable_interrupts
3924 ± 20% -48.7% 2012 ± 50% interrupts.CPU46.PMI:Performance_monitoring_interrupts
152891 ± 11% -31.7% 104494 ± 5% interrupts.CPU46.RES:Rescheduling_interrupts
42970 ± 30% -29.9% 30107 ± 9% interrupts.CPU47.CAL:Function_call_interrupts
3940 ± 8% -40.8% 2332 ± 38% interrupts.CPU47.NMI:Non-maskable_interrupts
3940 ± 8% -40.8% 2332 ± 38% interrupts.CPU47.PMI:Performance_monitoring_interrupts
146615 ± 5% -27.7% 106013 ± 4% interrupts.CPU47.RES:Rescheduling_interrupts
146863 ± 5% -18.4% 119774 ± 3% interrupts.CPU48.RES:Rescheduling_interrupts
136692 ± 8% -16.3% 114405 ± 7% interrupts.CPU49.RES:Rescheduling_interrupts
29311 ± 6% -12.4% 25673 ± 4% interrupts.CPU5.CAL:Function_call_interrupts
129497 ± 7% -27.1% 94375 ± 6% interrupts.CPU5.RES:Rescheduling_interrupts
143797 ± 11% -21.0% 113564 ± 4% interrupts.CPU50.RES:Rescheduling_interrupts
2891 ± 16% +31.3% 3797 ± 12% interrupts.CPU51.NMI:Non-maskable_interrupts
2891 ± 16% +31.3% 3797 ± 12% interrupts.CPU51.PMI:Performance_monitoring_interrupts
139766 ± 2% -19.6% 112352 ± 8% interrupts.CPU51.RES:Rescheduling_interrupts
137319 ± 4% -20.3% 109422 ± 5% interrupts.CPU52.RES:Rescheduling_interrupts
138705 ± 5% -21.3% 109158 ± 8% interrupts.CPU53.RES:Rescheduling_interrupts
2426 ± 28% +42.8% 3464 ± 19% interrupts.CPU54.NMI:Non-maskable_interrupts
2426 ± 28% +42.8% 3464 ± 19% interrupts.CPU54.PMI:Performance_monitoring_interrupts
140683 ± 11% -24.0% 106919 ± 4% interrupts.CPU54.RES:Rescheduling_interrupts
38238 ± 13% -22.9% 29493 ± 6% interrupts.CPU55.CAL:Function_call_interrupts
3043 ± 8% +18.7% 3612 ± 7% interrupts.CPU55.NMI:Non-maskable_interrupts
3043 ± 8% +18.7% 3612 ± 7% interrupts.CPU55.PMI:Performance_monitoring_interrupts
143657 ± 10% -25.0% 107806 ± 6% interrupts.CPU55.RES:Rescheduling_interrupts
131036 ± 8% -21.3% 103177 ± 4% interrupts.CPU56.RES:Rescheduling_interrupts
131204 ± 12% -21.2% 103444 ± 10% interrupts.CPU57.RES:Rescheduling_interrupts
122041 ± 12% -15.9% 102674 ± 7% interrupts.CPU58.RES:Rescheduling_interrupts
167.25 ± 65% -64.7% 59.00 ±157% interrupts.CPU58.TLB:TLB_shootdowns
1883 ± 33% +61.6% 3042 ± 3% interrupts.CPU6.NMI:Non-maskable_interrupts
1883 ± 33% +61.6% 3042 ± 3% interrupts.CPU6.PMI:Performance_monitoring_interrupts
132101 ± 12% -27.0% 96457 ± 8% interrupts.CPU6.RES:Rescheduling_interrupts
1832 ± 24% +69.3% 3102 ± 32% interrupts.CPU64.NMI:Non-maskable_interrupts
1832 ± 24% +69.3% 3102 ± 32% interrupts.CPU64.PMI:Performance_monitoring_interrupts
107979 ± 8% -11.6% 95452 interrupts.CPU66.RES:Rescheduling_interrupts
97965 ± 3% -15.1% 83199 ± 2% interrupts.CPU69.RES:Rescheduling_interrupts
126380 ± 11% -24.6% 95257 ± 5% interrupts.CPU7.RES:Rescheduling_interrupts
1820 ± 40% +60.9% 2929 ± 35% interrupts.CPU70.NMI:Non-maskable_interrupts
1820 ± 40% +60.9% 2929 ± 35% interrupts.CPU70.PMI:Performance_monitoring_interrupts
171279 ± 5% -29.4% 120994 ± 5% interrupts.CPU72.RES:Rescheduling_interrupts
50761 ± 40% -35.0% 32979 ± 7% interrupts.CPU73.CAL:Function_call_interrupts
173132 ± 7% -31.5% 118555 ± 5% interrupts.CPU73.RES:Rescheduling_interrupts
43479 ± 17% -25.8% 32276 ± 3% interrupts.CPU74.CAL:Function_call_interrupts
3755 ± 9% -31.7% 2564 ± 31% interrupts.CPU74.NMI:Non-maskable_interrupts
3755 ± 9% -31.7% 2564 ± 31% interrupts.CPU74.PMI:Performance_monitoring_interrupts
167124 ± 7% -28.8% 119063 ± 4% interrupts.CPU74.RES:Rescheduling_interrupts
164069 ± 7% -26.6% 120499 ± 4% interrupts.CPU75.RES:Rescheduling_interrupts
166858 ± 6% -28.4% 119453 ± 4% interrupts.CPU76.RES:Rescheduling_interrupts
157535 ± 6% -25.5% 117419 ± 4% interrupts.CPU77.RES:Rescheduling_interrupts
165642 ± 8% -25.9% 122719 ± 8% interrupts.CPU78.RES:Rescheduling_interrupts
162781 ± 5% -29.0% 115600 ± 3% interrupts.CPU79.RES:Rescheduling_interrupts
132224 ± 11% -26.6% 97010 interrupts.CPU8.RES:Rescheduling_interrupts
167082 ± 9% -30.7% 115794 ± 4% interrupts.CPU80.RES:Rescheduling_interrupts
49639 ± 37% -35.1% 32228 ± 2% interrupts.CPU81.CAL:Function_call_interrupts
144305 ± 5% -18.3% 117926 ± 4% interrupts.CPU81.RES:Rescheduling_interrupts
151333 ± 7% -23.2% 116159 ± 3% interrupts.CPU82.RES:Rescheduling_interrupts
142398 ± 8% -21.1% 112399 ± 7% interrupts.CPU83.RES:Rescheduling_interrupts
144455 ± 2% -20.5% 114911 interrupts.CPU84.RES:Rescheduling_interrupts
149850 ± 9% -24.3% 113396 ± 5% interrupts.CPU85.RES:Rescheduling_interrupts
34458 ± 4% -14.4% 29487 ± 8% interrupts.CPU86.CAL:Function_call_interrupts
138603 ± 6% -22.7% 107133 ± 2% interrupts.CPU86.RES:Rescheduling_interrupts
39228 ± 7% -25.5% 29231 ± 4% interrupts.CPU87.CAL:Function_call_interrupts
151814 ± 8% -31.1% 104629 ± 5% interrupts.CPU87.RES:Rescheduling_interrupts
137356 ± 8% -20.2% 109634 ± 3% interrupts.CPU88.RES:Rescheduling_interrupts
143613 ± 10% -28.9% 102166 ± 10% interrupts.CPU89.RES:Rescheduling_interrupts
122375 ± 8% -19.2% 98901 ± 3% interrupts.CPU9.RES:Rescheduling_interrupts
140781 ± 6% -25.0% 105531 ± 3% interrupts.CPU90.RES:Rescheduling_interrupts
138917 ± 12% -24.9% 104264 ± 5% interrupts.CPU91.RES:Rescheduling_interrupts
146814 ± 14% -29.2% 103902 ± 4% interrupts.CPU92.RES:Rescheduling_interrupts
132220 ± 15% -21.3% 104095 ± 2% interrupts.CPU93.RES:Rescheduling_interrupts
133.00 ± 88% -87.6% 16.50 ± 72% interrupts.CPU93.TLB:TLB_shootdowns
125991 ± 5% -19.0% 101995 ± 2% interrupts.CPU94.RES:Rescheduling_interrupts
115838 ± 9% -17.2% 95959 ± 3% interrupts.CPU95.RES:Rescheduling_interrupts
13255498 ± 2% -25.6% 9859155 interrupts.RES:Rescheduling_interrupts
7.59 ± 2% -1.5 6.04 perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.43 ± 2% -1.5 5.91 perf-profile.calltrace.cycles-pp.pipe_read.new_sync_read.vfs_read.ksys_read.do_syscall_64
6.03 ± 4% -1.0 5.06 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.90 ± 4% -1.0 4.95 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.44 ± 3% -0.9 3.51 perf-profile.calltrace.cycles-pp.schedule.pipe_read.new_sync_read.vfs_read.ksys_read
2.29 ± 4% -0.9 1.38 ± 2% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.new_sync_read
4.07 ± 3% -0.9 3.21 perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.new_sync_read.vfs_read
2.62 ± 3% -0.9 1.76 ± 4% perf-profile.calltrace.cycles-pp.read
3.68 ± 2% -0.8 2.83 perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
2.06 ± 4% -0.8 1.22 perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read
3.58 ± 2% -0.8 2.76 perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
2.37 ± 3% -0.8 1.58 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
2.29 ± 3% -0.8 1.53 ± 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
2.26 ± 3% -0.8 1.50 ± 4% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
2.21 ± 3% -0.7 1.47 ± 4% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
4.25 ± 3% -0.7 3.51 perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.ttwu_do_activate
2.14 ± 4% -0.6 1.52 perf-profile.calltrace.cycles-pp.unwind_next_frame.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity
3.48 ± 4% -0.6 2.90 ± 2% perf-profile.calltrace.cycles-pp.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
1.93 ± 3% -0.5 1.48 perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule_idle.do_idle.cpu_startup_entry
1.54 ± 4% -0.4 1.18 perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
1.38 ± 3% -0.3 1.04 ± 2% perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule_idle.do_idle
0.72 ± 4% -0.1 0.58 ± 3% perf-profile.calltrace.cycles-pp.tick_nohz_get_sleep_length.menu_select.do_idle.cpu_startup_entry.start_secondary
0.66 ± 4% -0.1 0.54 ± 2% perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.new_sync_read.vfs_read.ksys_read
46.28 +0.5 46.74 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
0.14 ±173% +0.5 0.66 ± 9% perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64_no_verify
0.14 ±173% +0.5 0.66 ± 9% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_kernel
0.15 ±173% +0.6 0.71 ± 8% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64_no_verify
0.15 ±173% +0.6 0.71 ± 8% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64_no_verify
0.15 ±173% +0.6 0.71 ± 8% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64_no_verify
7.85 ± 2% +0.8 8.64 ± 3% perf-profile.calltrace.cycles-pp.write
7.77 ± 2% +0.8 8.58 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
7.73 ± 2% +0.8 8.55 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
7.69 ± 3% +0.8 8.53 ± 3% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
7.64 ± 3% +0.9 8.49 ± 3% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
35.29 +0.9 36.15 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
35.15 +0.9 36.02 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
42.35 +1.8 44.15 perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
42.22 +1.8 44.06 perf-profile.calltrace.cycles-pp.pipe_write.new_sync_write.vfs_write.ksys_write.do_syscall_64
38.77 +1.9 40.67 perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
38.65 +1.9 40.56 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
40.84 +2.1 42.96 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write.ksys_write
40.50 +2.1 42.65 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write.vfs_write
40.15 +2.2 42.36 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.new_sync_write
40.07 +2.2 42.29 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
37.50 +2.7 40.20 perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
37.47 +2.7 40.18 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
36.96 +2.9 39.84 perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
36.62 ± 2% +3.2 39.86 perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
34.50 +3.3 37.80 perf-profile.calltrace.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
29.96 ± 2% +4.1 34.04 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.ttwu_do_activate
29.13 ± 2% +4.1 33.22 perf-profile.calltrace.cycles-pp.__cna_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
8.30 ± 2% -1.7 6.58 perf-profile.children.cycles-pp.ksys_read
8.12 ± 2% -1.7 6.42 perf-profile.children.cycles-pp.vfs_read
7.75 ± 2% -1.7 6.06 perf-profile.children.cycles-pp.__schedule
7.59 ± 2% -1.5 6.05 perf-profile.children.cycles-pp.new_sync_read
7.45 ± 2% -1.5 5.94 perf-profile.children.cycles-pp.pipe_read
4.44 ± 3% -0.9 3.52 perf-profile.children.cycles-pp.schedule
2.65 ± 3% -0.9 1.78 ± 4% perf-profile.children.cycles-pp.read
3.70 ± 2% -0.8 2.87 perf-profile.children.cycles-pp.schedule_idle
4.28 ± 3% -0.7 3.54 perf-profile.children.cycles-pp.stack_trace_save_tsk
0.80 ± 35% -0.7 0.13 ± 5% perf-profile.children.cycles-pp.poll_idle
3.54 ± 3% -0.6 2.94 ± 2% perf-profile.children.cycles-pp.arch_stack_walk
2.02 ± 3% -0.6 1.43 ± 2% perf-profile.children.cycles-pp.update_load_avg
2.15 ± 3% -0.5 1.67 perf-profile.children.cycles-pp.pick_next_task_fair
2.30 ± 4% -0.5 1.82 perf-profile.children.cycles-pp.dequeue_task_fair
2.10 ± 4% -0.5 1.63 ± 2% perf-profile.children.cycles-pp.dequeue_entity
1.56 ± 4% -0.4 1.20 perf-profile.children.cycles-pp.menu_select
1.39 ± 3% -0.3 1.06 ± 2% perf-profile.children.cycles-pp.set_next_entity
0.46 ± 13% -0.3 0.15 ± 3% perf-profile.children.cycles-pp.sched_ttwu_pending
0.92 ± 3% -0.2 0.70 ± 2% perf-profile.children.cycles-pp.prepare_to_wait_event
1.13 -0.2 0.92 ± 3% perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
0.33 ± 9% -0.2 0.12 ± 3% perf-profile.children.cycles-pp.asm_sysvec_call_function_single
0.32 ± 10% -0.2 0.11 ± 3% perf-profile.children.cycles-pp.__sysvec_call_function_single
0.61 ± 3% -0.2 0.41 ± 4% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.32 ± 10% -0.2 0.11 ± 4% perf-profile.children.cycles-pp.sysvec_call_function_single
0.47 ± 6% -0.2 0.28 perf-profile.children.cycles-pp.finish_task_switch
0.56 ± 5% -0.2 0.36 ± 3% perf-profile.children.cycles-pp.unwind_get_return_address
0.50 ± 6% -0.2 0.32 ± 4% perf-profile.children.cycles-pp.__kernel_text_address
0.96 ± 5% -0.2 0.78 perf-profile.children.cycles-pp.update_curr
0.44 ± 6% -0.2 0.27 ± 4% perf-profile.children.cycles-pp.kernel_text_address
2.17 ± 4% -0.2 2.00 perf-profile.children.cycles-pp.unwind_next_frame
0.73 ± 3% -0.2 0.56 ± 4% perf-profile.children.cycles-pp.select_task_rq_fair
0.95 -0.2 0.79 ± 2% perf-profile.children.cycles-pp.update_rq_clock
0.74 ± 4% -0.1 0.59 ± 4% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.53 ± 3% -0.1 0.40 ± 5% perf-profile.children.cycles-pp.ktime_get
0.41 ± 4% -0.1 0.28 ± 3% perf-profile.children.cycles-pp.stack_trace_consume_entry_nosched
0.71 -0.1 0.59 ± 3% perf-profile.children.cycles-pp.mutex_lock
0.50 ± 2% -0.1 0.38 ± 3% perf-profile.children.cycles-pp.tick_nohz_idle_exit
0.44 -0.1 0.33 perf-profile.children.cycles-pp.__orc_find
0.52 ± 2% -0.1 0.41 ± 3% perf-profile.children.cycles-pp.copy_page_to_iter
0.15 ± 19% -0.1 0.05 ± 8% perf-profile.children.cycles-pp.flush_smp_call_function_from_idle
0.44 ± 4% -0.1 0.34 ± 2% perf-profile.children.cycles-pp.security_file_permission
0.53 ± 2% -0.1 0.43 perf-profile.children.cycles-pp.__switch_to
0.48 ± 3% -0.1 0.38 ± 3% perf-profile.children.cycles-pp.__switch_to_asm
0.37 ± 3% -0.1 0.27 ± 4% perf-profile.children.cycles-pp.__update_load_avg_se
0.67 ± 2% -0.1 0.57 ± 2% perf-profile.children.cycles-pp._raw_spin_lock
0.32 ± 4% -0.1 0.22 ± 4% perf-profile.children.cycles-pp.copy_page_from_iter
0.38 ± 4% -0.1 0.29 ± 5% perf-profile.children.cycles-pp.select_idle_sibling
0.45 ± 5% -0.1 0.37 ± 4% perf-profile.children.cycles-pp.tick_nohz_next_event
0.29 ± 4% -0.1 0.21 ± 3% perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.64 ± 2% -0.1 0.57 ± 3% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.38 ± 3% -0.1 0.31 ± 4% perf-profile.children.cycles-pp.copyout
0.27 ± 6% -0.1 0.19 ± 6% perf-profile.children.cycles-pp.orc_find
0.40 ± 2% -0.1 0.33 ± 5% perf-profile.children.cycles-pp.copy_user_generic_unrolled
0.35 ± 4% -0.1 0.28 perf-profile.children.cycles-pp.pick_next_entity
0.38 ± 4% -0.1 0.31 perf-profile.children.cycles-pp.update_cfs_group
0.22 ± 4% -0.1 0.16 ± 5% perf-profile.children.cycles-pp.___perf_sw_event
0.30 ± 5% -0.1 0.23 ± 3% perf-profile.children.cycles-pp.__unwind_start
0.32 ± 4% -0.1 0.26 perf-profile.children.cycles-pp.ttwu_do_wakeup
0.20 ± 4% -0.1 0.14 ± 9% perf-profile.children.cycles-pp.__might_sleep
0.28 ± 6% -0.1 0.22 ± 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.27 ± 4% -0.1 0.21 ± 3% perf-profile.children.cycles-pp.common_file_perm
0.18 ± 3% -0.1 0.12 ± 3% perf-profile.children.cycles-pp.in_sched_functions
0.30 ± 4% -0.1 0.24 perf-profile.children.cycles-pp.check_preempt_curr
0.22 ± 4% -0.1 0.17 ± 4% perf-profile.children.cycles-pp.rcu_idle_exit
0.34 ± 3% -0.1 0.28 ± 2% perf-profile.children.cycles-pp.sched_clock_cpu
0.30 ± 4% -0.1 0.24 ± 4% perf-profile.children.cycles-pp.update_ts_time_stats
0.31 ± 5% -0.1 0.25 ± 4% perf-profile.children.cycles-pp.nr_iowait_cpu
0.31 ± 3% -0.1 0.26 ± 3% perf-profile.children.cycles-pp.sched_clock
0.21 ± 5% -0.1 0.16 ± 7% perf-profile.children.cycles-pp.cpus_share_cache
0.17 ± 10% -0.1 0.11 ± 7% perf-profile.children.cycles-pp.place_entity
0.28 ± 3% -0.1 0.23 ± 2% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.18 ± 4% -0.1 0.13 ± 3% perf-profile.children.cycles-pp.resched_curr
0.33 ± 2% -0.0 0.28 ± 2% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.29 ± 3% -0.0 0.24 ± 2% perf-profile.children.cycles-pp.mutex_unlock
0.23 ± 3% -0.0 0.18 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.26 ± 3% -0.0 0.21 perf-profile.children.cycles-pp.___might_sleep
0.20 ± 6% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.__list_del_entry_valid
0.29 ± 5% -0.0 0.25 ± 3% perf-profile.children.cycles-pp.native_sched_clock
0.24 ± 5% -0.0 0.19 ± 5% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.12 ± 5% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.cpuidle_governor_latency_req
0.23 ± 8% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.21 ± 3% -0.0 0.17 ± 2% perf-profile.children.cycles-pp.read_tsc
0.14 ± 3% -0.0 0.10 ± 7% perf-profile.children.cycles-pp.rcu_eqs_exit
0.12 ± 4% -0.0 0.09 ± 7% perf-profile.children.cycles-pp.__entry_text_start
0.19 ± 2% -0.0 0.15 ± 5% perf-profile.children.cycles-pp.__fdget_pos
0.08 ± 6% -0.0 0.04 ± 58% perf-profile.children.cycles-pp.rcu_dynticks_eqs_exit
0.07 ± 10% -0.0 0.04 ± 57% perf-profile.children.cycles-pp.put_prev_entity
0.11 ± 13% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.put_prev_task_fair
0.17 ± 4% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.15 ± 7% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.hrtimer_get_next_event
0.16 ± 2% -0.0 0.14 ± 6% perf-profile.children.cycles-pp.__fget_light
0.13 ± 10% -0.0 0.10 ± 7% perf-profile.children.cycles-pp.is_bpf_text_address
0.11 ± 6% -0.0 0.08 ± 5% perf-profile.children.cycles-pp.file_update_time
0.14 ± 6% -0.0 0.11 ± 11% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.14 ± 8% -0.0 0.11 ± 7% perf-profile.children.cycles-pp.available_idle_cpu
0.09 ± 4% -0.0 0.06 ± 13% perf-profile.children.cycles-pp.menu_reflect
0.13 ± 9% -0.0 0.11 ± 6% perf-profile.children.cycles-pp.stack_access_ok
0.14 ± 5% -0.0 0.12 ± 7% perf-profile.children.cycles-pp.switch_fpu_return
0.10 ± 8% -0.0 0.08 ± 6% perf-profile.children.cycles-pp.current_time
0.09 ± 9% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.10 -0.0 0.08 perf-profile.children.cycles-pp.__calc_delta
0.09 ± 10% -0.0 0.07 ± 7% perf-profile.children.cycles-pp.bpf_ksym_find
0.07 ± 10% -0.0 0.05 perf-profile.children.cycles-pp.pick_next_task_idle
0.18 ± 3% -0.0 0.16 ± 2% perf-profile.children.cycles-pp.fsnotify
0.17 ± 5% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.copy_fpregs_to_fpstate
0.07 ± 6% -0.0 0.05 perf-profile.children.cycles-pp.put_task_stack
0.07 ± 6% -0.0 0.05 perf-profile.children.cycles-pp.apparmor_file_permission
0.07 ± 12% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.07 ± 6% -0.0 0.05 ± 8% perf-profile.children.cycles-pp.update_min_vruntime
0.17 ± 2% -0.0 0.16 perf-profile.children.cycles-pp.anon_pipe_buf_release
0.07 ± 5% -0.0 0.06 perf-profile.children.cycles-pp.atime_needs_update
0.08 ± 5% -0.0 0.07 perf-profile.children.cycles-pp.finish_wait
0.48 ± 14% +0.2 0.71 ± 8% perf-profile.children.cycles-pp.start_kernel
46.28 +0.5 46.74 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
46.28 +0.5 46.74 perf-profile.children.cycles-pp.cpu_startup_entry
46.25 +0.5 46.71 perf-profile.children.cycles-pp.do_idle
7.88 ± 2% +0.8 8.65 ± 3% perf-profile.children.cycles-pp.write
42.99 +1.7 44.69 perf-profile.children.cycles-pp.ksys_write
42.80 +1.7 44.53 perf-profile.children.cycles-pp.vfs_write
42.37 +1.8 44.16 perf-profile.children.cycles-pp.new_sync_write
42.23 +1.8 44.06 perf-profile.children.cycles-pp.pipe_write
39.21 +2.1 41.33 perf-profile.children.cycles-pp.cpuidle_enter
40.84 +2.1 42.96 perf-profile.children.cycles-pp.__wake_up_common_lock
39.20 +2.1 41.32 perf-profile.children.cycles-pp.cpuidle_enter_state
40.50 +2.2 42.65 perf-profile.children.cycles-pp.__wake_up_common
40.15 +2.2 42.36 perf-profile.children.cycles-pp.autoremove_wake_function
40.09 +2.2 42.30 perf-profile.children.cycles-pp.try_to_wake_up
37.97 +2.4 40.36 perf-profile.children.cycles-pp.ttwu_do_activate
37.94 +2.4 40.33 perf-profile.children.cycles-pp.enqueue_task_fair
37.50 +2.5 40.05 perf-profile.children.cycles-pp.enqueue_entity
36.91 +2.9 39.86 perf-profile.children.cycles-pp.intel_idle
34.95 +3.0 37.95 perf-profile.children.cycles-pp.__account_scheduler_latency
31.46 +3.5 35.00 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
29.71 ± 2% +3.8 33.52 perf-profile.children.cycles-pp.__cna_queued_spin_lock_slowpath
0.71 ± 39% -0.7 0.05 ± 8% perf-profile.self.cycles-pp.poll_idle
1.08 ± 3% -0.3 0.78 perf-profile.self.cycles-pp.update_load_avg
1.24 ± 2% -0.2 1.02 ± 2% perf-profile.self.cycles-pp.__schedule
1.86 -0.2 1.65 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.59 ± 3% -0.2 0.40 ± 4% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.95 ± 3% -0.2 0.75 ± 2% perf-profile.self.cycles-pp.set_next_entity
0.66 ± 4% -0.2 0.51 ± 6% perf-profile.self.cycles-pp.menu_select
0.43 ± 5% -0.1 0.28 ± 3% perf-profile.self.cycles-pp.enqueue_task_fair
0.53 ± 3% -0.1 0.40 ± 2% perf-profile.self.cycles-pp._raw_spin_lock
0.67 ± 2% -0.1 0.54 ± 2% perf-profile.self.cycles-pp.stack_trace_save_tsk
0.77 ± 2% -0.1 0.64 ± 2% perf-profile.self.cycles-pp.update_rq_clock
0.72 ± 8% -0.1 0.60 perf-profile.self.cycles-pp.update_curr
0.44 -0.1 0.33 perf-profile.self.cycles-pp.__orc_find
0.56 ± 2% -0.1 0.45 ± 3% perf-profile.self.cycles-pp.pipe_read
0.33 ± 4% -0.1 0.22 perf-profile.self.cycles-pp.prepare_to_wait_event
0.48 ± 3% -0.1 0.38 ± 3% perf-profile.self.cycles-pp.__switch_to_asm
0.32 ± 2% -0.1 0.22 ± 7% perf-profile.self.cycles-pp.ktime_get
0.47 -0.1 0.38 ± 2% perf-profile.self.cycles-pp.__switch_to
0.35 ± 2% -0.1 0.26 ± 5% perf-profile.self.cycles-pp.select_task_rq_fair
0.28 ± 5% -0.1 0.20 ± 3% perf-profile.self.cycles-pp.dequeue_entity
0.23 ± 6% -0.1 0.15 ± 3% perf-profile.self.cycles-pp.stack_trace_consume_entry_nosched
0.46 ± 3% -0.1 0.39 ± 5% perf-profile.self.cycles-pp.mutex_lock
0.32 ± 4% -0.1 0.25 ± 4% perf-profile.self.cycles-pp.__update_load_avg_se
0.39 ± 3% -0.1 0.32 ± 6% perf-profile.self.cycles-pp.copy_user_generic_unrolled
0.45 ± 3% -0.1 0.38 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
0.19 ± 6% -0.1 0.12 ± 8% perf-profile.self.cycles-pp.vfs_read
0.34 ± 4% -0.1 0.27 ± 2% perf-profile.self.cycles-pp.pick_next_entity
0.84 ± 2% -0.1 0.77 perf-profile.self.cycles-pp.enqueue_entity
0.28 ± 5% -0.1 0.21 ± 5% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.19 ± 4% -0.1 0.12 ± 10% perf-profile.self.cycles-pp.__might_sleep
0.35 ± 3% -0.1 0.29 perf-profile.self.cycles-pp.__wake_up_common
0.19 ± 4% -0.1 0.12 ± 6% perf-profile.self.cycles-pp.___perf_sw_event
0.47 ± 2% -0.1 0.41 ± 2% perf-profile.self.cycles-pp.do_idle
0.27 ± 4% -0.1 0.21 ± 3% perf-profile.self.cycles-pp.__unwind_start
0.22 ± 6% -0.1 0.16 ± 2% perf-profile.self.cycles-pp.finish_task_switch
0.34 ± 3% -0.1 0.29 perf-profile.self.cycles-pp.schedule
0.35 ± 6% -0.1 0.29 ± 2% perf-profile.self.cycles-pp.update_cfs_group
0.24 ± 6% -0.1 0.19 ± 4% perf-profile.self.cycles-pp.orc_find
0.21 ± 5% -0.1 0.16 ± 7% perf-profile.self.cycles-pp.cpus_share_cache
0.30 ± 7% -0.1 0.25 ± 5% perf-profile.self.cycles-pp.nr_iowait_cpu
0.18 ± 4% -0.1 0.13 perf-profile.self.cycles-pp.resched_curr
0.29 ± 3% -0.1 0.24 ± 2% perf-profile.self.cycles-pp.mutex_unlock
0.16 ± 9% -0.1 0.11 ± 6% perf-profile.self.cycles-pp.place_entity
0.32 ± 5% -0.0 0.27 perf-profile.self.cycles-pp.cpuidle_enter_state
0.23 ± 3% -0.0 0.18 ± 3% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.22 ± 4% -0.0 0.18 ± 4% perf-profile.self.cycles-pp.common_file_perm
0.12 ± 3% -0.0 0.08 ± 11% perf-profile.self.cycles-pp.in_sched_functions
0.28 ± 3% -0.0 0.24 ± 3% perf-profile.self.cycles-pp.native_sched_clock
0.20 ± 4% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.__list_del_entry_valid
0.12 ± 5% -0.0 0.08 ± 6% perf-profile.self.cycles-pp.new_sync_write
0.25 -0.0 0.21 perf-profile.self.cycles-pp.___might_sleep
0.20 ± 4% -0.0 0.16 ± 5% perf-profile.self.cycles-pp.vfs_write
0.07 ± 7% -0.0 0.03 ±100% perf-profile.self.cycles-pp.main
0.29 ± 2% -0.0 0.25 ± 4% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.21 ± 2% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.read_tsc
0.07 ± 5% -0.0 0.04 ± 58% perf-profile.self.cycles-pp.rcu_dynticks_eqs_exit
0.12 ± 6% -0.0 0.09 ± 7% perf-profile.self.cycles-pp.new_sync_read
0.21 ± 2% -0.0 0.18 ± 6% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.12 ± 6% -0.0 0.10 ± 9% perf-profile.self.cycles-pp.arch_stack_walk
0.07 ± 6% -0.0 0.04 ± 57% perf-profile.self.cycles-pp.update_min_vruntime
0.11 ± 4% -0.0 0.08 ± 10% perf-profile.self.cycles-pp.kernel_text_address
0.23 ± 7% -0.0 0.21 ± 5% perf-profile.self.cycles-pp.__account_scheduler_latency
0.14 ± 6% -0.0 0.11 ± 11% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.09 ± 9% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.__entry_text_start
0.08 ± 5% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.copy_page_to_iter
0.19 ± 6% -0.0 0.17 ± 5% perf-profile.self.cycles-pp.pipe_write
0.15 ± 3% -0.0 0.13 ± 5% perf-profile.self.cycles-pp.__fget_light
0.06 ± 6% -0.0 0.04 ± 57% perf-profile.self.cycles-pp.unwind_get_return_address
0.14 ± 7% -0.0 0.12 ± 7% perf-profile.self.cycles-pp.switch_fpu_return
0.09 ± 9% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.tick_nohz_next_event
0.08 ± 11% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.__hrtimer_next_event_base
0.16 -0.0 0.14 ± 6% perf-profile.self.cycles-pp.pick_next_task_fair
0.09 ± 9% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.06 -0.0 0.04 ± 57% perf-profile.self.cycles-pp.copy_page_from_iter
0.14 ± 6% -0.0 0.11 ± 7% perf-profile.self.cycles-pp.available_idle_cpu
0.08 ± 16% -0.0 0.06 ± 7% perf-profile.self.cycles-pp.call_cpuidle
0.10 ± 8% -0.0 0.08 ± 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.09 ± 5% -0.0 0.07 ± 7% perf-profile.self.cycles-pp.rcu_idle_exit
0.19 ± 3% -0.0 0.17 ± 4% perf-profile.self.cycles-pp.dequeue_task_fair
0.10 ± 4% -0.0 0.08 perf-profile.self.cycles-pp.__calc_delta
0.17 ± 2% -0.0 0.16 ± 2% perf-profile.self.cycles-pp.anon_pipe_buf_release
0.17 ± 4% -0.0 0.15 ± 2% perf-profile.self.cycles-pp.copy_fpregs_to_fpstate
0.06 ± 6% -0.0 0.05 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.06 ± 6% -0.0 0.05 perf-profile.self.cycles-pp.put_task_stack
36.91 +2.9 39.86 perf-profile.self.cycles-pp.intel_idle
29.30 ± 2% +3.9 33.15 perf-profile.self.cycles-pp.__cna_queued_spin_lock_slowpath
unixbench.time.voluntary_context_switches
4.4e+08 +-----------------------------------------------------------------+
| +.. +..+. ..|
4.3e+08 |-+ : + + |
4.2e+08 |-+ : + |
| : |
4.1e+08 |-+ : |
4e+08 |-+ +. .+.. .+..+.+..+. : |
| .. +..+.+..+.+. + +..+ |
3.9e+08 |..+.+..+.+..+.+ |
3.8e+08 |-+ |
| |
3.7e+08 |-+ O O O O O |
3.6e+08 |-+ O O O O O O O O O O O |
| O O O O O O |
3.5e+08 +-----------------------------------------------------------------+
unixbench.score
3800 +--------------------------------------------------------------------+
| .+. .|
3700 |-+ +. .+. +. |
3600 |-+ : +. |
| : |
3500 |-+ : |
3400 |-+ .+.+..+..+. : |
| .+.+..+..+.+..+..+.+. +..+ |
3300 |..+.+..+..+.+..+. |
3200 |-+ |
| |
3100 |-+ O O O O O O O O O O O O O O |
3000 |-+O O O O O O |
| O O O O O |
2900 +--------------------------------------------------------------------+
unixbench.workload
6e+08 +-----------------------------------------------------------------+
| |
5.8e+08 |-+ +.. +..+. ..|
| : + + |
5.6e+08 |-+ : + |
| : |
5.4e+08 |-+ : |
| .+. .+.. .+..+.+..+.+..+. : |
5.2e+08 |.. .+.. .+.+. +..+ +.+. +..+ |
| + +.+. |
5e+08 |-+ |
| O O O O O |
4.8e+08 |-+ O O O O O O O O O O O |
| O O O O O O O O |
4.6e+08 +-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
Attachment: "0006-locking-qspinlock-Introduce-the-shuffle-reduction-op.patch" (text/x-patch, 3049 bytes)