Message-ID: <20200803094257.GA23458@shao2-debian>
Date: Mon, 3 Aug 2020 17:42:57 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Dan Williams <dan.j.williams@...el.com>
Cc: tglx@...utronix.de, mingo@...hat.com, vishal.l.verma@...el.com,
x86@...nel.org, stable@...r.kernel.org,
Borislav Petkov <bp@...en8.de>,
Vivek Goyal <vgoyal@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Tony Luck <tony.luck@...el.com>,
Erwin Tsaur <erwin.tsaur@...el.com>, linux-nvdimm@...ts.01.org,
linux-kernel@...r.kernel.org, 0day robot <lkp@...el.com>,
lkp@...ts.01.org
Subject: [x86/copy_mc] a0ac629ebe: fio.read_iops -43.3% regression
Greetings,
FYI, we noticed a -43.3% regression of fio.read_iops due to commit:
commit: a0ac629ebe7b3d248cb93807782a00d9142fdb98 ("x86/copy_mc: Introduce copy_mc_generic()")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/Renovate-memcpy_mcsafe-with-copy_mc_to_-user-kernel/20200802-014046
in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with following parameters:
disk: 2pmem
fs: xfs
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: read
bs: 2M
ioengine: libaio
test_size: 200G
cpufreq_governor: performance
ucode: 0x5002f01
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
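For orientation, the job parameters above map onto a plain fio invocation roughly like the one below. This is only a sketch assembled from the listed parameters, not the harness's actual command line: the job name and mount point are placeholders, and nr_task=50% of 96 threads is rendered as numjobs=48. The authoritative configuration is the job.yaml attached to this email; the second result set further down differs only in ioengine=sync.
	# sketch only: approximate the workload outside the lkp harness,
	# assuming an xfs filesystem mounted with -o dax on one of the pmem disks
	fio --name=pmem-read \
	    --directory=/fs/pmem0 \
	    --rw=read --bs=2M --ioengine=libaio \
	    --size=200G --runtime=200 --time_based \
	    --numjobs=48          # nr_task=50% of 96 CPU threads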
In addition to that, the commit also has a significant impact on the following tests:
+------------------+----------------------------------------------------------------------+
| testcase: change | fio-basic: fio.read_iops -55.6% regression |
| test machine | 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory |
| test parameters | bs=2M |
| | cpufreq_governor=performance |
| | disk=2pmem |
| | fs=xfs |
| | ioengine=sync |
| | mount_option=dax |
| | nr_task=50% |
| | runtime=200s |
| | rw=read |
| | test_size=200G |
| | time_based=tb |
| | ucode=0x5002f01 |
+------------------+----------------------------------------------------------------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml	# installs the job's dependencies; job file is attached in this email
	bin/lkp run job.yaml
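Independently of the harness, a quick way to confirm which copy routine the running kernel spends its time in (copy_mc_fragile() vs. copy_mc_generic(), as in the perf-profile data below) is an ordinary system-wide profile taken while fio is running. A generic perf sketch, not part of the lkp workflow:
	# sample all CPUs with call graphs for 30s while the benchmark is running
	perf record -a -g -- sleep 30
	perf report --sort symbol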
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
2M/gcc-9/performance/2pmem/xfs/libaio/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01
commit:
7476b91d4d ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
a0ac629ebe ("x86/copy_mc: Introduce copy_mc_generic()")
7476b91d4db369d8 a0ac629ebe7b3d248cb93807782
---------------- ---------------------------
%stddev (parent)      %change      %stddev (patched)
97.22 -96.0 1.19 ± 21% fio.latency_100ms%
0.14 -0.1 0.05 fio.latency_10ms%
0.27 ± 13% -0.1 0.14 fio.latency_20ms%
0.04 ± 6% -0.0 0.03 ± 12% fio.latency_20us%
1.00 ± 28% +96.6 97.57 fio.latency_250ms%
0.05 -0.0 0.05 fio.latency_4ms%
0.02 ± 48% +0.3 0.31 ± 15% fio.latency_500ms%
1.25 ± 47% -0.6 0.63 ± 11% fio.latency_50ms%
0.01 ± 9% +0.0 0.02 ± 24% fio.latency_50us%
44292 -43.3% 25124 fio.read_bw_MBps
67895296 +76.8% 1.201e+08 fio.read_clat_90%_us
68681728 +76.7% 1.214e+08 fio.read_clat_95%_us
98304000 ± 19% +80.3% 1.772e+08 ± 4% fio.read_clat_99%_us
66674508 +76.2% 1.175e+08 fio.read_clat_mean_us
9950116 ± 12% +80.3% 17935634 fio.read_clat_stddev
22146 -43.3% 12562 fio.read_iops
2152824 +76.8% 3805428 fio.read_slat_mean_us
291719 ± 14% +86.6% 544324 fio.read_slat_stddev
12923 -2.5% 12594 fio.time.involuntary_context_switches
77.65 ± 3% -39.1% 47.29 fio.time.user_time
4429275 -43.3% 2512537 fio.workload
0.14 ± 3% +0.0 0.16 ± 4% mpstat.cpu.all.soft%
0.47 ± 3% -0.2 0.31 mpstat.cpu.all.usr%
53185 ± 91% +121.2% 117642 ± 40% numa-vmstat.node0.numa_other
122640 ± 39% -52.6% 58092 ± 81% numa-vmstat.node1.numa_other
60096 +1.5% 61021 proc-vmstat.nr_slab_unreclaimable
20103 ± 5% -17.9% 16495 ± 12% proc-vmstat.pgactivate
49.00 -2.0% 48.00 vmstat.cpu.id
1612 -1.6% 1587 vmstat.system.cs
2713 ± 4% +8.0% 2931 ± 4% slabinfo.PING.active_objs
2713 ± 4% +8.0% 2931 ± 4% slabinfo.PING.num_objs
1164 ± 9% +16.8% 1360 ± 6% slabinfo.task_group.active_objs
1164 ± 9% +16.8% 1360 ± 6% slabinfo.task_group.num_objs
379.25 ± 85% +279.7% 1439 ± 75% sched_debug.cfs_rq:/.exec_clock.min
29948 ± 5% -15.5% 25309 ± 5% sched_debug.cfs_rq:/.exec_clock.stddev
21606 ± 7% +25.1% 27034 ± 7% sched_debug.cfs_rq:/.min_vruntime.min
33321 ± 6% -16.5% 27820 ± 6% sched_debug.cfs_rq:/.min_vruntime.stddev
13783 ±109% +184.1% 39158 ± 20% sched_debug.cfs_rq:/.spread0.avg
-38497 -76.6% -9012 sched_debug.cfs_rq:/.spread0.min
33321 ± 6% -16.5% 27820 ± 6% sched_debug.cfs_rq:/.spread0.stddev
12.22 ± 10% +27.9% 15.62 ± 3% sched_debug.cpu.clock.stddev
3716 ±173% -100.0% 1.50 ± 57% softirqs.CPU10.NET_RX
17411 ± 36% -41.8% 10126 ± 19% softirqs.CPU24.SCHED
9179 ± 67% +87.1% 17173 ± 23% softirqs.CPU35.SCHED
9611 ± 34% -58.9% 3951 ± 10% softirqs.CPU48.SCHED
17177 ± 30% -42.6% 9864 ± 37% softirqs.CPU69.SCHED
86644 ± 29% -22.3% 67339 ± 5% softirqs.CPU76.TIMER
6339 ± 66% +115.9% 13686 ± 31% softirqs.CPU78.SCHED
10156 ± 64% +91.8% 19477 ± 25% softirqs.CPU81.SCHED
1239 ±172% -100.0% 0.00 interrupts.62:PCI-MSI.31981595-edge.i40e-eth0-TxRx-26
47482 +5.4% 50055 ± 4% interrupts.CAL:Function_call_interrupts
209.00 ± 23% -50.4% 103.75 ± 8% interrupts.CPU0.RES:Rescheduling_interrupts
146.25 ± 16% -27.4% 106.25 ± 16% interrupts.CPU15.RES:Rescheduling_interrupts
168.75 ± 81% -64.6% 59.75 ± 33% interrupts.CPU15.TLB:TLB_shootdowns
7321 ± 5% -52.7% 3461 ± 39% interrupts.CPU20.NMI:Non-maskable_interrupts
7321 ± 5% -52.7% 3461 ± 39% interrupts.CPU20.PMI:Performance_monitoring_interrupts
6665 ± 14% -61.2% 2586 ± 26% interrupts.CPU21.NMI:Non-maskable_interrupts
6665 ± 14% -61.2% 2586 ± 26% interrupts.CPU21.PMI:Performance_monitoring_interrupts
64.50 ± 23% +41.9% 91.50 ± 22% interrupts.CPU21.TLB:TLB_shootdowns
100.00 ± 41% +66.0% 166.00 ± 9% interrupts.CPU24.RES:Rescheduling_interrupts
1238 ±173% -100.0% 0.00 interrupts.CPU26.62:PCI-MSI.31981595-edge.i40e-eth0-TxRx-26
438.25 ± 4% +16.1% 509.00 ± 18% interrupts.CPU28.CAL:Function_call_interrupts
145.50 ± 20% -34.4% 95.50 ± 25% interrupts.CPU35.RES:Rescheduling_interrupts
7134 ± 11% -28.3% 5118 ± 19% interrupts.CPU41.NMI:Non-maskable_interrupts
7134 ± 11% -28.3% 5118 ± 19% interrupts.CPU41.PMI:Performance_monitoring_interrupts
107.75 ± 34% -47.3% 56.75 ± 40% interrupts.CPU93.RES:Rescheduling_interrupts
63.18 ± 12% -26.1 37.12 ± 15% perf-profile.calltrace.cycles-pp.copy_mc_fragile.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
0.00 +3.7 3.72 ± 52% perf-profile.calltrace.cycles-pp.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
0.00 +37.8 37.83 ± 12% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter
63.34 ± 12% -26.2 37.14 ± 15% perf-profile.children.cycles-pp.copy_mc_fragile
2.41 ±112% -2.2 0.25 ±108% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
2.26 ±109% -2.0 0.29 ± 89% perf-profile.children.cycles-pp.asm_call_on_stack
2.15 ±112% -1.9 0.23 ±110% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
2.12 ±113% -1.9 0.23 ±110% perf-profile.children.cycles-pp.hrtimer_interrupt
1.68 ±114% -1.5 0.17 ±119% perf-profile.children.cycles-pp.__hrtimer_run_queues
1.48 ±123% -1.3 0.15 ±121% perf-profile.children.cycles-pp.tick_sched_timer
1.34 ±120% -1.2 0.14 ±122% perf-profile.children.cycles-pp.tick_sched_handle
1.28 ±119% -1.1 0.14 ±122% perf-profile.children.cycles-pp.update_process_times
0.70 ±107% -0.6 0.10 ±120% perf-profile.children.cycles-pp.scheduler_tick
2.65 ±106% +16.5 19.13 ± 12% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.00 +22.6 22.58 ± 7% perf-profile.children.cycles-pp.copy_mc_generic
62.52 ± 12% -25.5 37.00 ± 15% perf-profile.self.cycles-pp.copy_mc_fragile
0.00 +22.4 22.41 ± 6% perf-profile.self.cycles-pp.copy_mc_generic
42.43 +68.7% 71.58 perf-stat.i.MPKI
5.949e+09 -42.2% 3.44e+09 perf-stat.i.branch-instructions
0.07 +0.0 0.10 ± 5% perf-stat.i.branch-miss-rate%
3554006 ± 2% -7.5% 3286479 ± 3% perf-stat.i.branch-misses
95.02 -2.4 92.63 perf-stat.i.cache-miss-rate%
1.444e+09 -5.2% 1.369e+09 perf-stat.i.cache-misses
1.513e+09 -2.8% 1.471e+09 perf-stat.i.cache-references
3.81 +72.5% 6.58 perf-stat.i.cpi
102.49 +4.5% 107.13 perf-stat.i.cycles-between-cache-misses
0.00 ± 4% +0.0 0.00 ± 41% perf-stat.i.dTLB-load-miss-rate%
6.03e+09 -42.0% 3.495e+09 perf-stat.i.dTLB-loads
0.00 ± 5% +0.0 0.00 ± 7% perf-stat.i.dTLB-store-miss-rate%
5.909e+09 -42.5% 3.4e+09 perf-stat.i.dTLB-stores
47.00 +1.4 48.45 perf-stat.i.iTLB-load-miss-rate%
2270674 -11.0% 2021114 perf-stat.i.iTLB-load-misses
2563127 -16.0% 2151931 perf-stat.i.iTLB-loads
3.548e+10 -42.4% 2.044e+10 perf-stat.i.instructions
15634 -35.2% 10127 perf-stat.i.instructions-per-iTLB-miss
0.26 -41.6% 0.15 perf-stat.i.ipc
207.77 -37.5% 129.85 perf-stat.i.metric.M/sec
78061415 ± 13% +98.0% 1.546e+08 ± 20% perf-stat.i.node-load-misses
85582855 ± 11% +58.1% 1.353e+08 ± 20% perf-stat.i.node-loads
3.817e+08 -2.8% 3.709e+08 perf-stat.i.node-stores
42.66 +68.7% 71.96 perf-stat.overall.MPKI
0.06 +0.0 0.09 ± 3% perf-stat.overall.branch-miss-rate%
95.45 -2.4 93.07 perf-stat.overall.cache-miss-rate%
3.81 +73.0% 6.59 perf-stat.overall.cpi
93.55 +5.2% 98.41 perf-stat.overall.cycles-between-cache-misses
0.00 ± 5% +0.0 0.00 ± 13% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 5% +0.0 0.00 ± 2% perf-stat.overall.dTLB-store-miss-rate%
46.98 +1.5 48.43 perf-stat.overall.iTLB-load-miss-rate%
15639 -35.2% 10127 perf-stat.overall.instructions-per-iTLB-miss
0.26 -42.2% 0.15 perf-stat.overall.ipc
1605743 +1.5% 1630326 perf-stat.overall.path-length
5.919e+09 -42.2% 3.422e+09 perf-stat.ps.branch-instructions
3519866 ± 2% -7.8% 3245208 ± 3% perf-stat.ps.branch-misses
1.437e+09 -5.2% 1.362e+09 perf-stat.ps.cache-misses
1.506e+09 -2.8% 1.463e+09 perf-stat.ps.cache-references
1552 -1.4% 1530 perf-stat.ps.context-switches
6e+09 -42.1% 3.477e+09 perf-stat.ps.dTLB-loads
5.88e+09 -42.5% 3.382e+09 perf-stat.ps.dTLB-stores
2257568 -11.0% 2008542 perf-stat.ps.iTLB-load-misses
2547705 -16.1% 2138603 perf-stat.ps.iTLB-loads
3.53e+10 -42.4% 2.034e+10 perf-stat.ps.instructions
77685715 ± 13% +97.9% 1.538e+08 ± 20% perf-stat.ps.node-load-misses
85143339 ± 11% +58.1% 1.346e+08 ± 20% perf-stat.ps.node-loads
3.797e+08 -2.8% 3.69e+08 perf-stat.ps.node-stores
7.112e+12 -42.4% 4.096e+12 perf-stat.total.instructions
fio.read_bw_MBps
46000 +-------------------------------------------------------------------+
44000 |..+.+..+.+..+..+.+..+..+.+.. .+.. .+.+..+.+..+ |
| + +. |
42000 |-+ |
40000 |-+ |
38000 |-+ |
36000 |-+ |
| |
34000 |-+ |
32000 |-+ |
30000 |-+ |
28000 |-+ |
| |
26000 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
24000 +-------------------------------------------------------------------+
fio.read_iops
23000 +-------------------------------------------------------------------+
22000 |..+.+..+.+..+..+.+..+..+.+.. .+.. .+.+..+.+..+ |
| + +. |
21000 |-+ |
20000 |-+ |
19000 |-+ |
18000 |-+ |
| |
17000 |-+ |
16000 |-+ |
15000 |-+ |
14000 |-+ |
| |
13000 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
12000 +-------------------------------------------------------------------+
fio.read_clat_mean_us
1.2e+08 +-----------------------------------------------------------------+
| O O O O O O O O O O O O O O O O O O O O O O O O |
1.1e+08 |-+ |
| |
| |
1e+08 |-+ |
| |
9e+07 |-+ |
| |
8e+07 |-+ |
| |
| |
7e+07 |..+.+.. .+..+. .+..+.+..+..+.+..+.+.. |
| +.+..+ +..+ + |
6e+07 +-----------------------------------------------------------------+
fio.read_clat_90__us
1.3e+08 +-----------------------------------------------------------------+
| |
1.2e+08 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
| |
1.1e+08 |-+ |
| |
1e+08 |-+ |
| |
9e+07 |-+ |
| |
8e+07 |-+ |
| |
7e+07 |..+.+..+.+..+.+..+.+..+.+..+.+..+..+.+..+.+..+ |
| |
6e+07 +-----------------------------------------------------------------+
fio.read_clat_95__us
1.3e+08 +-----------------------------------------------------------------+
| O O O O O |
1.2e+08 |-+O O O O O O O O O O O O O O O O O O O O |
| |
1.1e+08 |-+ |
| |
1e+08 |-+ |
| |
9e+07 |-+ |
| |
8e+07 |-+ |
| .+. .+.. |
7e+07 |..+.+..+.+..+.+..+.+..+.+. +. +.+..+.+..+ |
| |
6e+07 +-----------------------------------------------------------------+
fio.read_slat_mean_us
4e+06 +-----------------------------------------------------------------+
3.8e+06 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
| |
3.6e+06 |-+ |
3.4e+06 |-+ |
| |
3.2e+06 |-+ |
3e+06 |-+ |
2.8e+06 |-+ |
| |
2.6e+06 |-+ |
2.4e+06 |-+ |
| |
2.2e+06 |..+.+..+.+..+.+..+.+..+.+..+.+..+..+.+..+.+..+ |
2e+06 +-----------------------------------------------------------------+
fio.latency_10ms_
0.15 +--------------------------------------------------------------------+
0.14 |..+.+..+..+.+..+..+.+..+..+.+..+..+.+..+.+..+..+ |
| |
0.13 |-+ |
0.12 |-+ |
0.11 |-+ |
0.1 |-+ |
| |
0.09 |-+ |
0.08 |-+ |
0.07 |-+ |
0.06 |-+ |
| |
0.05 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
0.04 +--------------------------------------------------------------------+
fio.latency_20ms_
0.55 +--------------------------------------------------------------------+
| + + + |
0.5 |-+ : + + :: |
0.45 |++ : + + + : : |
| + : + + : |
0.4 |-+ : : : : : |
0.35 |-+ : : : : : |
| : : : : : + |
0.3 |-+ : : : +. : + + |
0.25 |-+ : : .. +.. : + + |
| + +..+ +..+ +..+ |
0.2 |-+ |
0.15 |-+ |
| O O O O O O O O O O O O O O O O O O O O O O O O O |
0.1 +--------------------------------------------------------------------+
fio.latency_100ms_
100 +---------------------------------------------------------------------+
90 |-+ + +.+..+..+ +. |
| |
80 |-+ |
70 |-+ |
| |
60 |-+ |
50 |-+ |
40 |-+ |
| |
30 |-+ |
20 |-+ |
| |
10 |-+ |
0 +---------------------------------------------------------------------+
fio.latency_250ms_
100 +---------------------------------------------------------------------+
90 |-+ O |
| |
80 |-+ |
70 |-+ |
| |
60 |-+ |
50 |-+ |
40 |-+ |
| |
30 |-+ |
20 |-+ |
| |
10 |-+ |
0 +---------------------------------------------------------------------+
fio.workload
4.6e+06 +-----------------------------------------------------------------+
4.4e+06 |..+.+..+.+..+.+..+.+..+.+.. .+.. .+.+..+.+..+ |
| + +. |
4.2e+06 |-+ |
4e+06 |-+ |
3.8e+06 |-+ |
3.6e+06 |-+ |
| |
3.4e+06 |-+ |
3.2e+06 |-+ |
3e+06 |-+ |
2.8e+06 |-+ |
| |
2.6e+06 |-+O O O O O O O O O O O O O O O O O O O O O O O O O |
2.4e+06 +-----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-csl-2sp6: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
2M/gcc-9/performance/2pmem/xfs/sync/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01
commit:
7476b91d4d ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
a0ac629ebe ("x86/copy_mc: Introduce copy_mc_generic()")
7476b91d4db369d8 a0ac629ebe7b3d248cb93807782
---------------- ---------------------------
%stddev (parent)      %change      %stddev (patched)
0.61 ± 15% -0.4 0.22 ± 94% fio.latency_1000us%
0.01 ± 11% +1.3 1.27 ± 25% fio.latency_10ms%
96.06 -95.5 0.60 ± 80% fio.latency_2ms%
1.27 ± 33% +96.2 97.48 fio.latency_4ms%
1.29 ± 55% -1.2 0.05 ± 54% fio.latency_500us%
75143 -55.6% 33381 fio.read_bw_MBps
1372160 +118.5% 2998272 fio.read_clat_90%_us
1409024 +116.9% 3055616 fio.read_clat_95%_us
2142208 ± 19% +120.3% 4718592 ± 17% fio.read_clat_99%_us
1272849 +125.4% 2869293 fio.read_clat_mean_us
228201 ± 15% +103.6% 464620 ± 14% fio.read_clat_stddev
37571 -55.6% 16690 fio.read_iops
69.28 ± 2% -40.3% 41.38 ± 3% fio.time.user_time
7514438 -55.6% 3338252 fio.workload
0.11 ± 3% +0.0 0.14 ± 5% mpstat.cpu.all.soft%
0.43 ± 3% -0.1 0.28 ± 2% mpstat.cpu.all.usr%
115069 -2.3% 112454 proc-vmstat.nr_shmem
20846 ± 6% -27.8% 15052 ± 3% proc-vmstat.pgactivate
967.50 ± 27% -50.0% 483.75 ± 78% slabinfo.xfs_buf_item.active_objs
967.50 ± 27% -50.0% 483.75 ± 78% slabinfo.xfs_buf_item.num_objs
100.00 -2.0% 98.00 vmstat.io.bo
1672 -3.3% 1616 vmstat.system.cs
9.059e+09 ± 6% -32.3% 6.131e+09 ± 54% cpuidle.C1E.time
19004364 ± 3% -22.4% 14741281 ± 34% cpuidle.C1E.usage
4.034e+08 ±133% +713.0% 3.28e+09 ±100% cpuidle.C6.time
570211 ±122% +571.6% 3829822 ± 86% cpuidle.C6.usage
61.80 ± 9% -17.6 44.19 perf-profile.calltrace.cycles-pp.copy_mc_fragile.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
0.00 +7.8 7.81 ± 6% perf-profile.calltrace.cycles-pp.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
0.00 +29.2 29.21 ± 5% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter
61.92 ± 9% -17.7 44.25 perf-profile.children.cycles-pp.copy_mc_fragile
3.47 ±132% +11.7 15.21 ± 5% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
0.00 +22.3 22.32 perf-profile.children.cycles-pp.copy_mc_generic
61.16 ± 9% -17.4 43.78 perf-profile.self.cycles-pp.copy_mc_fragile
0.00 +22.1 22.09 perf-profile.self.cycles-pp.copy_mc_generic
212.00 ± 38% +288.6% 823.90 ± 67% sched_debug.cfs_rq:/.exec_clock.min
34013 ± 3% -17.1% 28181 ± 2% sched_debug.cfs_rq:/.exec_clock.stddev
36118 ± 5% -15.0% 30710 ± 2% sched_debug.cfs_rq:/.min_vruntime.stddev
36118 ± 5% -15.0% 30707 ± 2% sched_debug.cfs_rq:/.spread0.stddev
9.52 ± 11% +33.8% 12.73 ± 9% sched_debug.cpu.clock.stddev
17832 ± 13% -47.5% 9368 ± 17% sched_debug.cpu.sched_count.max
2475 ± 9% -34.4% 1624 ± 8% sched_debug.cpu.sched_count.stddev
8858 ± 13% -48.3% 4577 ± 18% sched_debug.cpu.sched_goidle.max
1260 ± 9% -33.2% 841.68 ± 8% sched_debug.cpu.sched_goidle.stddev
8285 ± 16% -32.1% 5622 ± 7% sched_debug.cpu.ttwu_count.max
1169 ± 9% -24.9% 878.40 ± 4% sched_debug.cpu.ttwu_count.stddev
26587 ± 8% -21.9% 20773 ± 22% softirqs.CPU1.SCHED
19906 ± 37% -55.7% 8824 ± 96% softirqs.CPU10.SCHED
21997 ± 34% -82.2% 3910 ± 55% softirqs.CPU20.SCHED
5126 ± 70% +166.6% 13666 ± 15% softirqs.CPU30.SCHED
5567 ± 56% +165.3% 14772 ± 29% softirqs.CPU31.SCHED
10027 ± 35% +101.3% 20182 ± 18% softirqs.CPU33.SCHED
4868 ± 50% +112.6% 10349 ± 14% softirqs.CPU44.SCHED
6304 ± 60% +154.5% 16043 ± 22% softirqs.CPU46.SCHED
4127 ± 76% +198.6% 12326 ± 32% softirqs.CPU49.SCHED
6313 ± 62% +98.5% 12530 ± 19% softirqs.CPU51.SCHED
8249 ± 58% +148.7% 20515 ± 31% softirqs.CPU57.SCHED
6971 ±109% +268.6% 25698 ± 8% softirqs.CPU68.SCHED
25116 ± 15% -32.4% 16974 ± 12% softirqs.CPU78.SCHED
24757 ± 12% -36.8% 15657 ± 27% softirqs.CPU79.SCHED
20231 ± 14% -45.5% 11024 ± 24% softirqs.CPU81.SCHED
21830 ± 23% -55.4% 9733 ± 67% softirqs.CPU9.SCHED
24043 ± 16% -39.9% 14449 ± 23% softirqs.CPU94.SCHED
42.31 +68.3% 71.22 perf-stat.i.MPKI
9.958e+09 -54.7% 4.511e+09 perf-stat.i.branch-instructions
0.05 ± 2% +0.0 0.08 ± 4% perf-stat.i.branch-miss-rate%
3682118 ± 2% -8.2% 3381534 perf-stat.i.branch-misses
67.34 +10.4 77.74 perf-stat.i.cache-miss-rate%
1.709e+09 -12.2% 1.501e+09 perf-stat.i.cache-misses
2.531e+09 -24.0% 1.923e+09 perf-stat.i.cache-references
1639 -4.1% 1571 perf-stat.i.context-switches
2.25 +121.4% 4.98 perf-stat.i.cpi
99.03 -1.8% 97.24 perf-stat.i.cpu-migrations
85.60 +14.2% 97.78 perf-stat.i.cycles-between-cache-misses
0.00 ± 18% +0.0 0.00 ± 44% perf-stat.i.dTLB-load-miss-rate%
9.996e+09 -54.5% 4.549e+09 perf-stat.i.dTLB-loads
0.00 ± 7% +0.0 0.00 ± 6% perf-stat.i.dTLB-store-miss-rate%
9.904e+09 -54.9% 4.466e+09 perf-stat.i.dTLB-stores
44.79 +4.2 48.99 perf-stat.i.iTLB-load-miss-rate%
2535885 -13.8% 2185118 perf-stat.i.iTLB-load-misses
3134177 -27.3% 2278467 perf-stat.i.iTLB-loads
5.952e+10 -54.9% 2.687e+10 perf-stat.i.instructions
23480 -47.6% 12304 perf-stat.i.instructions-per-iTLB-miss
0.45 -54.6% 0.20 perf-stat.i.ipc
342.39 -51.0% 167.90 perf-stat.i.metric.M/sec
1.165e+08 ± 30% +72.8% 2.013e+08 ± 9% perf-stat.i.node-load-misses
1.257e+08 ± 26% +41.1% 1.773e+08 ± 9% perf-stat.i.node-loads
2.42e+08 +19.6% 2.895e+08 perf-stat.i.node-stores
42.53 +68.3% 71.58 perf-stat.overall.MPKI
0.04 ± 2% +0.0 0.07 perf-stat.overall.branch-miss-rate%
67.52 +10.5 78.07 perf-stat.overall.cache-miss-rate%
2.24 +122.4% 4.99 perf-stat.overall.cpi
78.17 +14.3% 89.34 perf-stat.overall.cycles-between-cache-misses
0.00 ± 25% +0.0 0.00 ± 12% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 13% +0.0 0.00 ± 10% perf-stat.overall.dTLB-store-miss-rate%
44.72 +4.2 48.96 perf-stat.overall.iTLB-load-miss-rate%
23499 -47.6% 12306 perf-stat.overall.instructions-per-iTLB-miss
0.45 -55.0% 0.20 perf-stat.overall.ipc
1587395 +1.5% 1611895 perf-stat.overall.path-length
9.912e+09 -54.7% 4.489e+09 perf-stat.ps.branch-instructions
3650903 ± 2% -8.4% 3345674 perf-stat.ps.branch-misses
1.701e+09 -12.2% 1.494e+09 perf-stat.ps.cache-misses
2.52e+09 -24.1% 1.914e+09 perf-stat.ps.cache-references
1616 -3.7% 1556 perf-stat.ps.context-switches
9.95e+09 -54.5% 4.526e+09 perf-stat.ps.dTLB-loads
9.859e+09 -54.9% 4.445e+09 perf-stat.ps.dTLB-stores
2521574 -13.8% 2172342 perf-stat.ps.iTLB-load-misses
3116655 -27.3% 2264894 perf-stat.ps.iTLB-loads
5.925e+10 -54.9% 2.673e+10 perf-stat.ps.instructions
1.159e+08 ± 30% +72.8% 2.003e+08 ± 9% perf-stat.ps.node-load-misses
1.25e+08 ± 26% +41.1% 1.764e+08 ± 9% perf-stat.ps.node-loads
2.407e+08 +19.6% 2.878e+08 perf-stat.ps.node-stores
1.193e+13 -54.9% 5.381e+12 perf-stat.total.instructions
0.00 +2.7e+105% 2689 ±171% interrupts.115:PCI-MSI.31981648-edge.i40e-eth0-TxRx-79
62.75 ± 27% +51.8% 95.25 ± 21% interrupts.CPU1.RES:Rescheduling_interrupts
6530 ± 17% -44.1% 3647 ± 35% interrupts.CPU17.NMI:Non-maskable_interrupts
6530 ± 17% -44.1% 3647 ± 35% interrupts.CPU17.PMI:Performance_monitoring_interrupts
62.00 ± 74% +187.9% 178.50 ± 5% interrupts.CPU20.RES:Rescheduling_interrupts
365.00 ± 78% -76.0% 87.50 ± 53% interrupts.CPU25.TLB:TLB_shootdowns
170.50 ± 15% -26.8% 124.75 ± 10% interrupts.CPU30.RES:Rescheduling_interrupts
7605 -43.3% 4316 ± 32% interrupts.CPU31.NMI:Non-maskable_interrupts
7605 -43.3% 4316 ± 32% interrupts.CPU31.PMI:Performance_monitoring_interrupts
169.00 ± 12% -37.1% 106.25 ± 23% interrupts.CPU31.RES:Rescheduling_interrupts
7145 ± 11% -33.0% 4786 ± 18% interrupts.CPU36.NMI:Non-maskable_interrupts
7145 ± 11% -33.0% 4786 ± 18% interrupts.CPU36.PMI:Performance_monitoring_interrupts
136.50 ± 27% -44.7% 75.50 ± 60% interrupts.CPU39.TLB:TLB_shootdowns
149.25 ± 24% -24.6% 112.50 ± 30% interrupts.CPU4.RES:Rescheduling_interrupts
7599 -46.6% 4061 ± 35% interrupts.CPU41.NMI:Non-maskable_interrupts
7599 -46.6% 4061 ± 35% interrupts.CPU41.PMI:Performance_monitoring_interrupts
6661 ± 24% -52.1% 3191 ± 51% interrupts.CPU44.NMI:Non-maskable_interrupts
6661 ± 24% -52.1% 3191 ± 51% interrupts.CPU44.PMI:Performance_monitoring_interrupts
7622 -43.5% 4307 ± 33% interrupts.CPU46.NMI:Non-maskable_interrupts
7622 -43.5% 4307 ± 33% interrupts.CPU46.PMI:Performance_monitoring_interrupts
7613 -43.1% 4331 ± 31% interrupts.CPU47.NMI:Non-maskable_interrupts
7613 -43.1% 4331 ± 31% interrupts.CPU47.PMI:Performance_monitoring_interrupts
5823 ± 32% -36.4% 3703 ± 34% interrupts.CPU5.NMI:Non-maskable_interrupts
5823 ± 32% -36.4% 3703 ± 34% interrupts.CPU5.PMI:Performance_monitoring_interrupts
89.25 ± 48% -61.1% 34.75 ± 31% interrupts.CPU53.TLB:TLB_shootdowns
5698 ± 33% -42.5% 3277 ± 49% interrupts.CPU55.NMI:Non-maskable_interrupts
5698 ± 33% -42.5% 3277 ± 49% interrupts.CPU55.PMI:Performance_monitoring_interrupts
172.00 ± 14% -35.2% 111.50 ± 41% interrupts.CPU56.RES:Rescheduling_interrupts
64.00 ± 42% -39.5% 38.75 ± 29% interrupts.CPU56.TLB:TLB_shootdowns
156.00 ± 17% -36.2% 99.50 ± 21% interrupts.CPU57.RES:Rescheduling_interrupts
146.25 ± 28% -48.9% 74.75 ± 67% interrupts.CPU58.RES:Rescheduling_interrupts
7627 -47.0% 4043 ± 31% interrupts.CPU62.NMI:Non-maskable_interrupts
7627 -47.0% 4043 ± 31% interrupts.CPU62.PMI:Performance_monitoring_interrupts
174.75 ± 12% -29.9% 122.50 ± 30% interrupts.CPU62.RES:Rescheduling_interrupts
76.00 ± 29% -48.4% 39.25 ± 29% interrupts.CPU62.TLB:TLB_shootdowns
7159 ± 11% -50.2% 3564 ± 32% interrupts.CPU63.NMI:Non-maskable_interrupts
7159 ± 11% -50.2% 3564 ± 32% interrupts.CPU63.PMI:Performance_monitoring_interrupts
7628 -62.9% 2831 interrupts.CPU66.NMI:Non-maskable_interrupts
7628 -62.9% 2831 interrupts.CPU66.PMI:Performance_monitoring_interrupts
174.50 ± 10% -36.4% 111.00 ± 50% interrupts.CPU66.RES:Rescheduling_interrupts
4370 ± 18% -34.7% 2853 interrupts.CPU69.NMI:Non-maskable_interrupts
4370 ± 18% -34.7% 2853 interrupts.CPU69.PMI:Performance_monitoring_interrupts
6885 ± 18% -45.8% 3731 ± 28% interrupts.CPU74.NMI:Non-maskable_interrupts
6885 ± 18% -45.8% 3731 ± 28% interrupts.CPU74.PMI:Performance_monitoring_interrupts
5900 ± 18% -57.5% 2510 ± 24% interrupts.CPU77.NMI:Non-maskable_interrupts
5900 ± 18% -57.5% 2510 ± 24% interrupts.CPU77.PMI:Performance_monitoring_interrupts
62.00 ± 41% +58.9% 98.50 ± 14% interrupts.CPU78.RES:Rescheduling_interrupts
0.00 +2.7e+105% 2689 ±171% interrupts.CPU79.115:PCI-MSI.31981648-edge.i40e-eth0-TxRx-79
49.75 ± 47% +119.6% 109.25 ± 28% interrupts.CPU79.RES:Rescheduling_interrupts
61.50 ± 54% +115.0% 132.25 ± 31% interrupts.CPU8.RES:Rescheduling_interrupts
5871 ± 19% -38.8% 3594 ± 32% interrupts.CPU80.NMI:Non-maskable_interrupts
5871 ± 19% -38.8% 3594 ± 32% interrupts.CPU80.PMI:Performance_monitoring_interrupts
60.50 ± 19% +120.2% 133.25 ± 14% interrupts.CPU81.RES:Rescheduling_interrupts
36.00 ± 79% +179.2% 100.50 ± 35% interrupts.CPU86.RES:Rescheduling_interrupts
6322 ± 21% -60.6% 2490 ± 25% interrupts.CPU88.NMI:Non-maskable_interrupts
6322 ± 21% -60.6% 2490 ± 25% interrupts.CPU88.PMI:Performance_monitoring_interrupts
32.50 ± 40% +150.0% 81.25 ± 41% interrupts.CPU92.RES:Rescheduling_interrupts
124.00 ± 11% -22.8% 95.75 ± 4% interrupts.IWI:IRQ_work_interrupts
538989 ± 8% -28.0% 387910 ± 2% interrupts.NMI:Non-maskable_interrupts
538989 ± 8% -28.0% 387910 ± 2% interrupts.PMI:Performance_monitoring_interrupts
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.8.0-rc5-00002-ga0ac629ebe7b3d" of type "text/plain" (158408 bytes)
View attachment "job-script" of type "text/plain" (8311 bytes)
View attachment "job.yaml" of type "text/plain" (5718 bytes)
View attachment "reproduce" of type "text/plain" (948 bytes)