Message-ID: <20200925071217.GO28663@shao2-debian>
Date: Fri, 25 Sep 2020 15:12:17 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Ritesh Harjani <riteshh@...ux.ibm.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, jack@...e.cz,
dan.j.williams@...el.com, anju@...ux.vnet.ibm.com,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Ritesh Harjani <riteshh@...ux.ibm.com>,
0day robot <lkp@...el.com>, lkp@...ts.01.org,
ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [ext4] 4e8fc10115: fio.write_iops 330.6% improvement
Greetings,
FYI, we noticed a 330.6% improvement in fio.write_iops due to commit:
commit: 4e8fc10115a6978060fe8a90f6a3a05463fa0660 ("[PATCHv3 1/1] ext4: Optimize file overwrites")
url: https://github.com/0day-ci/linux/commits/Ritesh-Harjani/Optimize-ext4-file-overwrites-perf-improvement/20200918-131139
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: ext4
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: write
bs: 4k
ioengine: sync
test_size: 200G
cpufreq_governor: performance
ucode: 0x5002f01
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
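For readers who want a rough idea of the workload without installing lkp-tests, the parameters listed earlier can be mapped onto a plain fio command line. The sketch below is only an approximation: the job name and the mount point /mnt/pmem0 are hypothetical, a single directory stands in for the two pmem devices, and the even split of test_size across jobs is an assumption about how the job is laid out, not something stated in this report. The attached job.yaml remains the authoritative description.

```python
# Parameters copied from the report; everything else is an assumption.
params = {
    "rw": "write",
    "bs": "4k",
    "ioengine": "sync",
    "runtime_s": 200,
    "test_size_gib": 200,
    "nr_task_pct": 50,       # nr_task: 50%
    "machine_threads": 96,   # 96 threads on the test machine
}


def fio_command(p, directory):
    """Build an approximate fio argument list from the report's parameters."""
    # nr_task of 50% on a 96-thread machine gives 48 concurrent jobs.
    numjobs = p["machine_threads"] * p["nr_task_pct"] // 100
    # Assumption: the total test size is divided evenly between the jobs.
    size_mib = p["test_size_gib"] * 1024 // numjobs
    return [
        "fio",
        "--name=overwrite-sketch",   # hypothetical job name
        f"--directory={directory}",  # hypothetical ext4 -o dax mount point
        f"--rw={p['rw']}",
        f"--bs={p['bs']}",
        f"--ioengine={p['ioengine']}",
        f"--numjobs={numjobs}",
        f"--size={size_mib}m",
        f"--runtime={p['runtime_s']}",
        "--time_based",
        "--group_reporting",
    ]


cmd = fio_command(params, "/mnt/pmem0")
print(" ".join(cmd))
```

With time_based set, fio keeps rewriting the same files until the 200 s runtime expires, so most writes land on already-allocated blocks. That is presumably why a commit that optimizes overwrites shows up so strongly in this test.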
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
4k/gcc-9/performance/2pmem/ext4/sync/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/write/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01
commit:
27bc446e2d ("ext4: limit the length of per-inode prealloc list")
4e8fc10115 ("ext4: Optimize file overwrites")
27bc446e2def38db (base)    4e8fc10115a6978060fe8a90f6a (patched)
----------------           ---------------------------
value ± %stddev   %change   value ± %stddev   metric
0.12 ±106% -0.1 0.01 fio.latency_100us%
51.38 ± 23% -48.5 2.85 ± 20% fio.latency_20us%
0.01 +16.6 16.64 ± 28% fio.latency_2us%
0.24 ±135% +54.7 54.89 ± 3% fio.latency_4us%
32.62 ± 18% -31.7 0.91 ± 15% fio.latency_50us%
14780 ± 3% -9.4% 13390 fio.time.involuntary_context_switches
9299 -7.0% 8647 fio.time.system_time
228.71 ± 4% +281.9% 873.42 ± 6% fio.time.user_time
23448 -6.5% 21915 fio.time.voluntary_context_switches
5.426e+08 ± 5% +330.6% 2.337e+09 ± 6% fio.workload
10597 ± 5% +330.6% 45638 ± 6% fio.write_bw_MBps
26944 ± 8% -76.8% 6240 ± 9% fio.write_clat_90%_us
30368 ± 8% -72.0% 8512 ± 11% fio.write_clat_95%_us
38016 ± 9% -49.0% 19392 ± 4% fio.write_clat_99%_us
17448 ± 5% -77.9% 3855 ± 7% fio.write_clat_mean_us
11052 ± 32% -68.3% 3502 ± 10% fio.write_clat_stddev
2713004 ± 5% +330.6% 11683335 ± 6% fio.write_iops
13639680 ± 7% +26.6% 17267712 ± 5% meminfo.DirectMap2M
2704 ± 97% +131.9% 6269 ± 26% numa-meminfo.node0.PageTables
676.50 ± 96% +131.1% 1563 ± 26% numa-vmstat.node0.nr_page_table_pages
48.36 -6.8% 45.09 iostat.cpu.system
1.21 ± 4% +271.5% 4.51 ± 6% iostat.cpu.user
0.74 ± 2% +0.1 0.81 ± 5% mpstat.cpu.all.irq%
1.22 ± 4% +3.3 4.55 ± 6% mpstat.cpu.all.usr%
541348 +1.4% 548949 proc-vmstat.nr_file_pages
245833 +2.9% 252840 proc-vmstat.nr_unevictable
245833 +2.9% 252840 proc-vmstat.nr_zone_unevictable
695285 ± 20% -12.6% 607417 ± 17% proc-vmstat.pgfree
601976 ± 2% +22.0% 734594 ± 2% sched_debug.cpu.avg_idle.avg
1001923 +9.0% 1092207 ± 5% sched_debug.cpu.avg_idle.max
372963 -25.8% 276657 ± 6% sched_debug.cpu.avg_idle.stddev
22130 ± 17% +36.2% 30133 ± 14% sched_debug.cpu.nr_switches.max
3374 ± 18% +28.5% 4336 ± 10% sched_debug.cpu.nr_switches.stddev
-47.00 -45.7% -25.50 sched_debug.cpu.nr_uninterruptible.min
2816 ± 21% +36.5% 3844 ± 13% sched_debug.cpu.sched_count.stddev
26.69 ± 13% -44.0% 14.94 ± 17% sched_debug.cpu.sched_goidle.min
1424 ± 21% +36.2% 1941 ± 13% sched_debug.cpu.sched_goidle.stddev
1411 ± 18% +31.9% 1861 ± 12% sched_debug.cpu.ttwu_count.stddev
15.42 ± 3% -82.8% 2.66 ± 8% perf-stat.i.MPKI
3.417e+09 ± 4% +239.7% 1.161e+10 ± 6% perf-stat.i.branch-instructions
0.72 -0.1 0.64 perf-stat.i.branch-miss-rate%
24883051 ± 3% +181.5% 70036819 ± 4% perf-stat.i.branch-misses
97563341 ± 12% -58.3% 40638724 ± 14% perf-stat.i.cache-misses
2.96e+08 ± 2% -48.4% 1.529e+08 ± 11% perf-stat.i.cache-references
7.06 ± 4% -70.7% 2.06 ± 5% perf-stat.i.cpi
1461 ± 14% +170.2% 3948 ± 19% perf-stat.i.cycles-between-cache-misses
6.17e+09 ± 4% +243.3% 2.119e+10 ± 6% perf-stat.i.dTLB-loads
0.00 ± 11% -0.0 0.00 ± 3% perf-stat.i.dTLB-store-miss-rate%
3.978e+09 ± 4% +257.1% 1.421e+10 ± 6% perf-stat.i.dTLB-stores
83.61 +7.2 90.82 perf-stat.i.iTLB-load-miss-rate%
25688726 ± 3% +126.2% 58108368 ± 5% perf-stat.i.iTLB-load-misses
4852201 +17.7% 5709608 ± 2% perf-stat.i.iTLB-loads
1.962e+10 ± 4% +243.4% 6.738e+10 ± 6% perf-stat.i.instructions
774.43 ± 2% +50.4% 1165 perf-stat.i.instructions-per-iTLB-miss
0.15 ± 4% +235.9% 0.51 ± 6% perf-stat.i.ipc
0.25 ± 2% +51.6% 0.37 ± 3% perf-stat.i.metric.K/sec
144.73 ± 4% +239.5% 491.37 ± 6% perf-stat.i.metric.M/sec
89.29 +2.6 91.93 perf-stat.i.node-load-miss-rate%
12691022 ± 8% -56.3% 5550053 ± 12% perf-stat.i.node-load-misses
1504953 ± 13% -64.4% 535348 ± 15% perf-stat.i.node-loads
9964107 ± 8% -58.8% 4108905 ± 17% perf-stat.i.node-store-misses
15.10 ± 3% -84.9% 2.28 ± 11% perf-stat.overall.MPKI
0.73 -0.1 0.60 perf-stat.overall.branch-miss-rate%
6.86 ± 4% -71.0% 1.99 ± 6% perf-stat.overall.cpi
1401 ± 13% +139.9% 3361 ± 14% perf-stat.overall.cycles-between-cache-misses
0.00 ± 30% -0.0 0.00 ± 45% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 22% -0.0 0.00 ± 4% perf-stat.overall.dTLB-store-miss-rate%
84.11 +6.9 91.02 perf-stat.overall.iTLB-load-miss-rate%
763.81 ± 2% +51.8% 1159 perf-stat.overall.instructions-per-iTLB-miss
0.15 ± 4% +245.0% 0.50 ± 6% perf-stat.overall.ipc
89.44 +1.8 91.23 perf-stat.overall.node-load-miss-rate%
7276 -20.3% 5801 perf-stat.overall.path-length
3.401e+09 ± 4% +239.6% 1.155e+10 ± 6% perf-stat.ps.branch-instructions
24776511 ± 3% +181.3% 69696643 ± 4% perf-stat.ps.branch-misses
97040508 ± 12% -58.3% 40436979 ± 14% perf-stat.ps.cache-misses
2.945e+08 ± 2% -48.3% 1.522e+08 ± 11% perf-stat.ps.cache-references
6.141e+09 ± 4% +243.2% 2.108e+10 ± 6% perf-stat.ps.dTLB-loads
3.959e+09 ± 4% +257.0% 1.414e+10 ± 6% perf-stat.ps.dTLB-stores
25562318 ± 3% +126.2% 57814503 ± 5% perf-stat.ps.iTLB-load-misses
4826722 +17.7% 5679789 ± 2% perf-stat.ps.iTLB-loads
1.953e+10 ± 4% +243.3% 6.704e+10 ± 6% perf-stat.ps.instructions
12624818 ± 8% -56.3% 5522769 ± 12% perf-stat.ps.node-load-misses
1497174 ± 13% -64.4% 532776 ± 15% perf-stat.ps.node-loads
9912289 ± 8% -58.8% 4087930 ± 17% perf-stat.ps.node-store-misses
3.947e+12 ± 4% +243.4% 1.355e+13 ± 6% perf-stat.total.instructions
290.75 ± 51% -78.1% 63.75 ±128% interrupts.CPU17.RES:Rescheduling_interrupts
6339 ± 25% -35.3% 4101 ± 52% interrupts.CPU19.NMI:Non-maskable_interrupts
6339 ± 25% -35.3% 4101 ± 52% interrupts.CPU19.PMI:Performance_monitoring_interrupts
166.00 ± 46% -91.6% 14.00 ± 72% interrupts.CPU2.RES:Rescheduling_interrupts
429.75 ± 2% +14.0% 490.00 ± 12% interrupts.CPU20.CAL:Function_call_interrupts
6339 ± 25% -35.3% 4100 ± 52% interrupts.CPU20.NMI:Non-maskable_interrupts
6339 ± 25% -35.3% 4100 ± 52% interrupts.CPU20.PMI:Performance_monitoring_interrupts
6338 ± 25% -31.1% 4364 ± 46% interrupts.CPU21.NMI:Non-maskable_interrupts
6338 ± 25% -31.1% 4364 ± 46% interrupts.CPU21.PMI:Performance_monitoring_interrupts
6339 ± 25% -50.8% 3121 ± 14% interrupts.CPU23.NMI:Non-maskable_interrupts
6339 ± 25% -50.8% 3121 ± 14% interrupts.CPU23.PMI:Performance_monitoring_interrupts
68.50 ± 54% +202.2% 207.00 interrupts.CPU24.RES:Rescheduling_interrupts
3328 ± 45% +76.5% 5876 ± 33% interrupts.CPU25.NMI:Non-maskable_interrupts
3328 ± 45% +76.5% 5876 ± 33% interrupts.CPU25.PMI:Performance_monitoring_interrupts
39.75 ± 79% +423.9% 208.25 ± 2% interrupts.CPU25.RES:Rescheduling_interrupts
1766 ±112% -75.2% 438.25 ± 4% interrupts.CPU27.CAL:Function_call_interrupts
82.75 ± 49% -64.0% 29.75 ±122% interrupts.CPU27.TLB:TLB_shootdowns
439.50 ± 2% +74.2% 765.50 ± 38% interrupts.CPU3.CAL:Function_call_interrupts
494.25 ± 5% -10.5% 442.25 ± 5% interrupts.CPU30.CAL:Function_call_interrupts
61.00 ±127% +230.7% 201.75 interrupts.CPU30.RES:Rescheduling_interrupts
56.50 ±140% +255.3% 200.75 interrupts.CPU31.RES:Rescheduling_interrupts
1633 ±123% -73.3% 435.50 ± 3% interrupts.CPU32.CAL:Function_call_interrupts
56.75 ±141% +252.4% 200.00 interrupts.CPU33.RES:Rescheduling_interrupts
56.75 ±139% +227.3% 185.75 ± 12% interrupts.CPU34.RES:Rescheduling_interrupts
56.50 ±142% +185.8% 161.50 ± 39% interrupts.CPU35.RES:Rescheduling_interrupts
79.75 ± 36% -56.4% 34.75 ± 91% interrupts.CPU36.TLB:TLB_shootdowns
65.25 ±117% +176.6% 180.50 ± 30% interrupts.CPU39.RES:Rescheduling_interrupts
78.50 ± 44% -54.1% 36.00 ± 83% interrupts.CPU39.TLB:TLB_shootdowns
62.25 ±120% +151.8% 156.75 ± 45% interrupts.CPU43.RES:Rescheduling_interrupts
86.00 ± 45% -54.4% 39.25 ± 97% interrupts.CPU43.TLB:TLB_shootdowns
487.50 ± 10% -10.8% 434.75 ± 3% interrupts.CPU44.CAL:Function_call_interrupts
93.00 ± 46% -64.5% 33.00 ±119% interrupts.CPU46.TLB:TLB_shootdowns
7330 ± 12% -41.4% 4293 ± 33% interrupts.CPU5.NMI:Non-maskable_interrupts
7330 ± 12% -41.4% 4293 ± 33% interrupts.CPU5.PMI:Performance_monitoring_interrupts
169.25 ± 36% -90.8% 15.50 ± 71% interrupts.CPU5.RES:Rescheduling_interrupts
3285 ± 45% +92.3% 6318 ± 25% interrupts.CPU57.NMI:Non-maskable_interrupts
3285 ± 45% +92.3% 6318 ± 25% interrupts.CPU57.PMI:Performance_monitoring_interrupts
7323 ± 12% -51.2% 3572 ± 34% interrupts.CPU6.NMI:Non-maskable_interrupts
7323 ± 12% -51.2% 3572 ± 34% interrupts.CPU6.PMI:Performance_monitoring_interrupts
32.50 ± 78% +580.0% 221.00 ±125% interrupts.CPU63.TLB:TLB_shootdowns
7323 ± 12% -41.5% 4286 ± 33% interrupts.CPU7.NMI:Non-maskable_interrupts
7323 ± 12% -41.5% 4286 ± 33% interrupts.CPU7.PMI:Performance_monitoring_interrupts
175.50 ± 27% -80.3% 34.50 ± 37% interrupts.CPU72.RES:Rescheduling_interrupts
93.25 ± 45% -57.1% 40.00 ±115% interrupts.CPU72.TLB:TLB_shootdowns
7868 -45.2% 4311 ± 32% interrupts.CPU73.NMI:Non-maskable_interrupts
7868 -45.2% 4311 ± 32% interrupts.CPU73.PMI:Performance_monitoring_interrupts
7330 ± 12% -41.4% 4297 ± 33% interrupts.CPU75.NMI:Non-maskable_interrupts
7330 ± 12% -41.4% 4297 ± 33% interrupts.CPU75.PMI:Performance_monitoring_interrupts
163.50 ± 41% -84.9% 24.75 ±127% interrupts.CPU77.RES:Rescheduling_interrupts
7324 ± 12% -41.4% 4294 ± 33% interrupts.CPU78.NMI:Non-maskable_interrupts
7324 ± 12% -41.4% 4294 ± 33% interrupts.CPU78.PMI:Performance_monitoring_interrupts
161.25 ± 45% -91.5% 13.75 ±109% interrupts.CPU80.RES:Rescheduling_interrupts
7325 ± 12% -41.5% 4287 ± 33% interrupts.CPU81.NMI:Non-maskable_interrupts
7325 ± 12% -41.5% 4287 ± 33% interrupts.CPU81.PMI:Performance_monitoring_interrupts
95.00 ± 50% -59.7% 38.25 ±117% interrupts.CPU92.TLB:TLB_shootdowns
8991 ±108% +161.3% 23491 ± 19% softirqs.CPU2.SCHED
67870 ± 5% +8.4% 73546 ± 2% softirqs.CPU2.TIMER
23244 ± 25% -88.7% 2626 softirqs.CPU24.SCHED
83405 ± 17% -23.4% 63886 ± 2% softirqs.CPU24.TIMER
23963 ± 12% -88.4% 2784 ± 2% softirqs.CPU25.SCHED
83623 ± 19% -23.5% 63968 ± 2% softirqs.CPU25.TIMER
4276 ± 5% +97.6% 8448 ± 13% softirqs.CPU26.RCU
14129 ± 74% -81.4% 2631 ± 4% softirqs.CPU26.SCHED
17203 ± 53% -70.0% 5163 ± 89% softirqs.CPU27.SCHED
70966 ± 5% -10.4% 63583 ± 5% softirqs.CPU27.TIMER
19121 ± 47% -74.6% 4863 ± 88% softirqs.CPU28.SCHED
72354 ± 6% -10.4% 64858 ± 2% softirqs.CPU29.TIMER
9275 ±101% +151.3% 23309 ± 19% softirqs.CPU3.SCHED
19928 ± 46% -84.7% 3042 ± 7% softirqs.CPU30.SCHED
72106 ± 7% -11.8% 63632 ± 2% softirqs.CPU30.TIMER
19845 ± 45% -84.7% 3030 ± 6% softirqs.CPU31.SCHED
72345 ± 6% -10.8% 64523 softirqs.CPU31.TIMER
19559 ± 47% -84.2% 3094 ± 8% softirqs.CPU32.SCHED
19689 ± 47% -83.0% 3352 ± 2% softirqs.CPU33.SCHED
71873 ± 7% -9.4% 65131 softirqs.CPU33.TIMER
16286 ± 48% -63.6% 5928 ± 76% softirqs.CPU34.SCHED
11784 ± 76% +118.7% 25776 softirqs.CPU4.SCHED
70606 ± 5% -9.8% 63713 softirqs.CPU48.TIMER
71122 ± 4% -10.2% 63890 ± 5% softirqs.CPU49.TIMER
8863 ±108% +190.0% 25702 softirqs.CPU5.SCHED
20026 ± 49% -87.1% 2587 ± 5% softirqs.CPU50.SCHED
70832 ± 4% -10.7% 63286 softirqs.CPU50.TIMER
18874 ± 50% -86.1% 2631 ± 4% softirqs.CPU51.SCHED
71694 ± 5% -13.7% 61847 ± 3% softirqs.CPU51.TIMER
17403 ± 56% -85.3% 2560 softirqs.CPU52.SCHED
71831 ± 8% -11.0% 63942 ± 3% softirqs.CPU52.TIMER
20860 ± 49% -87.1% 2689 ± 2% softirqs.CPU53.SCHED
81014 ± 19% -23.0% 62345 ± 2% softirqs.CPU53.TIMER
20180 ± 50% -87.7% 2480 ± 9% softirqs.CPU54.SCHED
71917 ± 5% -12.3% 63071 softirqs.CPU54.TIMER
74057 ± 12% -16.4% 61946 ± 2% softirqs.CPU55.TIMER
20135 ± 50% -86.8% 2667 ± 4% softirqs.CPU56.SCHED
73377 ± 7% -13.4% 63523 ± 3% softirqs.CPU56.TIMER
23019 ± 19% -64.3% 8226 ±118% softirqs.CPU57.SCHED
75540 ± 5% -14.6% 64485 ± 4% softirqs.CPU57.TIMER
20267 ± 49% -59.4% 8236 ±118% softirqs.CPU58.SCHED
72755 ± 7% -11.1% 64699 ± 3% softirqs.CPU58.TIMER
72871 ± 7% -10.9% 64896 ± 4% softirqs.CPU59.TIMER
8781 ±108% +192.7% 25703 softirqs.CPU6.SCHED
72683 ± 7% -10.9% 64778 ± 4% softirqs.CPU60.TIMER
72665 ± 8% -11.1% 64612 ± 4% softirqs.CPU61.TIMER
72308 ± 5% -10.1% 64991 ± 6% softirqs.CPU65.TIMER
20301 ± 49% -58.5% 8419 ±118% softirqs.CPU66.SCHED
11380 ± 79% +123.7% 25453 softirqs.CPU7.SCHED
4027 ± 5% +111.8% 8530 ± 32% softirqs.CPU71.RCU
5823 ± 96% +357.6% 26649 softirqs.CPU72.SCHED
2461 ± 12% +952.7% 25914 softirqs.CPU73.SCHED
8475 ±117% +176.7% 23452 ± 20% softirqs.CPU75.SCHED
8462 ±116% +178.9% 23601 ± 19% softirqs.CPU76.SCHED
8459 ±117% +211.7% 26366 ± 2% softirqs.CPU77.SCHED
8511 ±117% +205.5% 26002 ± 2% softirqs.CPU79.SCHED
8854 ±105% +186.2% 25341 ± 2% softirqs.CPU8.SCHED
8450 ±116% +215.1% 26629 ± 2% softirqs.CPU80.SCHED
8496 ±117% +206.5% 26038 softirqs.CPU81.SCHED
4144 ± 6% +83.5% 7603 ± 21% softirqs.CPU82.RCU
8429 ±117% +179.7% 23575 ± 18% softirqs.CPU82.SCHED
8393 ±117% +138.6% 20028 ± 30% softirqs.CPU84.SCHED
8422 ±116% +140.8% 20281 ± 28% softirqs.CPU92.SCHED
4021 ± 7% +93.4% 7778 ± 29% softirqs.CPU95.RCU
415214 +63.4% 678631 ± 6% softirqs.RCU
38.06 ± 7% -38.1 0.00 perf-profile.calltrace.cycles-pp.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
36.28 ± 7% -36.3 0.00 perf-profile.calltrace.cycles-pp.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw
36.07 ± 7% -36.1 0.00 perf-profile.calltrace.cycles-pp.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply
63.15 ± 7% -31.9 31.29 ± 12% perf-profile.calltrace.cycles-pp.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.new_sync_write
11.15 ± 9% -11.1 0.00 perf-profile.calltrace.cycles-pp.__ext4_journal_stop.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
10.95 ± 9% -11.0 0.00 perf-profile.calltrace.cycles-pp.jbd2_journal_stop.__ext4_journal_stop.ext4_iomap_begin.iomap_apply.dax_iomap_rw
8.81 ± 7% -8.8 0.00 perf-profile.calltrace.cycles-pp.stop_this_handle.jbd2_journal_stop.__ext4_journal_stop.ext4_iomap_begin.iomap_apply
8.49 ± 6% -8.5 0.00 perf-profile.calltrace.cycles-pp.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin
5.93 ± 6% -5.9 0.00 perf-profile.calltrace.cycles-pp._raw_read_lock.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin
0.99 ± 9% +0.4 1.44 ± 19% perf-profile.calltrace.cycles-pp.ext4_write_checks.ext4_file_write_iter.new_sync_write.vfs_write.ksys_write
0.00 +1.0 0.96 ± 17% perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
0.00 +1.1 1.10 ± 20% perf-profile.calltrace.cycles-pp.__check_block_validity.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
0.00 +2.2 2.19 ± 17% perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
1.94 ± 16% +6.6 8.49 ± 13% perf-profile.calltrace.cycles-pp.__copy_user_nocache.__copy_user_flushcache._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply
1.95 ± 16% +6.6 8.54 ± 13% perf-profile.calltrace.cycles-pp.__copy_user_flushcache._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply.dax_iomap_rw
1.99 ± 16% +6.7 8.70 ± 13% perf-profile.calltrace.cycles-pp._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter
7.86 ± 11% +12.8 20.70 ± 13% perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin.iomap_apply
1.73 ± 15% +13.7 15.42 ± 27% perf-profile.calltrace.cycles-pp.__srcu_read_unlock.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter
12.86 ± 7% +14.8 27.69 ± 13% perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin.iomap_apply.dax_iomap_rw
13.14 ± 7% +15.7 28.81 ± 13% perf-profile.calltrace.cycles-pp.ext4_set_iomap.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
3.87 ± 14% +20.9 24.76 ± 20% perf-profile.calltrace.cycles-pp.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter.new_sync_write
38.74 ± 7% -38.1 0.65 ± 8% perf-profile.children.cycles-pp.__ext4_journal_start_sb
36.93 ± 7% -36.3 0.61 ± 7% perf-profile.children.cycles-pp.jbd2__journal_start
36.73 ± 7% -36.1 0.60 ± 7% perf-profile.children.cycles-pp.start_this_handle
63.15 ± 7% -31.9 31.30 ± 12% perf-profile.children.cycles-pp.ext4_iomap_begin
11.21 ± 9% -11.2 0.01 ±173% perf-profile.children.cycles-pp.__ext4_journal_stop
11.01 ± 9% -11.0 0.01 ±173% perf-profile.children.cycles-pp.jbd2_journal_stop
8.83 ± 7% -8.8 0.00 perf-profile.children.cycles-pp.stop_this_handle
8.64 ± 7% -8.5 0.14 ± 8% perf-profile.children.cycles-pp.add_transaction_credits
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.timestamp_truncate
0.00 +0.1 0.06 ± 15% perf-profile.children.cycles-pp.pmem_dax_direct_access
0.00 +0.1 0.06 ± 14% perf-profile.children.cycles-pp.fsnotify_parent
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.file_modified
0.00 +0.1 0.07 ± 12% perf-profile.children.cycles-pp.aa_file_perm
0.00 +0.1 0.07 ± 12% perf-profile.children.cycles-pp.apparmor_file_permission
0.00 +0.1 0.07 ± 15% perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
0.00 +0.1 0.08 ± 10% perf-profile.children.cycles-pp.__pmem_direct_access
0.00 +0.1 0.09 ± 9% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.00 +0.1 0.09 ± 7% perf-profile.children.cycles-pp.__might_sleep
0.00 +0.1 0.09 ± 13% perf-profile.children.cycles-pp._cond_resched
0.00 +0.1 0.10 ± 12% perf-profile.children.cycles-pp.___might_sleep
0.00 +0.1 0.12 ± 12% perf-profile.children.cycles-pp.fsnotify
0.04 ± 57% +0.1 0.18 ± 7% perf-profile.children.cycles-pp.__fdget_pos
0.00 +0.1 0.14 ± 7% perf-profile.children.cycles-pp.__fget_light
0.00 +0.2 0.15 ± 10% perf-profile.children.cycles-pp.up_write
0.01 ±173% +0.2 0.17 ± 6% perf-profile.children.cycles-pp.current_time
0.00 +0.2 0.16 ± 11% perf-profile.children.cycles-pp.dax_direct_access
0.06 ± 7% +0.2 0.23 ± 11% perf-profile.children.cycles-pp.__sb_start_write
0.00 +0.2 0.18 ± 72% perf-profile.children.cycles-pp.generic_write_checks
0.04 ± 57% +0.2 0.22 ± 8% perf-profile.children.cycles-pp.__srcu_read_lock
0.06 ± 7% +0.2 0.26 ± 11% perf-profile.children.cycles-pp.entry_SYSCALL_64
0.06 +0.2 0.26 ± 14% perf-profile.children.cycles-pp.common_file_perm
0.05 ± 9% +0.2 0.28 ± 11% perf-profile.children.cycles-pp.down_write
0.00 +0.2 0.23 ± 60% perf-profile.children.cycles-pp.ext4_generic_write_checks
0.09 ± 5% +0.3 0.34 ± 13% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.09 ± 5% +0.3 0.37 ± 14% perf-profile.children.cycles-pp.security_file_permission
0.10 ± 8% +0.4 0.54 ± 25% perf-profile.children.cycles-pp.ext4_inode_block_valid
0.99 ± 9% +0.4 1.44 ± 19% perf-profile.children.cycles-pp.ext4_write_checks
0.04 ± 57% +0.5 0.51 ± 31% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.12 ±173% +0.5 0.65 ± 42% perf-profile.children.cycles-pp.start_kernel
0.17 ± 11% +0.8 0.96 ± 17% perf-profile.children.cycles-pp.ext4_es_lookup_extent
0.19 ± 14% +0.9 1.11 ± 20% perf-profile.children.cycles-pp.__check_block_validity
0.39 ± 12% +1.8 2.20 ± 17% perf-profile.children.cycles-pp.ext4_map_blocks
1.94 ± 16% +6.6 8.50 ± 13% perf-profile.children.cycles-pp.__copy_user_nocache
1.95 ± 16% +6.6 8.54 ± 13% perf-profile.children.cycles-pp.__copy_user_flushcache
1.99 ± 16% +6.7 8.70 ± 13% perf-profile.children.cycles-pp._copy_from_iter_flushcache
13.96 ± 9% +7.1 21.04 ± 13% perf-profile.children.cycles-pp._raw_read_lock
1.73 ± 15% +13.7 15.43 ± 27% perf-profile.children.cycles-pp.__srcu_read_unlock
12.87 ± 7% +14.8 27.70 ± 13% perf-profile.children.cycles-pp.jbd2_transaction_committed
13.15 ± 7% +15.7 28.82 ± 13% perf-profile.children.cycles-pp.ext4_set_iomap
3.88 ± 14% +20.9 24.78 ± 20% perf-profile.children.cycles-pp.dax_iomap_actor
21.95 ± 7% -21.6 0.35 ± 8% perf-profile.self.cycles-pp.start_this_handle
8.79 ± 7% -8.8 0.00 perf-profile.self.cycles-pp.stop_this_handle
8.60 ± 7% -8.5 0.14 ± 8% perf-profile.self.cycles-pp.add_transaction_credits
0.00 +0.1 0.05 ± 8% perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
0.00 +0.1 0.06 ± 9% perf-profile.self.cycles-pp.current_time
0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.aa_file_perm
0.00 +0.1 0.06 ± 20% perf-profile.self.cycles-pp.apparmor_file_permission
0.00 +0.1 0.07 ± 20% perf-profile.self.cycles-pp.generic_write_checks
0.00 +0.1 0.07 ± 15% perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
0.00 +0.1 0.08 ± 6% perf-profile.self.cycles-pp.__might_sleep
0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp.__pmem_direct_access
0.00 +0.1 0.08 ± 13% perf-profile.self.cycles-pp.__sb_start_write
0.00 +0.1 0.09 ± 13% perf-profile.self.cycles-pp.ksys_write
0.00 +0.1 0.10 ± 12% perf-profile.self.cycles-pp.___might_sleep
0.00 +0.1 0.11 ± 16% perf-profile.self.cycles-pp.dax_iomap_rw
0.00 +0.1 0.11 ± 11% perf-profile.self.cycles-pp.fsnotify
0.00 +0.1 0.12 ± 67% perf-profile.self.cycles-pp.file_update_time
0.00 +0.1 0.13 ± 8% perf-profile.self.cycles-pp.__fget_light
0.00 +0.1 0.13 ± 9% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +0.1 0.14 ± 15% perf-profile.self.cycles-pp.ext4_map_blocks
0.00 +0.2 0.15 ± 12% perf-profile.self.cycles-pp._copy_from_iter_flushcache
0.04 ± 57% +0.2 0.19 ± 15% perf-profile.self.cycles-pp.common_file_perm
0.00 +0.2 0.15 ± 10% perf-profile.self.cycles-pp.up_write
0.00 +0.2 0.17 ± 10% perf-profile.self.cycles-pp.down_write
0.04 ± 57% +0.2 0.21 ± 10% perf-profile.self.cycles-pp.dax_iomap_actor
0.01 ±173% +0.2 0.20 ± 11% perf-profile.self.cycles-pp.vfs_write
0.00 +0.2 0.18 ± 15% perf-profile.self.cycles-pp.do_syscall_64
0.08 ± 5% +0.2 0.28 ± 8% perf-profile.self.cycles-pp.ext4_iomap_begin
0.06 ± 15% +0.2 0.25 ± 11% perf-profile.self.cycles-pp.ext4_es_lookup_extent
0.06 ± 7% +0.2 0.26 ± 11% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.01 ±173% +0.2 0.22 ± 10% perf-profile.self.cycles-pp.__srcu_read_lock
0.09 ± 5% +0.3 0.34 ± 13% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.00 +0.3 0.31 ± 80% perf-profile.self.cycles-pp.new_sync_write
0.11 ± 7% +0.3 0.45 ± 9% perf-profile.self.cycles-pp.iomap_apply
0.04 ± 57% +0.4 0.47 ± 32% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.10 ± 8% +0.4 0.53 ± 25% perf-profile.self.cycles-pp.ext4_inode_block_valid
0.25 ± 12% +0.5 0.70 ± 25% perf-profile.self.cycles-pp.ext4_file_write_iter
0.09 ± 27% +0.5 0.56 ± 21% perf-profile.self.cycles-pp.__check_block_validity
0.27 ± 18% +0.8 1.11 ± 28% perf-profile.self.cycles-pp.ext4_set_iomap
4.99 ± 6% +2.0 6.95 ± 14% perf-profile.self.cycles-pp.jbd2_transaction_committed
1.93 ± 16% +6.5 8.46 ± 13% perf-profile.self.cycles-pp.__copy_user_nocache
13.90 ± 9% +7.0 20.92 ± 13% perf-profile.self.cycles-pp._raw_read_lock
1.73 ± 15% +13.6 15.35 ± 27% perf-profile.self.cycles-pp.__srcu_read_unlock
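The table can be sanity-checked with a few lines of arithmetic. The sketch below recomputes the %change column for some headline rows and cross-checks two derived quantities. All inputs are the rounded means copied from the table, so agreement is only expected to within rounding. The reading of perf-stat.overall.path-length as total instructions divided by fio.workload is an assumption that happens to match the numbers, not something this report states.

```python
def pct_change(base, new):
    """Relative change in percent, as shown in the %change column."""
    return (new - base) / base * 100.0


# (base mean, patched mean) copied from the table.
rows = {
    "fio.write_iops": (2713004, 11683335),
    "fio.write_clat_mean_us": (17448, 3855),
    "fio.time.system_time": (9299, 8647),
}
for name, (base, new) in rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

# Metrics that are already percentages are reported as a plain difference
# in percentage points rather than as a relative change.
lat_20us_base, lat_20us_new = 51.38, 2.85
print(f"fio.latency_20us%: {lat_20us_new - lat_20us_base:+.1f} points")


def bw_mib_per_s(iops, block_bytes=4096):
    """Bandwidth implied by an IOPS figure at the 4k block size of this test."""
    return iops * block_bytes / 2**20


# Compare with the fio.write_bw_MBps row (10597 and 45638).
print(bw_mib_per_s(2713004), bw_mib_per_s(11683335))


def path_length(total_instructions, workload):
    """Assumed definition: instructions retired per unit of fio.workload."""
    return total_instructions / workload


# Compare with the perf-stat.overall.path-length row (7276 and 5801).
print(path_length(3.947e12, 5.426e8), path_length(1.355e13, 2.337e9))
```

If the path-length interpretation is right, each 4k write costs roughly 20% fewer instructions after the commit, while throughput rises more than fourfold. That suggests most of the gain comes from reduced contention in the journal start/stop path (visible in the perf-profile rows) rather than from a shorter code path alone.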
[ASCII plots of per-run samples, one panel each: fio.write_bw_MBps, fio.write_iops, fio.write_clat_mean_us, fio.write_clat_90__us, fio.write_clat_95__us, fio.latency_4us_, fio.latency_50us_, fio.workload, fio.time.user_time, fio.time.system_time, fio.time.voluntary_context_switches]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.8.0-rc4-00047-g4e8fc10115a69" of type "text/plain" (169438 bytes)
View attachment "job-script" of type "text/plain" (8467 bytes)
View attachment "job.yaml" of type "text/plain" (5817 bytes)
View attachment "reproduce" of type "text/plain" (923 bytes)