Message-ID: <20200206120315.GP12867@shao2-debian>
Date: Thu, 6 Feb 2020 20:03:15 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Yang Guo <guoyang2@...wei.com>
Cc: Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Eric Biggers <ebiggers@...nel.org>,
Shaokun Zhang <zhangshaokun@...ilicon.com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
lkp@...ts.01.org
Subject: [ext4] 520f897a35: fio.read_bw_MBps 33.3% improvement
Greetings,
FYI, we noticed a 33.3% improvement in fio.read_bw_MBps due to commit:
commit: 520f897a3554b0665af1ae5d5ba286f290cecf5c ("ext4: use percpu_counters for extent_status cache hits/misses")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: ext4
mount_option: dax
runtime: 200s
nr_task: 50%
time_based: tb
rw: randread
bs: 4k
ioengine: libaio
test_size: 200G
cpufreq_governor: performance
ucode: 0x500002c
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
4k/gcc-7/performance/2pmem/ext4/libaio/x86_64-rhel-7.6/dax/50%/debian-x86_64-20191114.cgz/200s/randread/lkp-csl-2sp6/200G/fio-basic/tb/0x500002c
commit:
7727ae5297 ("ext4: fix potential use after free after remounting with noblock_validity")
520f897a35 ("ext4: use percpu_counters for extent_status cache hits/misses")
7727ae52975d4f4e  520f897a3554b0665af1ae5d5ba
----------------  ---------------------------
fail:runs (7727ae5297)  %reproduction  fail:runs (520f897a35)  dmesg event
:4 50% 2:4 dmesg.WARNING:at_ip__fsnotify_parent/0x
:4 25% 1:4 dmesg.WARNING:at_ip__x64_sys_io_submit/0x
:4 25% 1:4 dmesg.WARNING:at_ip_aio_read/0x
1:4 -25% :4 dmesg.WARNING:at_ip_io_submit_one/0x
value ± %stddev (7727ae5297)  %change  value ± %stddev (520f897a35)  metric
0.99 ± 62% -1.0 0.01 fio.latency_500us%
33853 +33.3% 45121 fio.read_bw_MBps
225792 ± 3% -32.9% 151552 ± 2% fio.read_clat_90%_us
232960 ± 3% -31.9% 158720 ± 2% fio.read_clat_95%_us
247296 ± 3% -28.8% 176128 ± 2% fio.read_clat_99%_us
172016 -24.9% 129123 fio.read_clat_mean_us
45196 ± 11% -47.6% 23675 ± 19% fio.read_clat_stddev
8666452 +33.3% 11551177 fio.read_iops
4884 -28.5% 3491 fio.read_slat_mean_us
8683 -3.7% 8359 fio.time.system_time
925.36 ± 3% +35.1% 1249 fio.time.user_time
1.733e+09 +33.3% 2.31e+09 fio.workload
10831 ± 7% +16.6% 12633 ± 10% cpuidle.POLL.usage
4.82 ± 3% +1.7 6.50 mpstat.cpu.all.usr%
1452 +2.4% 1487 vmstat.system.cs
44.67 -3.6% 43.07 iostat.cpu.system
4.77 ± 3% +35.1% 6.45 iostat.cpu.user
289.30 +2.0% 295.11 turbostat.PkgWatt
138.34 +1.7% 140.76 turbostat.RAMWatt
43270 ± 13% -17.3% 35789 ± 11% numa-vmstat.node0.nr_active_anon
24664 ± 5% -11.0% 21939 ± 8% numa-vmstat.node0.nr_slab_unreclaimable
43270 ± 13% -17.3% 35789 ± 11% numa-vmstat.node0.nr_zone_active_anon
177415 ± 13% -16.9% 147370 ± 10% numa-meminfo.node0.Active
173112 ± 13% -17.3% 143140 ± 11% numa-meminfo.node0.Active(anon)
98646 ± 5% -11.0% 87748 ± 8% numa-meminfo.node0.SUnreclaim
141129 ± 6% -13.4% 122215 ± 8% numa-meminfo.node0.Slab
323.59 ±109% -100.0% 0.00 ± 3% sched_debug.cfs_rq:/.MIN_vruntime.stddev
323.59 ±109% -100.0% 0.00 ± 3% sched_debug.cfs_rq:/.max_vruntime.stddev
7.70 ± 9% -14.9% 6.55 ± 13% sched_debug.cpu.nr_uninterruptible.stddev
24248 ± 16% -33.3% 16171 ± 23% sched_debug.cpu.sched_count.max
3313 ± 8% -16.9% 2753 ± 10% sched_debug.cpu.sched_count.stddev
58.50 ±133% -91.0% 5.25 ± 65% interrupts.CPU2.RES:Rescheduling_interrupts
2416 ± 7% -32.0% 1644 ± 3% interrupts.CPU20.CAL:Function_call_interrupts
2410 ± 7% -31.3% 1656 ± 4% interrupts.CPU21.CAL:Function_call_interrupts
2266 ± 6% +12.1% 2540 ± 3% interrupts.CPU36.CAL:Function_call_interrupts
2258 ± 6% +12.7% 2546 ± 3% interrupts.CPU37.CAL:Function_call_interrupts
22.75 ±123% +184.6% 64.75 ± 62% interrupts.CPU37.TLB:TLB_shootdowns
34.75 ±111% +190.6% 101.00 ± 51% interrupts.CPU38.TLB:TLB_shootdowns
2256 ± 6% +12.0% 2528 ± 3% interrupts.CPU39.CAL:Function_call_interrupts
5801 ± 38% -50.2% 2888 interrupts.CPU47.NMI:Non-maskable_interrupts
5801 ± 38% -50.2% 2888 interrupts.CPU47.PMI:Performance_monitoring_interrupts
93.75 ±149% -96.8% 3.00 ± 97% interrupts.CPU5.RES:Rescheduling_interrupts
44.75 ± 77% +153.1% 113.25 ± 39% interrupts.CPU52.TLB:TLB_shootdowns
32.25 ± 74% +186.8% 92.50 ± 24% interrupts.CPU54.TLB:TLB_shootdowns
37.00 ± 78% +146.6% 91.25 ± 35% interrupts.CPU60.TLB:TLB_shootdowns
55.50 ± 58% +87.4% 104.00 ± 21% interrupts.CPU65.TLB:TLB_shootdowns
36.25 ± 52% +157.9% 93.50 ± 35% interrupts.CPU71.TLB:TLB_shootdowns
12.91 ± 8% -22.6% 10.00 ± 5% perf-stat.i.MPKI
1.549e+10 +32.9% 2.058e+10 perf-stat.i.branch-instructions
63057844 +30.7% 82417950 ± 6% perf-stat.i.branch-misses
69.19 ± 4% +10.3 79.50 ± 3% perf-stat.i.cache-miss-rate%
7.224e+08 ± 3% +18.4% 8.554e+08 ± 2% perf-stat.i.cache-misses
1395 +2.6% 1431 perf-stat.i.context-switches
1.65 -25.0% 1.24 perf-stat.i.cpi
192.37 ± 4% -15.7% 162.11 ± 2% perf-stat.i.cycles-between-cache-misses
2e+10 ± 8% +29.7% 2.594e+10 ± 11% perf-stat.i.dTLB-loads
13727 ± 2% +5.4% 14471 ± 3% perf-stat.i.dTLB-store-misses
1.34e+10 ± 6% +34.0% 1.796e+10 ± 6% perf-stat.i.dTLB-stores
60015893 ± 6% +7.4% 64478755 ± 3% perf-stat.i.iTLB-load-misses
5975439 +5.2% 6284441 perf-stat.i.iTLB-loads
8.135e+10 +32.9% 1.081e+11 perf-stat.i.instructions
1366 ± 8% +23.0% 1680 ± 4% perf-stat.i.instructions-per-iTLB-miss
0.61 +33.4% 0.81 perf-stat.i.ipc
6781479 ± 5% -21.3% 5337894 ± 5% perf-stat.i.node-store-misses
65017 ± 8% +28.7% 83703 ± 3% perf-stat.i.node-stores
12.86 ± 9% -22.8% 9.94 ± 5% perf-stat.overall.MPKI
69.37 ± 4% +10.4 79.79 ± 3% perf-stat.overall.cache-miss-rate%
1.65 -25.2% 1.23 perf-stat.overall.cpi
185.84 ± 3% -16.1% 155.96 ± 2% perf-stat.overall.cycles-between-cache-misses
0.00 ± 7% -0.0 0.00 ± 12% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 8% -0.0 0.00 ± 7% perf-stat.overall.dTLB-store-miss-rate%
1362 ± 8% +23.2% 1678 ± 4% perf-stat.overall.instructions-per-iTLB-miss
0.61 +33.6% 0.81 perf-stat.overall.ipc
1.541e+10 +32.9% 2.047e+10 perf-stat.ps.branch-instructions
62736602 +30.7% 81998518 ± 6% perf-stat.ps.branch-misses
7.187e+08 ± 3% +18.4% 8.51e+08 ± 2% perf-stat.ps.cache-misses
1388 +2.6% 1424 perf-stat.ps.context-switches
1.99e+10 ± 8% +29.7% 2.581e+10 ± 11% perf-stat.ps.dTLB-loads
13711 ± 2% +5.5% 14459 ± 3% perf-stat.ps.dTLB-store-misses
1.333e+10 ± 6% +34.0% 1.786e+10 ± 6% perf-stat.ps.dTLB-stores
59709963 ± 6% +7.4% 64151654 ± 3% perf-stat.ps.iTLB-load-misses
5945182 +5.2% 6252891 perf-stat.ps.iTLB-loads
8.093e+10 +32.9% 1.075e+11 perf-stat.ps.instructions
6746979 ± 5% -21.3% 5310789 ± 5% perf-stat.ps.node-store-misses
64702 ± 8% +28.7% 83271 ± 3% perf-stat.ps.node-stores
1.632e+13 +32.9% 2.168e+13 perf-stat.total.instructions
7271 ± 11% +32.1% 9603 ± 5% softirqs.CPU0.RCU
7136 ± 13% +40.0% 9989 ± 2% softirqs.CPU10.RCU
7326 ± 12% +36.3% 9985 ± 4% softirqs.CPU11.RCU
7019 ± 5% +42.9% 10032 softirqs.CPU13.RCU
7277 ± 11% +36.9% 9961 ± 6% softirqs.CPU15.RCU
7498 ± 5% +28.2% 9610 ± 10% softirqs.CPU16.RCU
7664 ± 10% +35.1% 10356 softirqs.CPU17.RCU
7573 ± 13% +33.7% 10126 ± 3% softirqs.CPU18.RCU
8033 ± 8% +28.0% 10281 ± 3% softirqs.CPU19.RCU
7395 ± 15% +73.1% 12804 ± 20% softirqs.CPU2.RCU
7656 ± 9% +36.0% 10414 ± 5% softirqs.CPU20.RCU
7355 ± 12% +37.4% 10104 ± 2% softirqs.CPU21.RCU
7585 ± 10% +35.0% 10244 ± 3% softirqs.CPU22.RCU
7947 ± 11% +28.2% 10186 ± 2% softirqs.CPU23.RCU
8479 ± 10% +44.1% 12221 ± 5% softirqs.CPU24.RCU
7699 ± 10% +40.3% 10801 ± 10% softirqs.CPU25.RCU
7463 ± 3% +38.7% 10352 ± 6% softirqs.CPU26.RCU
7431 ± 4% +47.9% 10990 ± 12% softirqs.CPU27.RCU
7292 ± 6% +52.4% 11115 ± 12% softirqs.CPU28.RCU
7359 ± 5% +42.5% 10488 ± 9% softirqs.CPU29.RCU
6844 ± 15% +54.1% 10546 ± 9% softirqs.CPU30.RCU
7111 ± 15% +41.8% 10086 ± 8% softirqs.CPU31.RCU
7538 ± 5% +40.6% 10602 ± 9% softirqs.CPU32.RCU
7631 ± 6% +35.4% 10331 ± 7% softirqs.CPU33.RCU
7627 ± 11% +36.7% 10427 ± 7% softirqs.CPU34.RCU
7783 ± 9% +33.6% 10401 ± 7% softirqs.CPU35.RCU
7574 ± 7% +38.4% 10484 ± 6% softirqs.CPU36.RCU
7626 ± 5% +34.4% 10247 ± 5% softirqs.CPU37.RCU
7668 ± 7% +36.9% 10496 ± 7% softirqs.CPU38.RCU
7752 ± 6% +33.3% 10334 ± 5% softirqs.CPU39.RCU
7868 ± 7% +28.1% 10081 ± 4% softirqs.CPU4.RCU
7780 ± 5% +36.0% 10577 ± 4% softirqs.CPU40.RCU
7489 ± 5% +32.4% 9915 ± 14% softirqs.CPU41.RCU
7669 ± 6% +27.3% 9760 ± 6% softirqs.CPU42.RCU
7552 ± 6% +38.5% 10463 ± 7% softirqs.CPU44.RCU
7337 ± 6% +35.9% 9975 ± 11% softirqs.CPU45.RCU
7663 ± 4% +38.1% 10582 ± 9% softirqs.CPU46.RCU
7202 ± 10% +33.5% 9613 ± 5% softirqs.CPU47.RCU
8444 ± 8% +38.6% 11705 ± 5% softirqs.CPU48.RCU
8371 ± 2% +40.5% 11761 ± 13% softirqs.CPU49.RCU
7047 ± 28% +45.8% 10274 ± 5% softirqs.CPU5.RCU
7671 ± 14% +43.7% 11026 ± 10% softirqs.CPU50.RCU
7325 ± 11% +52.2% 11146 ± 9% softirqs.CPU51.RCU
8199 ± 5% +39.6% 11445 ± 6% softirqs.CPU52.RCU
7986 ± 3% +44.9% 11576 ± 11% softirqs.CPU53.RCU
8141 ± 3% +41.1% 11484 ± 14% softirqs.CPU55.RCU
8114 ± 4% +34.7% 10928 ± 14% softirqs.CPU56.RCU
7931 ± 5% +41.8% 11243 ± 12% softirqs.CPU57.RCU
8007 ± 7% +40.4% 11241 ± 13% softirqs.CPU58.RCU
7744 ± 2% +43.2% 11089 ± 11% softirqs.CPU59.RCU
7492 ± 11% +30.5% 9776 ± 2% softirqs.CPU6.RCU
8171 ± 3% +38.5% 11315 ± 8% softirqs.CPU60.RCU
8151 ± 15% +35.6% 11052 ± 10% softirqs.CPU61.RCU
7663 ± 6% +51.8% 11636 ± 9% softirqs.CPU62.RCU
7905 ± 7% +43.2% 11317 ± 11% softirqs.CPU63.RCU
7910 ± 4% +39.1% 11005 ± 12% softirqs.CPU64.RCU
7795 ± 5% +38.2% 10770 ± 11% softirqs.CPU65.RCU
7812 ± 3% +35.8% 10611 ± 14% softirqs.CPU66.RCU
7761 ± 4% +43.5% 11137 ± 14% softirqs.CPU67.RCU
7705 ± 4% +40.0% 10785 ± 12% softirqs.CPU68.RCU
8023 ± 12% +36.0% 10909 ± 12% softirqs.CPU69.RCU
7252 ± 4% +33.6% 9692 ± 3% softirqs.CPU7.RCU
7836 ± 4% +37.6% 10784 ± 11% softirqs.CPU70.RCU
7909 ± 3% +37.9% 10907 ± 11% softirqs.CPU71.RCU
6832 ± 8% +40.7% 9613 ± 4% softirqs.CPU73.RCU
6585 ± 10% +47.1% 9685 ± 6% softirqs.CPU74.RCU
6827 ± 12% +39.7% 9536 ± 4% softirqs.CPU75.RCU
6667 ± 11% +42.2% 9479 ± 5% softirqs.CPU76.RCU
6955 ± 9% +36.6% 9498 ± 5% softirqs.CPU77.RCU
6539 ± 10% +46.6% 9586 ± 5% softirqs.CPU78.RCU
6645 ± 9% +45.8% 9688 ± 7% softirqs.CPU79.RCU
7613 ± 14% +31.0% 9975 softirqs.CPU8.RCU
6864 ± 5% +45.9% 10016 ± 5% softirqs.CPU80.RCU
6931 ± 6% +39.3% 9656 ± 6% softirqs.CPU81.RCU
7108 ± 4% +46.3% 10401 ± 8% softirqs.CPU82.RCU
7222 ± 8% +42.2% 10268 ± 5% softirqs.CPU83.RCU
7171 ± 9% +41.0% 10110 ± 6% softirqs.CPU84.RCU
7079 ± 7% +39.9% 9902 ± 7% softirqs.CPU85.RCU
7175 ± 9% +37.3% 9855 ± 7% softirqs.CPU86.RCU
7190 ± 8% +37.4% 9877 ± 9% softirqs.CPU87.RCU
6952 ± 7% +40.7% 9782 ± 6% softirqs.CPU88.RCU
7068 ± 4% +40.0% 9896 ± 8% softirqs.CPU89.RCU
7415 ± 11% +33.6% 9906 ± 3% softirqs.CPU9.RCU
7031 ± 6% +46.0% 10268 ± 10% softirqs.CPU90.RCU
7030 ± 7% +43.1% 10060 ± 6% softirqs.CPU91.RCU
6980 ± 5% +41.1% 9850 ± 10% softirqs.CPU92.RCU
6857 ± 7% +43.6% 9847 ± 7% softirqs.CPU93.RCU
7185 ± 9% +40.2% 10072 ± 9% softirqs.CPU94.RCU
7011 ± 3% +51.7% 10638 ± 5% softirqs.CPU95.RCU
726904 ± 4% +37.8% 1001841 ± 3% softirqs.RCU
13.99 ± 6% -11.4 2.55 ± 25% perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_read_iter
12.21 ± 6% -10.5 1.66 ± 7% perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
0.51 ± 2% +0.4 0.95 ± 10% perf-profile.calltrace.cycles-pp.security_file_permission.aio_read.io_submit_one.__x64_sys_io_submit.do_syscall_64
0.26 ±100% +0.5 0.78 ± 3% perf-profile.calltrace.cycles-pp.lookup_ioctx.__x64_sys_io_submit.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +0.5 0.52 ± 3% perf-profile.calltrace.cycles-pp._copy_to_user.aio_read_events.read_events.do_io_getevents.__x64_sys_io_getevents
0.27 ±100% +0.5 0.79 ± 2% perf-profile.calltrace.cycles-pp.lookup_ioctx.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +0.6 0.57 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
0.00 +0.6 0.62 ± 6% perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.aio_read.io_submit_one.__x64_sys_io_submit
0.00 +0.7 0.68 ± 3% perf-profile.calltrace.cycles-pp._copy_from_user.io_submit_one.__x64_sys_io_submit.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.35 ± 8% +0.7 2.05 ± 4% perf-profile.calltrace.cycles-pp.aio_read_events.read_events.do_io_getevents.__x64_sys_io_getevents.do_syscall_64
1.49 ± 8% +0.8 2.27 ± 4% perf-profile.calltrace.cycles-pp.read_events.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.05 ± 8% +1.1 3.16 ± 3% perf-profile.calltrace.cycles-pp.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.18 ± 8% +1.2 3.36 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.54 ± 13% +3.9 12.41 ± 13% perf-profile.calltrace.cycles-pp.__memcpy_mcsafe.copyout_mcsafe._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply
8.69 ± 13% +4.0 12.64 ± 13% perf-profile.calltrace.cycles-pp.copyout_mcsafe._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply.dax_iomap_rw
8.97 ± 12% +4.1 13.05 ± 12% perf-profile.calltrace.cycles-pp._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_read_iter
9.54 ± 12% +4.4 13.91 ± 11% perf-profile.calltrace.cycles-pp.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_read_iter.aio_read
10.86 ± 5% +4.7 15.54 ± 5% perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_iomap_begin.iomap_apply.dax_iomap_rw
28.92 ± 4% +10.4 39.30 ± 5% perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_read_iter
13.99 ± 6% -11.4 2.56 ± 25% perf-profile.children.cycles-pp.ext4_map_blocks
12.22 ± 6% -10.5 1.67 ± 7% perf-profile.children.cycles-pp.ext4_es_lookup_extent
0.05 ± 9% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.__pmem_direct_access
0.04 ± 58% +0.0 0.07 ± 5% perf-profile.children.cycles-pp.fpregs_assert_state_consistent
0.10 ± 5% +0.0 0.13 ± 10% perf-profile.children.cycles-pp.scheduler_tick
0.07 +0.0 0.11 ± 4% perf-profile.children.cycles-pp.kmem_cache_free
0.07 ± 5% +0.0 0.11 ± 12% perf-profile.children.cycles-pp.task_tick_fair
0.05 ± 9% +0.0 0.09 ± 4% perf-profile.children.cycles-pp.import_single_range
0.07 ± 5% +0.0 0.11 ± 3% perf-profile.children.cycles-pp.rcu_all_qs
0.03 ±100% +0.0 0.07 ± 10% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.01 ±173% +0.0 0.06 perf-profile.children.cycles-pp.pmem_dax_direct_access
0.08 ± 11% +0.0 0.12 ± 6% perf-profile.children.cycles-pp.aio_setup_rw
0.14 ± 11% +0.1 0.20 ± 13% perf-profile.children.cycles-pp.update_process_times
0.15 ± 14% +0.1 0.20 ± 12% perf-profile.children.cycles-pp.tick_sched_handle
0.10 ± 4% +0.1 0.15 ± 2% perf-profile.children.cycles-pp.__srcu_read_unlock
0.00 +0.1 0.06 ± 9% perf-profile.children.cycles-pp.rw_verify_area
0.11 ± 11% +0.1 0.17 ± 2% perf-profile.children.cycles-pp.dax_direct_access
0.12 ± 17% +0.1 0.18 ± 8% perf-profile.children.cycles-pp.ext4_data_block_valid_rcu
0.12 ± 7% +0.1 0.17 ± 4% perf-profile.children.cycles-pp.down_read_trylock
0.11 ± 8% +0.1 0.17 ± 8% perf-profile.children.cycles-pp.__fsnotify_parent
0.13 ± 14% +0.1 0.20 ± 10% perf-profile.children.cycles-pp.__virt_addr_valid
0.15 ± 7% +0.1 0.22 ± 3% perf-profile.children.cycles-pp.aio_complete_rw
0.13 ± 6% +0.1 0.19 ± 5% perf-profile.children.cycles-pp.up_read
0.11 ± 12% +0.1 0.17 ± 4% perf-profile.children.cycles-pp.refill_reqs_available
0.13 ± 12% +0.1 0.20 ± 5% perf-profile.children.cycles-pp.mutex_unlock
0.13 ± 7% +0.1 0.20 ± 4% perf-profile.children.cycles-pp.fput_many
0.21 ± 17% +0.1 0.28 ± 9% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.13 ± 13% +0.1 0.21 ± 10% perf-profile.children.cycles-pp.__inode_security_revalidate
0.15 ± 10% +0.1 0.22 ± 4% perf-profile.children.cycles-pp._cond_resched
0.19 ± 6% +0.1 0.27 ± 2% perf-profile.children.cycles-pp.__get_reqs_available
0.18 ± 7% +0.1 0.27 perf-profile.children.cycles-pp.__lock_text_start
0.18 ± 4% +0.1 0.27 ± 6% perf-profile.children.cycles-pp.__get_user_8
0.17 ± 7% +0.1 0.26 ± 4% perf-profile.children.cycles-pp.copy_user_generic_unrolled
0.14 ± 10% +0.1 0.23 ± 3% perf-profile.children.cycles-pp.__srcu_read_lock
0.23 ± 8% +0.1 0.32 ± 2% perf-profile.children.cycles-pp.mutex_lock
0.18 ± 9% +0.1 0.28 ± 7% perf-profile.children.cycles-pp.__put_user_4
0.19 ± 10% +0.1 0.29 ± 4% perf-profile.children.cycles-pp.put_reqs_available
0.23 ± 9% +0.1 0.35 ± 5% perf-profile.children.cycles-pp.__check_object_size
0.24 ± 6% +0.1 0.38 ± 3% perf-profile.children.cycles-pp.__fget
0.24 ± 9% +0.1 0.38 ± 2% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.36 ± 8% +0.2 0.53 ± 4% perf-profile.children.cycles-pp._copy_to_user
0.34 ± 7% +0.2 0.51 ± 3% perf-profile.children.cycles-pp.__might_sleep
0.31 ± 9% +0.2 0.49 ± 8% perf-profile.children.cycles-pp.kmem_cache_alloc
0.37 ± 6% +0.2 0.56 ± 3% perf-profile.children.cycles-pp.__get_user_4
0.39 ± 7% +0.2 0.58 ± 3% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.38 ± 8% +0.2 0.59 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64
0.45 ± 8% +0.2 0.68 ± 3% perf-profile.children.cycles-pp._copy_from_user
0.49 ± 10% +0.3 0.75 ± 4% perf-profile.children.cycles-pp.___might_sleep
0.52 ± 9% +0.3 0.79 ± 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.35 ± 3% +0.3 0.63 ± 6% perf-profile.children.cycles-pp.selinux_file_permission
0.67 ± 8% +0.3 1.00 ± 3% perf-profile.children.cycles-pp.__might_fault
0.00 +0.3 0.33 ± 30% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.53 ± 2% +0.5 0.98 ± 11% perf-profile.children.cycles-pp.security_file_permission
1.01 ± 7% +0.6 1.59 perf-profile.children.cycles-pp.lookup_ioctx
1.36 ± 8% +0.7 2.08 ± 4% perf-profile.children.cycles-pp.aio_read_events
1.50 ± 8% +0.8 2.29 ± 4% perf-profile.children.cycles-pp.read_events
2.06 ± 8% +1.1 3.18 ± 3% perf-profile.children.cycles-pp.do_io_getevents
2.18 ± 8% +1.2 3.37 ± 2% perf-profile.children.cycles-pp.__x64_sys_io_getevents
8.55 ± 13% +3.9 12.44 ± 13% perf-profile.children.cycles-pp.__memcpy_mcsafe
8.75 ± 13% +4.0 12.74 ± 13% perf-profile.children.cycles-pp.copyout_mcsafe
8.99 ± 12% +4.1 13.07 ± 12% perf-profile.children.cycles-pp._copy_to_iter_mcsafe
9.56 ± 12% +4.4 13.94 ± 11% perf-profile.children.cycles-pp.dax_iomap_actor
11.03 ± 5% +4.8 15.78 ± 5% perf-profile.children.cycles-pp._raw_read_lock
28.94 ± 4% +10.4 39.33 ± 5% perf-profile.children.cycles-pp.jbd2_transaction_committed
12.01 ± 6% -10.9 1.11 ± 5% perf-profile.self.cycles-pp.ext4_es_lookup_extent
0.05 ± 8% +0.0 0.07 ± 5% perf-profile.self.cycles-pp.__pmem_direct_access
0.06 ± 6% +0.0 0.10 ± 5% perf-profile.self.cycles-pp.kmem_cache_free
0.06 ± 13% +0.0 0.10 ± 4% perf-profile.self.cycles-pp.do_io_getevents
0.07 ± 11% +0.0 0.11 ± 6% perf-profile.self.cycles-pp._cond_resched
0.07 ± 12% +0.0 0.11 ± 8% perf-profile.self.cycles-pp.read_events
0.04 ± 57% +0.0 0.08 ± 6% perf-profile.self.cycles-pp.rcu_all_qs
0.03 ±100% +0.0 0.07 ± 7% perf-profile.self.cycles-pp._copy_from_user
0.03 ±100% +0.0 0.07 ± 6% perf-profile.self.cycles-pp.__inode_security_revalidate
0.08 ± 5% +0.0 0.13 ± 6% perf-profile.self.cycles-pp.dax_iomap_rw
0.09 ± 12% +0.0 0.13 ± 14% perf-profile.self.cycles-pp.__check_object_size
0.01 ±173% +0.0 0.06 perf-profile.self.cycles-pp.touch_atime
0.10 ± 5% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.__srcu_read_unlock
0.15 ± 10% +0.0 0.20 ± 4% perf-profile.self.cycles-pp.mutex_lock
0.12 ± 7% +0.1 0.17 ± 7% perf-profile.self.cycles-pp.down_read_trylock
0.00 +0.1 0.05 ± 8% perf-profile.self.cycles-pp.import_single_range
0.00 +0.1 0.06 ± 9% perf-profile.self.cycles-pp.rw_verify_area
0.12 ± 15% +0.1 0.18 ± 7% perf-profile.self.cycles-pp.ext4_data_block_valid_rcu
0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.fpregs_assert_state_consistent
0.12 ± 4% +0.1 0.18 ± 6% perf-profile.self.cycles-pp.up_read
0.15 ± 6% +0.1 0.21 ± 5% perf-profile.self.cycles-pp._copy_to_iter_mcsafe
0.09 ± 4% +0.1 0.15 ± 3% perf-profile.self.cycles-pp.ext4_map_blocks
0.09 ± 4% +0.1 0.16 ± 7% perf-profile.self.cycles-pp.__fsnotify_parent
0.12 ± 8% +0.1 0.19 ± 3% perf-profile.self.cycles-pp.mutex_unlock
0.12 ± 17% +0.1 0.19 ± 9% perf-profile.self.cycles-pp.__virt_addr_valid
0.10 ± 14% +0.1 0.17 ± 5% perf-profile.self.cycles-pp.refill_reqs_available
0.15 ± 5% +0.1 0.21 ± 5% perf-profile.self.cycles-pp.aio_complete_rw
0.12 ± 8% +0.1 0.19 ± 6% perf-profile.self.cycles-pp.fput_many
0.11 ± 9% +0.1 0.18 ± 4% perf-profile.self.cycles-pp.__x64_sys_io_getevents
0.16 ± 7% +0.1 0.23 ± 6% perf-profile.self.cycles-pp.__might_fault
0.18 ± 7% +0.1 0.26 ± 2% perf-profile.self.cycles-pp.__get_reqs_available
0.16 ± 7% +0.1 0.24 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.15 ± 14% +0.1 0.23 ± 5% perf-profile.self.cycles-pp.aio_read
0.17 ± 6% +0.1 0.25 ± 4% perf-profile.self.cycles-pp.dax_iomap_actor
0.17 ± 7% +0.1 0.25 perf-profile.self.cycles-pp.__lock_text_start
0.16 ± 7% +0.1 0.25 ± 6% perf-profile.self.cycles-pp.copy_user_generic_unrolled
0.17 ± 4% +0.1 0.26 ± 6% perf-profile.self.cycles-pp.__get_user_8
0.16 ± 7% +0.1 0.26 ± 7% perf-profile.self.cycles-pp.do_syscall_64
0.13 ± 14% +0.1 0.22 ± 3% perf-profile.self.cycles-pp.__srcu_read_lock
0.17 ± 11% +0.1 0.27 ± 7% perf-profile.self.cycles-pp.__put_user_4
0.19 ± 10% +0.1 0.29 ± 4% perf-profile.self.cycles-pp.put_reqs_available
0.09 ± 7% +0.1 0.20 ± 31% perf-profile.self.cycles-pp.security_file_permission
0.18 ± 6% +0.1 0.29 ± 3% perf-profile.self.cycles-pp.iomap_apply
0.22 ± 10% +0.1 0.34 ± 5% perf-profile.self.cycles-pp.copyout_mcsafe
0.18 ± 9% +0.1 0.30 ± 10% perf-profile.self.cycles-pp.kmem_cache_alloc
0.20 ± 6% +0.1 0.32 ± 9% perf-profile.self.cycles-pp.__x64_sys_io_submit
0.24 ± 6% +0.1 0.38 ± 2% perf-profile.self.cycles-pp.__fget
0.24 ± 7% +0.1 0.38 ± 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.31 ± 6% +0.1 0.46 ± 2% perf-profile.self.cycles-pp.__might_sleep
0.36 ± 6% +0.2 0.55 ± 4% perf-profile.self.cycles-pp.__get_user_4
0.39 ± 7% +0.2 0.58 ± 2% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.21 ± 13% +0.2 0.40 ± 7% perf-profile.self.cycles-pp.selinux_file_permission
0.38 ± 8% +0.2 0.59 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64
0.35 ± 9% +0.2 0.56 ± 9% perf-profile.self.cycles-pp.aio_read_events
0.49 ± 10% +0.3 0.74 ± 4% perf-profile.self.cycles-pp.___might_sleep
0.52 ± 9% +0.3 0.79 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.43 ± 7% +0.3 0.71 ± 2% perf-profile.self.cycles-pp.lookup_ioctx
0.00 +0.3 0.30 ± 33% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.63 ± 8% +0.3 0.97 ± 6% perf-profile.self.cycles-pp.io_submit_one
8.48 ± 13% +3.8 12.32 ± 13% perf-profile.self.cycles-pp.__memcpy_mcsafe
10.96 ± 5% +4.7 15.68 ± 5% perf-profile.self.cycles-pp._raw_read_lock
17.97 ± 4% +5.7 23.65 ± 6% perf-profile.self.cycles-pp.jbd2_transaction_committed
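The two rows that stand out in the profile above are ext4_es_lookup_extent (self cycles 12.01% down to 1.11%) and the newly visible percpu_counter_add_batch (0.30%). That pattern is consistent with the commit title: the cache hit/miss statistics now go through per-CPU counters instead of one shared value. The following is only a toy Python model of the batching idea, not the kernel code. The class name, the batch size of 32, and the 48 simulated CPUs (50% of 96 threads) are assumptions made for illustration.

```python
class BatchedCounter:
    """Toy model of a batched per-CPU counter (illustrative, not kernel code)."""

    def __init__(self, nr_cpus: int, batch: int = 32) -> None:
        self.count = 0                # shared total, written only on a fold
        self.local = [0] * nr_cpus    # one private delta per simulated CPU
        self.batch = batch            # assumed fold threshold
        self.shared_updates = 0       # number of writes to the shared total

    def add(self, cpu: int, amount: int = 1) -> None:
        delta = self.local[cpu] + amount
        if abs(delta) >= self.batch:
            # Fold the accumulated delta into the shared total.
            self.count += delta
            self.local[cpu] = 0
            self.shared_updates += 1
        else:
            # Common case: only this CPU's private slot is touched.
            self.local[cpu] = delta

    def read_approx(self) -> int:
        """Cheap read: ignores deltas that have not been folded yet."""
        return self.count

    def read_exact(self) -> int:
        """Exact read: shared total plus every per-CPU delta."""
        return self.count + sum(self.local)


def simulate(nr_cpus: int = 48, hits_per_cpu: int = 1000) -> BatchedCounter:
    counter = BatchedCounter(nr_cpus)
    for cpu in range(nr_cpus):
        for _ in range(hits_per_cpu):
            counter.add(cpu)
    return counter


if __name__ == "__main__":
    c = simulate()
    print("shared writes:", c.shared_updates)
    print("approximate total:", c.read_approx())
    print("exact total:", c.read_exact())
```

In this model, 48000 increments cause 1488 writes to the shared total, so the approximate read lags the exact one until the remaining deltas are folded. A single shared counter would be written on every increment.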
[ASCII plots of per-sample values for: fio.read_bw_MBps, fio.read_iops, fio.read_clat_mean_us, fio.read_slat_mean_us, fio.workload, fio.time.user_time, fio.time.system_time]
[*] bisect-good sample
[O] bisect-bad sample
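The headline figures in the comparison table can be cross-checked with a few lines of arithmetic. The sketch below recomputes %change from the two values in each row, derives bandwidth from IOPS at the 4k block size, and checks that ipc is the reciprocal of cpi. It assumes that fio.read_bw_MBps is reported in 1024-based units (MiB/s); the report does not say so, but the numbers agree under that assumption. The helper names are made up for this example.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def bandwidth_mib_per_s(iops: float, block_size_bytes: int) -> float:
    """Bandwidth implied by an IOPS figure at a fixed block size (assumes MiB/s)."""
    return iops * block_size_bytes / 2**20


# (base, patched) pairs copied from the comparison table
metrics = {
    "fio.read_bw_MBps": (33853, 45121),
    "fio.read_iops": (8666452, 11551177),
    "fio.read_clat_mean_us": (172016, 129123),
    "fio.read_slat_mean_us": (4884, 3491),
}

if __name__ == "__main__":
    for name, (base, new) in metrics.items():
        print(f"{name}: {pct_change(base, new):+.1f}%")

    # bs: 4k, so each I/O moves 4096 bytes
    for iops in metrics["fio.read_iops"]:
        print(f"{iops} IOPS -> {bandwidth_mib_per_s(iops, 4096):.1f} MiB/s")

    # perf-stat.overall.cpi values from the table; ipc should be 1/cpi
    for cpi in (1.65, 1.23):
        print(f"cpi {cpi} -> ipc {1 / cpi:.2f}")
```

The recomputed changes round to the +33.3%, +33.3%, -24.9% and -28.5% shown in the table, and the bandwidth implied by the IOPS figures truncates to the reported 33853 and 45121.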
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.3.0-rc4-00016-g520f897a3554b" of type "text/plain" (199557 bytes)
View attachment "job-script" of type "text/plain" (8290 bytes)
View attachment "job.yaml" of type "text/plain" (5820 bytes)
View attachment "reproduce" of type "text/plain" (912 bytes)