Date:   Thu, 6 Feb 2020 20:03:15 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Yang Guo <guoyang2@...wei.com>
Cc:     Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Eric Biggers <ebiggers@...nel.org>,
        Shaokun Zhang <zhangshaokun@...ilicon.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        lkp@...ts.01.org
Subject: [ext4] 520f897a35: fio.read_bw_MBps 33.3% improvement

Greetings,

FYI, we noticed a 33.3% improvement in fio.read_bw_MBps due to the following commit:


commit: 520f897a3554b0665af1ae5d5ba286f290cecf5c ("ext4: use percpu_counters for extent_status cache hits/misses")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: ext4
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: randread
	bs: 4k
	ioengine: libaio
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x500002c

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
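
For readers unfamiliar with the technique named in the commit title, here is a minimal toy model of a per-CPU counter with batched folding. It is a sketch for illustration only, not the kernel implementation: the class name BatchedCounter, the simulated CPU indices, and the batch size of 32 are all assumptions made for this example. The idea is that each CPU accumulates hits or misses in a private slot and touches the shared total only when its local delta reaches the batch size, so most increments avoid the shared location. That is one plausible reading of why ext4_es_lookup_extent drops from about 12% to about 1% of self cycles in the profile of this report while percpu_counter_add_batch appears at about 0.3%.

```python
import threading


class BatchedCounter:
    """Toy model of a per-CPU counter with batched folding (illustrative only)."""

    def __init__(self, nr_cpus, batch=32):
        self.count = 0                # shared total, updated rarely
        self.local = [0] * nr_cpus    # one private delta per simulated CPU
        self.batch = batch            # assumed batch size for this example
        self.lock = threading.Lock()  # protects self.count only
        self.lock_acquisitions = 0    # how often the shared total was touched

    def add(self, cpu, amount=1):
        """Add to this CPU's private delta; fold into the shared total at the batch size."""
        delta = self.local[cpu] + amount
        if abs(delta) >= self.batch:
            with self.lock:
                self.count += delta
                self.lock_acquisitions += 1
            self.local[cpu] = 0
        else:
            self.local[cpu] = delta

    def read_approx(self):
        """Cheap read: the shared total only, which may lag behind the true value."""
        return self.count

    def sum(self):
        """Exact read: the shared total plus every private delta."""
        with self.lock:
            return self.count + sum(self.local)


# 1000 increments spread round-robin over 4 simulated CPUs.
counter = BatchedCounter(nr_cpus=4, batch=32)
for i in range(1000):
    counter.add(cpu=i % 4)

print(counter.sum(), counter.read_approx(), counter.lock_acquisitions)
```

In this run the shared total is touched 28 times for 1000 increments. The trade-off is that the cheap read can lag the exact sum by up to roughly nr_cpus * batch, which is usually acceptable for statistics such as cache hit and miss counts.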





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  4k/gcc-7/performance/2pmem/ext4/libaio/x86_64-rhel-7.6/dax/50%/debian-x86_64-20191114.cgz/200s/randread/lkp-csl-2sp6/200G/fio-basic/tb/0x500002c

commit: 
  7727ae5297 ("ext4: fix potential use after free after remounting with noblock_validity")
  520f897a35 ("ext4: use percpu_counters for extent_status cache hits/misses")

7727ae52975d4f4e 520f897a3554b0665af1ae5d5ba 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           50%           2:4     dmesg.WARNING:at_ip__fsnotify_parent/0x
           :4           25%           1:4     dmesg.WARNING:at_ip__x64_sys_io_submit/0x
           :4           25%           1:4     dmesg.WARNING:at_ip_aio_read/0x
          1:4          -25%            :4     dmesg.WARNING:at_ip_io_submit_one/0x
         %stddev     %change         %stddev
             \          |                \  
      0.99 ± 62%      -1.0        0.01        fio.latency_500us%
     33853           +33.3%      45121        fio.read_bw_MBps
    225792 ±  3%     -32.9%     151552 ±  2%  fio.read_clat_90%_us
    232960 ±  3%     -31.9%     158720 ±  2%  fio.read_clat_95%_us
    247296 ±  3%     -28.8%     176128 ±  2%  fio.read_clat_99%_us
    172016           -24.9%     129123        fio.read_clat_mean_us
     45196 ± 11%     -47.6%      23675 ± 19%  fio.read_clat_stddev
   8666452           +33.3%   11551177        fio.read_iops
      4884           -28.5%       3491        fio.read_slat_mean_us
      8683            -3.7%       8359        fio.time.system_time
    925.36 ±  3%     +35.1%       1249        fio.time.user_time
 1.733e+09           +33.3%   2.31e+09        fio.workload
     10831 ±  7%     +16.6%      12633 ± 10%  cpuidle.POLL.usage
      4.82 ±  3%      +1.7        6.50        mpstat.cpu.all.usr%
      1452            +2.4%       1487        vmstat.system.cs
     44.67            -3.6%      43.07        iostat.cpu.system
      4.77 ±  3%     +35.1%       6.45        iostat.cpu.user
    289.30            +2.0%     295.11        turbostat.PkgWatt
    138.34            +1.7%     140.76        turbostat.RAMWatt
     43270 ± 13%     -17.3%      35789 ± 11%  numa-vmstat.node0.nr_active_anon
     24664 ±  5%     -11.0%      21939 ±  8%  numa-vmstat.node0.nr_slab_unreclaimable
     43270 ± 13%     -17.3%      35789 ± 11%  numa-vmstat.node0.nr_zone_active_anon
    177415 ± 13%     -16.9%     147370 ± 10%  numa-meminfo.node0.Active
    173112 ± 13%     -17.3%     143140 ± 11%  numa-meminfo.node0.Active(anon)
     98646 ±  5%     -11.0%      87748 ±  8%  numa-meminfo.node0.SUnreclaim
    141129 ±  6%     -13.4%     122215 ±  8%  numa-meminfo.node0.Slab
    323.59 ±109%    -100.0%       0.00 ±  3%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
    323.59 ±109%    -100.0%       0.00 ±  3%  sched_debug.cfs_rq:/.max_vruntime.stddev
      7.70 ±  9%     -14.9%       6.55 ± 13%  sched_debug.cpu.nr_uninterruptible.stddev
     24248 ± 16%     -33.3%      16171 ± 23%  sched_debug.cpu.sched_count.max
      3313 ±  8%     -16.9%       2753 ± 10%  sched_debug.cpu.sched_count.stddev
     58.50 ±133%     -91.0%       5.25 ± 65%  interrupts.CPU2.RES:Rescheduling_interrupts
      2416 ±  7%     -32.0%       1644 ±  3%  interrupts.CPU20.CAL:Function_call_interrupts
      2410 ±  7%     -31.3%       1656 ±  4%  interrupts.CPU21.CAL:Function_call_interrupts
      2266 ±  6%     +12.1%       2540 ±  3%  interrupts.CPU36.CAL:Function_call_interrupts
      2258 ±  6%     +12.7%       2546 ±  3%  interrupts.CPU37.CAL:Function_call_interrupts
     22.75 ±123%    +184.6%      64.75 ± 62%  interrupts.CPU37.TLB:TLB_shootdowns
     34.75 ±111%    +190.6%     101.00 ± 51%  interrupts.CPU38.TLB:TLB_shootdowns
      2256 ±  6%     +12.0%       2528 ±  3%  interrupts.CPU39.CAL:Function_call_interrupts
      5801 ± 38%     -50.2%       2888        interrupts.CPU47.NMI:Non-maskable_interrupts
      5801 ± 38%     -50.2%       2888        interrupts.CPU47.PMI:Performance_monitoring_interrupts
     93.75 ±149%     -96.8%       3.00 ± 97%  interrupts.CPU5.RES:Rescheduling_interrupts
     44.75 ± 77%    +153.1%     113.25 ± 39%  interrupts.CPU52.TLB:TLB_shootdowns
     32.25 ± 74%    +186.8%      92.50 ± 24%  interrupts.CPU54.TLB:TLB_shootdowns
     37.00 ± 78%    +146.6%      91.25 ± 35%  interrupts.CPU60.TLB:TLB_shootdowns
     55.50 ± 58%     +87.4%     104.00 ± 21%  interrupts.CPU65.TLB:TLB_shootdowns
     36.25 ± 52%    +157.9%      93.50 ± 35%  interrupts.CPU71.TLB:TLB_shootdowns
     12.91 ±  8%     -22.6%      10.00 ±  5%  perf-stat.i.MPKI
 1.549e+10           +32.9%  2.058e+10        perf-stat.i.branch-instructions
  63057844           +30.7%   82417950 ±  6%  perf-stat.i.branch-misses
     69.19 ±  4%     +10.3       79.50 ±  3%  perf-stat.i.cache-miss-rate%
 7.224e+08 ±  3%     +18.4%  8.554e+08 ±  2%  perf-stat.i.cache-misses
      1395            +2.6%       1431        perf-stat.i.context-switches
      1.65           -25.0%       1.24        perf-stat.i.cpi
    192.37 ±  4%     -15.7%     162.11 ±  2%  perf-stat.i.cycles-between-cache-misses
     2e+10 ±  8%     +29.7%  2.594e+10 ± 11%  perf-stat.i.dTLB-loads
     13727 ±  2%      +5.4%      14471 ±  3%  perf-stat.i.dTLB-store-misses
  1.34e+10 ±  6%     +34.0%  1.796e+10 ±  6%  perf-stat.i.dTLB-stores
  60015893 ±  6%      +7.4%   64478755 ±  3%  perf-stat.i.iTLB-load-misses
   5975439            +5.2%    6284441        perf-stat.i.iTLB-loads
 8.135e+10           +32.9%  1.081e+11        perf-stat.i.instructions
      1366 ±  8%     +23.0%       1680 ±  4%  perf-stat.i.instructions-per-iTLB-miss
      0.61           +33.4%       0.81        perf-stat.i.ipc
   6781479 ±  5%     -21.3%    5337894 ±  5%  perf-stat.i.node-store-misses
     65017 ±  8%     +28.7%      83703 ±  3%  perf-stat.i.node-stores
     12.86 ±  9%     -22.8%       9.94 ±  5%  perf-stat.overall.MPKI
     69.37 ±  4%     +10.4       79.79 ±  3%  perf-stat.overall.cache-miss-rate%
      1.65           -25.2%       1.23        perf-stat.overall.cpi
    185.84 ±  3%     -16.1%     155.96 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.00 ±  7%      -0.0        0.00 ± 12%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  8%      -0.0        0.00 ±  7%  perf-stat.overall.dTLB-store-miss-rate%
      1362 ±  8%     +23.2%       1678 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
      0.61           +33.6%       0.81        perf-stat.overall.ipc
 1.541e+10           +32.9%  2.047e+10        perf-stat.ps.branch-instructions
  62736602           +30.7%   81998518 ±  6%  perf-stat.ps.branch-misses
 7.187e+08 ±  3%     +18.4%   8.51e+08 ±  2%  perf-stat.ps.cache-misses
      1388            +2.6%       1424        perf-stat.ps.context-switches
  1.99e+10 ±  8%     +29.7%  2.581e+10 ± 11%  perf-stat.ps.dTLB-loads
     13711 ±  2%      +5.5%      14459 ±  3%  perf-stat.ps.dTLB-store-misses
 1.333e+10 ±  6%     +34.0%  1.786e+10 ±  6%  perf-stat.ps.dTLB-stores
  59709963 ±  6%      +7.4%   64151654 ±  3%  perf-stat.ps.iTLB-load-misses
   5945182            +5.2%    6252891        perf-stat.ps.iTLB-loads
 8.093e+10           +32.9%  1.075e+11        perf-stat.ps.instructions
   6746979 ±  5%     -21.3%    5310789 ±  5%  perf-stat.ps.node-store-misses
     64702 ±  8%     +28.7%      83271 ±  3%  perf-stat.ps.node-stores
 1.632e+13           +32.9%  2.168e+13        perf-stat.total.instructions
      7271 ± 11%     +32.1%       9603 ±  5%  softirqs.CPU0.RCU
      7136 ± 13%     +40.0%       9989 ±  2%  softirqs.CPU10.RCU
      7326 ± 12%     +36.3%       9985 ±  4%  softirqs.CPU11.RCU
      7019 ±  5%     +42.9%      10032        softirqs.CPU13.RCU
      7277 ± 11%     +36.9%       9961 ±  6%  softirqs.CPU15.RCU
      7498 ±  5%     +28.2%       9610 ± 10%  softirqs.CPU16.RCU
      7664 ± 10%     +35.1%      10356        softirqs.CPU17.RCU
      7573 ± 13%     +33.7%      10126 ±  3%  softirqs.CPU18.RCU
      8033 ±  8%     +28.0%      10281 ±  3%  softirqs.CPU19.RCU
      7395 ± 15%     +73.1%      12804 ± 20%  softirqs.CPU2.RCU
      7656 ±  9%     +36.0%      10414 ±  5%  softirqs.CPU20.RCU
      7355 ± 12%     +37.4%      10104 ±  2%  softirqs.CPU21.RCU
      7585 ± 10%     +35.0%      10244 ±  3%  softirqs.CPU22.RCU
      7947 ± 11%     +28.2%      10186 ±  2%  softirqs.CPU23.RCU
      8479 ± 10%     +44.1%      12221 ±  5%  softirqs.CPU24.RCU
      7699 ± 10%     +40.3%      10801 ± 10%  softirqs.CPU25.RCU
      7463 ±  3%     +38.7%      10352 ±  6%  softirqs.CPU26.RCU
      7431 ±  4%     +47.9%      10990 ± 12%  softirqs.CPU27.RCU
      7292 ±  6%     +52.4%      11115 ± 12%  softirqs.CPU28.RCU
      7359 ±  5%     +42.5%      10488 ±  9%  softirqs.CPU29.RCU
      6844 ± 15%     +54.1%      10546 ±  9%  softirqs.CPU30.RCU
      7111 ± 15%     +41.8%      10086 ±  8%  softirqs.CPU31.RCU
      7538 ±  5%     +40.6%      10602 ±  9%  softirqs.CPU32.RCU
      7631 ±  6%     +35.4%      10331 ±  7%  softirqs.CPU33.RCU
      7627 ± 11%     +36.7%      10427 ±  7%  softirqs.CPU34.RCU
      7783 ±  9%     +33.6%      10401 ±  7%  softirqs.CPU35.RCU
      7574 ±  7%     +38.4%      10484 ±  6%  softirqs.CPU36.RCU
      7626 ±  5%     +34.4%      10247 ±  5%  softirqs.CPU37.RCU
      7668 ±  7%     +36.9%      10496 ±  7%  softirqs.CPU38.RCU
      7752 ±  6%     +33.3%      10334 ±  5%  softirqs.CPU39.RCU
      7868 ±  7%     +28.1%      10081 ±  4%  softirqs.CPU4.RCU
      7780 ±  5%     +36.0%      10577 ±  4%  softirqs.CPU40.RCU
      7489 ±  5%     +32.4%       9915 ± 14%  softirqs.CPU41.RCU
      7669 ±  6%     +27.3%       9760 ±  6%  softirqs.CPU42.RCU
      7552 ±  6%     +38.5%      10463 ±  7%  softirqs.CPU44.RCU
      7337 ±  6%     +35.9%       9975 ± 11%  softirqs.CPU45.RCU
      7663 ±  4%     +38.1%      10582 ±  9%  softirqs.CPU46.RCU
      7202 ± 10%     +33.5%       9613 ±  5%  softirqs.CPU47.RCU
      8444 ±  8%     +38.6%      11705 ±  5%  softirqs.CPU48.RCU
      8371 ±  2%     +40.5%      11761 ± 13%  softirqs.CPU49.RCU
      7047 ± 28%     +45.8%      10274 ±  5%  softirqs.CPU5.RCU
      7671 ± 14%     +43.7%      11026 ± 10%  softirqs.CPU50.RCU
      7325 ± 11%     +52.2%      11146 ±  9%  softirqs.CPU51.RCU
      8199 ±  5%     +39.6%      11445 ±  6%  softirqs.CPU52.RCU
      7986 ±  3%     +44.9%      11576 ± 11%  softirqs.CPU53.RCU
      8141 ±  3%     +41.1%      11484 ± 14%  softirqs.CPU55.RCU
      8114 ±  4%     +34.7%      10928 ± 14%  softirqs.CPU56.RCU
      7931 ±  5%     +41.8%      11243 ± 12%  softirqs.CPU57.RCU
      8007 ±  7%     +40.4%      11241 ± 13%  softirqs.CPU58.RCU
      7744 ±  2%     +43.2%      11089 ± 11%  softirqs.CPU59.RCU
      7492 ± 11%     +30.5%       9776 ±  2%  softirqs.CPU6.RCU
      8171 ±  3%     +38.5%      11315 ±  8%  softirqs.CPU60.RCU
      8151 ± 15%     +35.6%      11052 ± 10%  softirqs.CPU61.RCU
      7663 ±  6%     +51.8%      11636 ±  9%  softirqs.CPU62.RCU
      7905 ±  7%     +43.2%      11317 ± 11%  softirqs.CPU63.RCU
      7910 ±  4%     +39.1%      11005 ± 12%  softirqs.CPU64.RCU
      7795 ±  5%     +38.2%      10770 ± 11%  softirqs.CPU65.RCU
      7812 ±  3%     +35.8%      10611 ± 14%  softirqs.CPU66.RCU
      7761 ±  4%     +43.5%      11137 ± 14%  softirqs.CPU67.RCU
      7705 ±  4%     +40.0%      10785 ± 12%  softirqs.CPU68.RCU
      8023 ± 12%     +36.0%      10909 ± 12%  softirqs.CPU69.RCU
      7252 ±  4%     +33.6%       9692 ±  3%  softirqs.CPU7.RCU
      7836 ±  4%     +37.6%      10784 ± 11%  softirqs.CPU70.RCU
      7909 ±  3%     +37.9%      10907 ± 11%  softirqs.CPU71.RCU
      6832 ±  8%     +40.7%       9613 ±  4%  softirqs.CPU73.RCU
      6585 ± 10%     +47.1%       9685 ±  6%  softirqs.CPU74.RCU
      6827 ± 12%     +39.7%       9536 ±  4%  softirqs.CPU75.RCU
      6667 ± 11%     +42.2%       9479 ±  5%  softirqs.CPU76.RCU
      6955 ±  9%     +36.6%       9498 ±  5%  softirqs.CPU77.RCU
      6539 ± 10%     +46.6%       9586 ±  5%  softirqs.CPU78.RCU
      6645 ±  9%     +45.8%       9688 ±  7%  softirqs.CPU79.RCU
      7613 ± 14%     +31.0%       9975        softirqs.CPU8.RCU
      6864 ±  5%     +45.9%      10016 ±  5%  softirqs.CPU80.RCU
      6931 ±  6%     +39.3%       9656 ±  6%  softirqs.CPU81.RCU
      7108 ±  4%     +46.3%      10401 ±  8%  softirqs.CPU82.RCU
      7222 ±  8%     +42.2%      10268 ±  5%  softirqs.CPU83.RCU
      7171 ±  9%     +41.0%      10110 ±  6%  softirqs.CPU84.RCU
      7079 ±  7%     +39.9%       9902 ±  7%  softirqs.CPU85.RCU
      7175 ±  9%     +37.3%       9855 ±  7%  softirqs.CPU86.RCU
      7190 ±  8%     +37.4%       9877 ±  9%  softirqs.CPU87.RCU
      6952 ±  7%     +40.7%       9782 ±  6%  softirqs.CPU88.RCU
      7068 ±  4%     +40.0%       9896 ±  8%  softirqs.CPU89.RCU
      7415 ± 11%     +33.6%       9906 ±  3%  softirqs.CPU9.RCU
      7031 ±  6%     +46.0%      10268 ± 10%  softirqs.CPU90.RCU
      7030 ±  7%     +43.1%      10060 ±  6%  softirqs.CPU91.RCU
      6980 ±  5%     +41.1%       9850 ± 10%  softirqs.CPU92.RCU
      6857 ±  7%     +43.6%       9847 ±  7%  softirqs.CPU93.RCU
      7185 ±  9%     +40.2%      10072 ±  9%  softirqs.CPU94.RCU
      7011 ±  3%     +51.7%      10638 ±  5%  softirqs.CPU95.RCU
    726904 ±  4%     +37.8%    1001841 ±  3%  softirqs.RCU
     13.99 ±  6%     -11.4        2.55 ± 25%  perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_read_iter
     12.21 ±  6%     -10.5        1.66 ±  7%  perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
      0.51 ±  2%      +0.4        0.95 ± 10%  perf-profile.calltrace.cycles-pp.security_file_permission.aio_read.io_submit_one.__x64_sys_io_submit.do_syscall_64
      0.26 ±100%      +0.5        0.78 ±  3%  perf-profile.calltrace.cycles-pp.lookup_ioctx.__x64_sys_io_submit.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.5        0.52 ±  3%  perf-profile.calltrace.cycles-pp._copy_to_user.aio_read_events.read_events.do_io_getevents.__x64_sys_io_getevents
      0.27 ±100%      +0.5        0.79 ±  2%  perf-profile.calltrace.cycles-pp.lookup_ioctx.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.6        0.57 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      0.00            +0.6        0.62 ±  6%  perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.aio_read.io_submit_one.__x64_sys_io_submit
      0.00            +0.7        0.68 ±  3%  perf-profile.calltrace.cycles-pp._copy_from_user.io_submit_one.__x64_sys_io_submit.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.35 ±  8%      +0.7        2.05 ±  4%  perf-profile.calltrace.cycles-pp.aio_read_events.read_events.do_io_getevents.__x64_sys_io_getevents.do_syscall_64
      1.49 ±  8%      +0.8        2.27 ±  4%  perf-profile.calltrace.cycles-pp.read_events.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.05 ±  8%      +1.1        3.16 ±  3%  perf-profile.calltrace.cycles-pp.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.18 ±  8%      +1.2        3.36 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.54 ± 13%      +3.9       12.41 ± 13%  perf-profile.calltrace.cycles-pp.__memcpy_mcsafe.copyout_mcsafe._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply
      8.69 ± 13%      +4.0       12.64 ± 13%  perf-profile.calltrace.cycles-pp.copyout_mcsafe._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply.dax_iomap_rw
      8.97 ± 12%      +4.1       13.05 ± 12%  perf-profile.calltrace.cycles-pp._copy_to_iter_mcsafe.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_read_iter
      9.54 ± 12%      +4.4       13.91 ± 11%  perf-profile.calltrace.cycles-pp.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_read_iter.aio_read
     10.86 ±  5%      +4.7       15.54 ±  5%  perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_iomap_begin.iomap_apply.dax_iomap_rw
     28.92 ±  4%     +10.4       39.30 ±  5%  perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_read_iter
     13.99 ±  6%     -11.4        2.56 ± 25%  perf-profile.children.cycles-pp.ext4_map_blocks
     12.22 ±  6%     -10.5        1.67 ±  7%  perf-profile.children.cycles-pp.ext4_es_lookup_extent
      0.05 ±  9%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.__pmem_direct_access
      0.04 ± 58%      +0.0        0.07 ±  5%  perf-profile.children.cycles-pp.fpregs_assert_state_consistent
      0.10 ±  5%      +0.0        0.13 ± 10%  perf-profile.children.cycles-pp.scheduler_tick
      0.07            +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.kmem_cache_free
      0.07 ±  5%      +0.0        0.11 ± 12%  perf-profile.children.cycles-pp.task_tick_fair
      0.05 ±  9%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.import_single_range
      0.07 ±  5%      +0.0        0.11 ±  3%  perf-profile.children.cycles-pp.rcu_all_qs
      0.03 ±100%      +0.0        0.07 ± 10%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.01 ±173%      +0.0        0.06        perf-profile.children.cycles-pp.pmem_dax_direct_access
      0.08 ± 11%      +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.aio_setup_rw
      0.14 ± 11%      +0.1        0.20 ± 13%  perf-profile.children.cycles-pp.update_process_times
      0.15 ± 14%      +0.1        0.20 ± 12%  perf-profile.children.cycles-pp.tick_sched_handle
      0.10 ±  4%      +0.1        0.15 ±  2%  perf-profile.children.cycles-pp.__srcu_read_unlock
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.rw_verify_area
      0.11 ± 11%      +0.1        0.17 ±  2%  perf-profile.children.cycles-pp.dax_direct_access
      0.12 ± 17%      +0.1        0.18 ±  8%  perf-profile.children.cycles-pp.ext4_data_block_valid_rcu
      0.12 ±  7%      +0.1        0.17 ±  4%  perf-profile.children.cycles-pp.down_read_trylock
      0.11 ±  8%      +0.1        0.17 ±  8%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.13 ± 14%      +0.1        0.20 ± 10%  perf-profile.children.cycles-pp.__virt_addr_valid
      0.15 ±  7%      +0.1        0.22 ±  3%  perf-profile.children.cycles-pp.aio_complete_rw
      0.13 ±  6%      +0.1        0.19 ±  5%  perf-profile.children.cycles-pp.up_read
      0.11 ± 12%      +0.1        0.17 ±  4%  perf-profile.children.cycles-pp.refill_reqs_available
      0.13 ± 12%      +0.1        0.20 ±  5%  perf-profile.children.cycles-pp.mutex_unlock
      0.13 ±  7%      +0.1        0.20 ±  4%  perf-profile.children.cycles-pp.fput_many
      0.21 ± 17%      +0.1        0.28 ±  9%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.13 ± 13%      +0.1        0.21 ± 10%  perf-profile.children.cycles-pp.__inode_security_revalidate
      0.15 ± 10%      +0.1        0.22 ±  4%  perf-profile.children.cycles-pp._cond_resched
      0.19 ±  6%      +0.1        0.27 ±  2%  perf-profile.children.cycles-pp.__get_reqs_available
      0.18 ±  7%      +0.1        0.27        perf-profile.children.cycles-pp.__lock_text_start
      0.18 ±  4%      +0.1        0.27 ±  6%  perf-profile.children.cycles-pp.__get_user_8
      0.17 ±  7%      +0.1        0.26 ±  4%  perf-profile.children.cycles-pp.copy_user_generic_unrolled
      0.14 ± 10%      +0.1        0.23 ±  3%  perf-profile.children.cycles-pp.__srcu_read_lock
      0.23 ±  8%      +0.1        0.32 ±  2%  perf-profile.children.cycles-pp.mutex_lock
      0.18 ±  9%      +0.1        0.28 ±  7%  perf-profile.children.cycles-pp.__put_user_4
      0.19 ± 10%      +0.1        0.29 ±  4%  perf-profile.children.cycles-pp.put_reqs_available
      0.23 ±  9%      +0.1        0.35 ±  5%  perf-profile.children.cycles-pp.__check_object_size
      0.24 ±  6%      +0.1        0.38 ±  3%  perf-profile.children.cycles-pp.__fget
      0.24 ±  9%      +0.1        0.38 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.36 ±  8%      +0.2        0.53 ±  4%  perf-profile.children.cycles-pp._copy_to_user
      0.34 ±  7%      +0.2        0.51 ±  3%  perf-profile.children.cycles-pp.__might_sleep
      0.31 ±  9%      +0.2        0.49 ±  8%  perf-profile.children.cycles-pp.kmem_cache_alloc
      0.37 ±  6%      +0.2        0.56 ±  3%  perf-profile.children.cycles-pp.__get_user_4
      0.39 ±  7%      +0.2        0.58 ±  3%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.38 ±  8%      +0.2        0.59 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.45 ±  8%      +0.2        0.68 ±  3%  perf-profile.children.cycles-pp._copy_from_user
      0.49 ± 10%      +0.3        0.75 ±  4%  perf-profile.children.cycles-pp.___might_sleep
      0.52 ±  9%      +0.3        0.79 ±  2%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.35 ±  3%      +0.3        0.63 ±  6%  perf-profile.children.cycles-pp.selinux_file_permission
      0.67 ±  8%      +0.3        1.00 ±  3%  perf-profile.children.cycles-pp.__might_fault
      0.00            +0.3        0.33 ± 30%  perf-profile.children.cycles-pp.percpu_counter_add_batch
      0.53 ±  2%      +0.5        0.98 ± 11%  perf-profile.children.cycles-pp.security_file_permission
      1.01 ±  7%      +0.6        1.59        perf-profile.children.cycles-pp.lookup_ioctx
      1.36 ±  8%      +0.7        2.08 ±  4%  perf-profile.children.cycles-pp.aio_read_events
      1.50 ±  8%      +0.8        2.29 ±  4%  perf-profile.children.cycles-pp.read_events
      2.06 ±  8%      +1.1        3.18 ±  3%  perf-profile.children.cycles-pp.do_io_getevents
      2.18 ±  8%      +1.2        3.37 ±  2%  perf-profile.children.cycles-pp.__x64_sys_io_getevents
      8.55 ± 13%      +3.9       12.44 ± 13%  perf-profile.children.cycles-pp.__memcpy_mcsafe
      8.75 ± 13%      +4.0       12.74 ± 13%  perf-profile.children.cycles-pp.copyout_mcsafe
      8.99 ± 12%      +4.1       13.07 ± 12%  perf-profile.children.cycles-pp._copy_to_iter_mcsafe
      9.56 ± 12%      +4.4       13.94 ± 11%  perf-profile.children.cycles-pp.dax_iomap_actor
     11.03 ±  5%      +4.8       15.78 ±  5%  perf-profile.children.cycles-pp._raw_read_lock
     28.94 ±  4%     +10.4       39.33 ±  5%  perf-profile.children.cycles-pp.jbd2_transaction_committed
     12.01 ±  6%     -10.9        1.11 ±  5%  perf-profile.self.cycles-pp.ext4_es_lookup_extent
      0.05 ±  8%      +0.0        0.07 ±  5%  perf-profile.self.cycles-pp.__pmem_direct_access
      0.06 ±  6%      +0.0        0.10 ±  5%  perf-profile.self.cycles-pp.kmem_cache_free
      0.06 ± 13%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.do_io_getevents
      0.07 ± 11%      +0.0        0.11 ±  6%  perf-profile.self.cycles-pp._cond_resched
      0.07 ± 12%      +0.0        0.11 ±  8%  perf-profile.self.cycles-pp.read_events
      0.04 ± 57%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.rcu_all_qs
      0.03 ±100%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp._copy_from_user
      0.03 ±100%      +0.0        0.07 ±  6%  perf-profile.self.cycles-pp.__inode_security_revalidate
      0.08 ±  5%      +0.0        0.13 ±  6%  perf-profile.self.cycles-pp.dax_iomap_rw
      0.09 ± 12%      +0.0        0.13 ± 14%  perf-profile.self.cycles-pp.__check_object_size
      0.01 ±173%      +0.0        0.06        perf-profile.self.cycles-pp.touch_atime
      0.10 ±  5%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.__srcu_read_unlock
      0.15 ± 10%      +0.0        0.20 ±  4%  perf-profile.self.cycles-pp.mutex_lock
      0.12 ±  7%      +0.1        0.17 ±  7%  perf-profile.self.cycles-pp.down_read_trylock
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.import_single_range
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.rw_verify_area
      0.12 ± 15%      +0.1        0.18 ±  7%  perf-profile.self.cycles-pp.ext4_data_block_valid_rcu
      0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.fpregs_assert_state_consistent
      0.12 ±  4%      +0.1        0.18 ±  6%  perf-profile.self.cycles-pp.up_read
      0.15 ±  6%      +0.1        0.21 ±  5%  perf-profile.self.cycles-pp._copy_to_iter_mcsafe
      0.09 ±  4%      +0.1        0.15 ±  3%  perf-profile.self.cycles-pp.ext4_map_blocks
      0.09 ±  4%      +0.1        0.16 ±  7%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.12 ±  8%      +0.1        0.19 ±  3%  perf-profile.self.cycles-pp.mutex_unlock
      0.12 ± 17%      +0.1        0.19 ±  9%  perf-profile.self.cycles-pp.__virt_addr_valid
      0.10 ± 14%      +0.1        0.17 ±  5%  perf-profile.self.cycles-pp.refill_reqs_available
      0.15 ±  5%      +0.1        0.21 ±  5%  perf-profile.self.cycles-pp.aio_complete_rw
      0.12 ±  8%      +0.1        0.19 ±  6%  perf-profile.self.cycles-pp.fput_many
      0.11 ±  9%      +0.1        0.18 ±  4%  perf-profile.self.cycles-pp.__x64_sys_io_getevents
      0.16 ±  7%      +0.1        0.23 ±  6%  perf-profile.self.cycles-pp.__might_fault
      0.18 ±  7%      +0.1        0.26 ±  2%  perf-profile.self.cycles-pp.__get_reqs_available
      0.16 ±  7%      +0.1        0.24        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.15 ± 14%      +0.1        0.23 ±  5%  perf-profile.self.cycles-pp.aio_read
      0.17 ±  6%      +0.1        0.25 ±  4%  perf-profile.self.cycles-pp.dax_iomap_actor
      0.17 ±  7%      +0.1        0.25        perf-profile.self.cycles-pp.__lock_text_start
      0.16 ±  7%      +0.1        0.25 ±  6%  perf-profile.self.cycles-pp.copy_user_generic_unrolled
      0.17 ±  4%      +0.1        0.26 ±  6%  perf-profile.self.cycles-pp.__get_user_8
      0.16 ±  7%      +0.1        0.26 ±  7%  perf-profile.self.cycles-pp.do_syscall_64
      0.13 ± 14%      +0.1        0.22 ±  3%  perf-profile.self.cycles-pp.__srcu_read_lock
      0.17 ± 11%      +0.1        0.27 ±  7%  perf-profile.self.cycles-pp.__put_user_4
      0.19 ± 10%      +0.1        0.29 ±  4%  perf-profile.self.cycles-pp.put_reqs_available
      0.09 ±  7%      +0.1        0.20 ± 31%  perf-profile.self.cycles-pp.security_file_permission
      0.18 ±  6%      +0.1        0.29 ±  3%  perf-profile.self.cycles-pp.iomap_apply
      0.22 ± 10%      +0.1        0.34 ±  5%  perf-profile.self.cycles-pp.copyout_mcsafe
      0.18 ±  9%      +0.1        0.30 ± 10%  perf-profile.self.cycles-pp.kmem_cache_alloc
      0.20 ±  6%      +0.1        0.32 ±  9%  perf-profile.self.cycles-pp.__x64_sys_io_submit
      0.24 ±  6%      +0.1        0.38 ±  2%  perf-profile.self.cycles-pp.__fget
      0.24 ±  7%      +0.1        0.38 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.31 ±  6%      +0.1        0.46 ±  2%  perf-profile.self.cycles-pp.__might_sleep
      0.36 ±  6%      +0.2        0.55 ±  4%  perf-profile.self.cycles-pp.__get_user_4
      0.39 ±  7%      +0.2        0.58 ±  2%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      0.21 ± 13%      +0.2        0.40 ±  7%  perf-profile.self.cycles-pp.selinux_file_permission
      0.38 ±  8%      +0.2        0.59 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.35 ±  9%      +0.2        0.56 ±  9%  perf-profile.self.cycles-pp.aio_read_events
      0.49 ± 10%      +0.3        0.74 ±  4%  perf-profile.self.cycles-pp.___might_sleep
      0.52 ±  9%      +0.3        0.79 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.43 ±  7%      +0.3        0.71 ±  2%  perf-profile.self.cycles-pp.lookup_ioctx
      0.00            +0.3        0.30 ± 33%  perf-profile.self.cycles-pp.percpu_counter_add_batch
      0.63 ±  8%      +0.3        0.97 ±  6%  perf-profile.self.cycles-pp.io_submit_one
      8.48 ± 13%      +3.8       12.32 ± 13%  perf-profile.self.cycles-pp.__memcpy_mcsafe
     10.96 ±  5%      +4.7       15.68 ±  5%  perf-profile.self.cycles-pp._raw_read_lock
     17.97 ±  4%      +5.7       23.65 ±  6%  perf-profile.self.cycles-pp.jbd2_transaction_committed
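
The headline figures in the table above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the reported numbers; the helper names are hypothetical, and it assumes that bs=4k means 4096 bytes and that fio.read_bw_MBps is expressed in units of 2^20 bytes per second, which is what makes the IOPS and bandwidth rows agree.

```python
def pct_change(before, after):
    """Relative change in percent, as shown in the %change column."""
    return (after - before) / before * 100.0


def iops_to_mibps(iops, block_size_bytes=4096):
    """Bandwidth implied by an IOPS figure at a fixed block size (assumed 4096 bytes)."""
    return iops * block_size_bytes / (1024 * 1024)


# Values copied from the fio.* and perf-stat.i.* rows of the table.
base_bw, new_bw = 33853, 45121            # fio.read_bw_MBps
base_iops, new_iops = 8666452, 11551177   # fio.read_iops
base_cpi, new_cpi = 1.65, 1.24            # perf-stat.i.cpi

bw_change = round(pct_change(base_bw, new_bw), 1)
iops_change = round(pct_change(base_iops, new_iops), 1)

# IPC is the reciprocal of CPI, so the two perf-stat rows should agree.
base_ipc = round(1 / base_cpi, 2)
new_ipc = round(1 / new_cpi, 2)

print(bw_change, iops_change)
print(iops_to_mibps(base_iops), iops_to_mibps(new_iops))
print(base_ipc, new_ipc)
```

Under these assumptions both the bandwidth and IOPS rows work out to +33.3%, the bandwidth implied by the IOPS figures lands within 1 MB/s of the reported values, and the reciprocal of the reported CPI matches the reported IPC (0.61 and 0.81).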


                                                                                
                                  fio.read_bw_MBps                              
                                                                                
  50000 +-+-----------------------------------------------------------------+   
  45000 O-O  O O  O O  O   O                             O   O       O    O O   
        |                O    O O  O O  O O O  O O  O           O O    O    |   
  40000 +-+                                                                 |   
  35000 +-+..+.  .+.  .+.+.                        .+.  .+                  |   
        |      +.   +.     +..+.+..+.+..+.+.+..+.+.   +.                    |   
  30000 +-+                                                                 |   
  25000 +-+                                                                 |   
  20000 +-+                                                                 |   
        |                                                                   |   
  15000 +-+                                                                 |   
  10000 +-+                                                                 |   
        |                                                                   |   
   5000 +-+                                                                 |   
      0 +-+-------------------------------------------O----O----------------+   
                                                                                
                                                                                                                                                                
                                     fio.read_iops                              
                                                                                
  1.4e+07 +-+---------------------------------------------------------------+   
          |                                                                 |   
  1.2e+07 O-O  O O O  O O    O                                       O    O O   
          |                O   O  O O O  O O O  O O O    O    O O  O   O    |   
    1e+07 +-+                                                               |   
          |.+..+. .+..+.+..+.      .+.  .+. .+..+.+.+..+.+                  |   
    8e+06 +-+    +           +.+..+   +.   +                                |   
          |                                                                 |   
    6e+06 +-+                                                               |   
          |                                                                 |   
    4e+06 +-+                                                               |   
          |                                                                 |   
    2e+06 +-+                                                               |   
          |                                                                 |   
        0 +-+------------------------------------------O----O---------------+   
                                                                                
                                                                                                                                                                
                                fio.read_clat_mean_us                           
                                                                                
  200000 +-+----------------------------------------------------------------+   
  180000 +-+               .+..+.+.  .+.  .+.                               |   
         |.+..+.+..+.+.+..+        +.   +.   +.+..+.+.+..+                  |   
  160000 +-+                                                                |   
  140000 +-+                                   O                  O         |   
         O O  O O  O O O  O O  O O O  O O  O O    O O    O    O O    O O  O O   
  120000 +-+                                                                |   
  100000 +-+                                                                |   
   80000 +-+                                                                |   
         |                                                                  |   
   60000 +-+                                                                |   
   40000 +-+                                                                |   
         |                                                                  |   
   20000 +-+                                                                |   
       0 +-+------------------------------------------O----O----------------+   
                                                                                
                                                                                                                                                                
                               fio.read_slat_mean_us                            
                                                                                
  6000 +-+------------------------------------------------------------------+   
       |                                                                    |   
  5000 +-+   .+..         .+.+..+.  .+.  .+.    .+.                         |   
       |.+..+    +.+..+.+.        +.   +.   +.+.   +..+.+                   |   
       |                                                                    |   
  4000 +-+              O       O O           O                 O O         |   
       O O  O O  O O  O    O O       O O  O O    O O    O    O       O O  O O   
  3000 +-+                                                                  |   
       |                                                                    |   
  2000 +-+                                                                  |   
       |                                                                    |   
       |                                                                    |   
  1000 +-+                                                                  |   
       |                                                                    |   
     0 +-+--------------------------------------------O----O----------------+   
                                                                                
                                                                                                                                                                
                                     fio.workload                               
                                                                                
  2.5e+09 O-+-----------O---------------------------------------------------+   
          | O  O O O  O      O O      O  O O O    O O    O    O      O O  O O   
          |                O      O O           O               O  O        |   
    2e+09 +-+                                                               |   
          |.+..+. .+..+.+..+.      .+.  .+. .+..+.+.+..+.+                  |   
          |      +           +.+..+   +.   +                                |   
  1.5e+09 +-+                                                               |   
          |                                                                 |   
    1e+09 +-+                                                               |   
          |                                                                 |   
          |                                                                 |   
    5e+08 +-+                                                               |   
          |                                                                 |   
          |                                                                 |   
        0 +-+------------------------------------------O----O---------------+   
                                                                                
                                                                                                                                                                
                                fio.time.user_time                              
                                                                                
  1400 +-+------------------------------------------------------------------+   
       O O  O O  O O  O O  O                     O                          O   
  1200 +-+                   O  O O  O O  O O O    O    O    O  O O  O O  O |   
       |                                                                    |   
  1000 +-+      .+.+..+.    .+..             .+.. .+..                      |   
       |.+..+.+.        +..+    +.+..+.+..+.+    +    +.+                   |   
   800 +-+                                                                  |   
       |                                                                    |   
   600 +-+                                                                  |   
       |                                                                    |   
   400 +-+                                                                  |   
       |                                                                    |   
   200 +-+                                                                  |   
       |                                                                    |   
     0 +-+--------------------------------------------O----O----------------+   
                                                                                
                                                                                                                                                                
                               fio.time.system_time                             
                                                                                
  9000 +-+------------------------------------------------------------------+   
       O O  O O  O.O..O.O. O O. O O  O O  O O O. O O.   O    O  O O  O O  O O   
  8000 +-+                                                                  |   
  7000 +-+                                                                  |   
       |                                                                    |   
  6000 +-+                                                                  |   
  5000 +-+                                                                  |   
       |                                                                    |   
  4000 +-+                                                                  |   
  3000 +-+                                                                  |   
       |                                                                    |   
  2000 +-+                                                                  |   
  1000 +-+                                                                  |   
       |                                                                    |   
     0 +-+--------------------------------------------O----O----------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.3.0-rc4-00016-g520f897a3554b" of type "text/plain" (199557 bytes)

View attachment "job-script" of type "text/plain" (8290 bytes)

View attachment "job.yaml" of type "text/plain" (5820 bytes)

View attachment "reproduce" of type "text/plain" (912 bytes)
