Date:   Fri, 25 Sep 2020 15:12:17 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Ritesh Harjani <riteshh@...ux.ibm.com>
Cc:     linux-ext4@...r.kernel.org, tytso@....edu, jack@...e.cz,
        dan.j.williams@...el.com, anju@...ux.vnet.ibm.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Ritesh Harjani <riteshh@...ux.ibm.com>,
        0day robot <lkp@...el.com>, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [ext4] 4e8fc10115: fio.write_iops 330.6% improvement

Greetings,

FYI, we noticed a 330.6% improvement in fio.write_iops due to the following commit:


commit: 4e8fc10115a6978060fe8a90f6a3a05463fa0660 ("[PATCHv3 1/1] ext4: Optimize file overwrites")
url: https://github.com/0day-ci/linux/commits/Ritesh-Harjani/Optimize-ext4-file-overwrites-perf-improvement/20200918-131139
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev

in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with following parameters:

	disk: 2pmem
	fs: ext4
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: write
	bs: 4k
	ioengine: sync
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
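
For readers who want to approximate this workload with plain fio rather than the attached lkp job file, the parameters above can be sketched as a fio job file. This is only an illustration: the mount point /mnt/pmem0 is hypothetical, nr_task: 50% is assumed to mean half of the 96 hardware threads (48 jobs), and splitting test_size evenly across jobs is an assumption rather than something stated in this report. The sketch also targets a single mount point, although the test uses two pmem devices (disk: 2pmem).

```python
# Sketch: map the report's parameters onto a fio job file.
# The mount point and the per-job size split are assumptions, not taken
# from the report or from the lkp job file.

def build_fio_job(nr_cpu=96, nr_task_pct=50, test_size_gib=200,
                  directory="/mnt/pmem0"):  # hypothetical ext4 -o dax mount
    numjobs = nr_cpu * nr_task_pct // 100        # 50% of 96 threads -> 48
    size_mib = test_size_gib * 1024 // numjobs   # assumed even split per job
    lines = [
        "[global]",
        "ioengine=sync",      # ioengine: sync
        "rw=write",           # rw: write (sequential writes)
        "bs=4k",              # bs: 4k
        "runtime=200",        # runtime: 200s
        "time_based",         # time_based: tb
        f"directory={directory}",
        f"numjobs={numjobs}",
        f"size={size_mib}m",
        "group_reporting",
        "",
        "[seq-write]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_fio_job())
```

The printed text can be saved to a file and passed to fio once the file system is mounted with the dax option.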





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached to this email
        bin/lkp run     job.yaml

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  4k/gcc-9/performance/2pmem/ext4/sync/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/write/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01

commit: 
  27bc446e2d ("ext4: limit the length of per-inode prealloc list")
  4e8fc10115 ("ext4: Optimize file overwrites")

27bc446e2def38db 4e8fc10115a6978060fe8a90f6a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.12 ±106%      -0.1        0.01        fio.latency_100us%
     51.38 ± 23%     -48.5        2.85 ± 20%  fio.latency_20us%
      0.01           +16.6       16.64 ± 28%  fio.latency_2us%
      0.24 ±135%     +54.7       54.89 ±  3%  fio.latency_4us%
     32.62 ± 18%     -31.7        0.91 ± 15%  fio.latency_50us%
     14780 ±  3%      -9.4%      13390        fio.time.involuntary_context_switches
      9299            -7.0%       8647        fio.time.system_time
    228.71 ±  4%    +281.9%     873.42 ±  6%  fio.time.user_time
     23448            -6.5%      21915        fio.time.voluntary_context_switches
 5.426e+08 ±  5%    +330.6%  2.337e+09 ±  6%  fio.workload
     10597 ±  5%    +330.6%      45638 ±  6%  fio.write_bw_MBps
     26944 ±  8%     -76.8%       6240 ±  9%  fio.write_clat_90%_us
     30368 ±  8%     -72.0%       8512 ± 11%  fio.write_clat_95%_us
     38016 ±  9%     -49.0%      19392 ±  4%  fio.write_clat_99%_us
     17448 ±  5%     -77.9%       3855 ±  7%  fio.write_clat_mean_us
     11052 ± 32%     -68.3%       3502 ± 10%  fio.write_clat_stddev
   2713004 ±  5%    +330.6%   11683335 ±  6%  fio.write_iops
  13639680 ±  7%     +26.6%   17267712 ±  5%  meminfo.DirectMap2M
      2704 ± 97%    +131.9%       6269 ± 26%  numa-meminfo.node0.PageTables
    676.50 ± 96%    +131.1%       1563 ± 26%  numa-vmstat.node0.nr_page_table_pages
     48.36            -6.8%      45.09        iostat.cpu.system
      1.21 ±  4%    +271.5%       4.51 ±  6%  iostat.cpu.user
      0.74 ±  2%      +0.1        0.81 ±  5%  mpstat.cpu.all.irq%
      1.22 ±  4%      +3.3        4.55 ±  6%  mpstat.cpu.all.usr%
    541348            +1.4%     548949        proc-vmstat.nr_file_pages
    245833            +2.9%     252840        proc-vmstat.nr_unevictable
    245833            +2.9%     252840        proc-vmstat.nr_zone_unevictable
    695285 ± 20%     -12.6%     607417 ± 17%  proc-vmstat.pgfree
    601976 ±  2%     +22.0%     734594 ±  2%  sched_debug.cpu.avg_idle.avg
   1001923            +9.0%    1092207 ±  5%  sched_debug.cpu.avg_idle.max
    372963           -25.8%     276657 ±  6%  sched_debug.cpu.avg_idle.stddev
     22130 ± 17%     +36.2%      30133 ± 14%  sched_debug.cpu.nr_switches.max
      3374 ± 18%     +28.5%       4336 ± 10%  sched_debug.cpu.nr_switches.stddev
    -47.00           -45.7%     -25.50        sched_debug.cpu.nr_uninterruptible.min
      2816 ± 21%     +36.5%       3844 ± 13%  sched_debug.cpu.sched_count.stddev
     26.69 ± 13%     -44.0%      14.94 ± 17%  sched_debug.cpu.sched_goidle.min
      1424 ± 21%     +36.2%       1941 ± 13%  sched_debug.cpu.sched_goidle.stddev
      1411 ± 18%     +31.9%       1861 ± 12%  sched_debug.cpu.ttwu_count.stddev
     15.42 ±  3%     -82.8%       2.66 ±  8%  perf-stat.i.MPKI
 3.417e+09 ±  4%    +239.7%  1.161e+10 ±  6%  perf-stat.i.branch-instructions
      0.72            -0.1        0.64        perf-stat.i.branch-miss-rate%
  24883051 ±  3%    +181.5%   70036819 ±  4%  perf-stat.i.branch-misses
  97563341 ± 12%     -58.3%   40638724 ± 14%  perf-stat.i.cache-misses
  2.96e+08 ±  2%     -48.4%  1.529e+08 ± 11%  perf-stat.i.cache-references
      7.06 ±  4%     -70.7%       2.06 ±  5%  perf-stat.i.cpi
      1461 ± 14%    +170.2%       3948 ± 19%  perf-stat.i.cycles-between-cache-misses
  6.17e+09 ±  4%    +243.3%  2.119e+10 ±  6%  perf-stat.i.dTLB-loads
      0.00 ± 11%      -0.0        0.00 ±  3%  perf-stat.i.dTLB-store-miss-rate%
 3.978e+09 ±  4%    +257.1%  1.421e+10 ±  6%  perf-stat.i.dTLB-stores
     83.61            +7.2       90.82        perf-stat.i.iTLB-load-miss-rate%
  25688726 ±  3%    +126.2%   58108368 ±  5%  perf-stat.i.iTLB-load-misses
   4852201           +17.7%    5709608 ±  2%  perf-stat.i.iTLB-loads
 1.962e+10 ±  4%    +243.4%  6.738e+10 ±  6%  perf-stat.i.instructions
    774.43 ±  2%     +50.4%       1165        perf-stat.i.instructions-per-iTLB-miss
      0.15 ±  4%    +235.9%       0.51 ±  6%  perf-stat.i.ipc
      0.25 ±  2%     +51.6%       0.37 ±  3%  perf-stat.i.metric.K/sec
    144.73 ±  4%    +239.5%     491.37 ±  6%  perf-stat.i.metric.M/sec
     89.29            +2.6       91.93        perf-stat.i.node-load-miss-rate%
  12691022 ±  8%     -56.3%    5550053 ± 12%  perf-stat.i.node-load-misses
   1504953 ± 13%     -64.4%     535348 ± 15%  perf-stat.i.node-loads
   9964107 ±  8%     -58.8%    4108905 ± 17%  perf-stat.i.node-store-misses
     15.10 ±  3%     -84.9%       2.28 ± 11%  perf-stat.overall.MPKI
      0.73            -0.1        0.60        perf-stat.overall.branch-miss-rate%
      6.86 ±  4%     -71.0%       1.99 ±  6%  perf-stat.overall.cpi
      1401 ± 13%    +139.9%       3361 ± 14%  perf-stat.overall.cycles-between-cache-misses
      0.00 ± 30%      -0.0        0.00 ± 45%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ± 22%      -0.0        0.00 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
     84.11            +6.9       91.02        perf-stat.overall.iTLB-load-miss-rate%
    763.81 ±  2%     +51.8%       1159        perf-stat.overall.instructions-per-iTLB-miss
      0.15 ±  4%    +245.0%       0.50 ±  6%  perf-stat.overall.ipc
     89.44            +1.8       91.23        perf-stat.overall.node-load-miss-rate%
      7276           -20.3%       5801        perf-stat.overall.path-length
 3.401e+09 ±  4%    +239.6%  1.155e+10 ±  6%  perf-stat.ps.branch-instructions
  24776511 ±  3%    +181.3%   69696643 ±  4%  perf-stat.ps.branch-misses
  97040508 ± 12%     -58.3%   40436979 ± 14%  perf-stat.ps.cache-misses
 2.945e+08 ±  2%     -48.3%  1.522e+08 ± 11%  perf-stat.ps.cache-references
 6.141e+09 ±  4%    +243.2%  2.108e+10 ±  6%  perf-stat.ps.dTLB-loads
 3.959e+09 ±  4%    +257.0%  1.414e+10 ±  6%  perf-stat.ps.dTLB-stores
  25562318 ±  3%    +126.2%   57814503 ±  5%  perf-stat.ps.iTLB-load-misses
   4826722           +17.7%    5679789 ±  2%  perf-stat.ps.iTLB-loads
 1.953e+10 ±  4%    +243.3%  6.704e+10 ±  6%  perf-stat.ps.instructions
  12624818 ±  8%     -56.3%    5522769 ± 12%  perf-stat.ps.node-load-misses
   1497174 ± 13%     -64.4%     532776 ± 15%  perf-stat.ps.node-loads
   9912289 ±  8%     -58.8%    4087930 ± 17%  perf-stat.ps.node-store-misses
 3.947e+12 ±  4%    +243.4%  1.355e+13 ±  6%  perf-stat.total.instructions
    290.75 ± 51%     -78.1%      63.75 ±128%  interrupts.CPU17.RES:Rescheduling_interrupts
      6339 ± 25%     -35.3%       4101 ± 52%  interrupts.CPU19.NMI:Non-maskable_interrupts
      6339 ± 25%     -35.3%       4101 ± 52%  interrupts.CPU19.PMI:Performance_monitoring_interrupts
    166.00 ± 46%     -91.6%      14.00 ± 72%  interrupts.CPU2.RES:Rescheduling_interrupts
    429.75 ±  2%     +14.0%     490.00 ± 12%  interrupts.CPU20.CAL:Function_call_interrupts
      6339 ± 25%     -35.3%       4100 ± 52%  interrupts.CPU20.NMI:Non-maskable_interrupts
      6339 ± 25%     -35.3%       4100 ± 52%  interrupts.CPU20.PMI:Performance_monitoring_interrupts
      6338 ± 25%     -31.1%       4364 ± 46%  interrupts.CPU21.NMI:Non-maskable_interrupts
      6338 ± 25%     -31.1%       4364 ± 46%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
      6339 ± 25%     -50.8%       3121 ± 14%  interrupts.CPU23.NMI:Non-maskable_interrupts
      6339 ± 25%     -50.8%       3121 ± 14%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
     68.50 ± 54%    +202.2%     207.00        interrupts.CPU24.RES:Rescheduling_interrupts
      3328 ± 45%     +76.5%       5876 ± 33%  interrupts.CPU25.NMI:Non-maskable_interrupts
      3328 ± 45%     +76.5%       5876 ± 33%  interrupts.CPU25.PMI:Performance_monitoring_interrupts
     39.75 ± 79%    +423.9%     208.25 ±  2%  interrupts.CPU25.RES:Rescheduling_interrupts
      1766 ±112%     -75.2%     438.25 ±  4%  interrupts.CPU27.CAL:Function_call_interrupts
     82.75 ± 49%     -64.0%      29.75 ±122%  interrupts.CPU27.TLB:TLB_shootdowns
    439.50 ±  2%     +74.2%     765.50 ± 38%  interrupts.CPU3.CAL:Function_call_interrupts
    494.25 ±  5%     -10.5%     442.25 ±  5%  interrupts.CPU30.CAL:Function_call_interrupts
     61.00 ±127%    +230.7%     201.75        interrupts.CPU30.RES:Rescheduling_interrupts
     56.50 ±140%    +255.3%     200.75        interrupts.CPU31.RES:Rescheduling_interrupts
      1633 ±123%     -73.3%     435.50 ±  3%  interrupts.CPU32.CAL:Function_call_interrupts
     56.75 ±141%    +252.4%     200.00        interrupts.CPU33.RES:Rescheduling_interrupts
     56.75 ±139%    +227.3%     185.75 ± 12%  interrupts.CPU34.RES:Rescheduling_interrupts
     56.50 ±142%    +185.8%     161.50 ± 39%  interrupts.CPU35.RES:Rescheduling_interrupts
     79.75 ± 36%     -56.4%      34.75 ± 91%  interrupts.CPU36.TLB:TLB_shootdowns
     65.25 ±117%    +176.6%     180.50 ± 30%  interrupts.CPU39.RES:Rescheduling_interrupts
     78.50 ± 44%     -54.1%      36.00 ± 83%  interrupts.CPU39.TLB:TLB_shootdowns
     62.25 ±120%    +151.8%     156.75 ± 45%  interrupts.CPU43.RES:Rescheduling_interrupts
     86.00 ± 45%     -54.4%      39.25 ± 97%  interrupts.CPU43.TLB:TLB_shootdowns
    487.50 ± 10%     -10.8%     434.75 ±  3%  interrupts.CPU44.CAL:Function_call_interrupts
     93.00 ± 46%     -64.5%      33.00 ±119%  interrupts.CPU46.TLB:TLB_shootdowns
      7330 ± 12%     -41.4%       4293 ± 33%  interrupts.CPU5.NMI:Non-maskable_interrupts
      7330 ± 12%     -41.4%       4293 ± 33%  interrupts.CPU5.PMI:Performance_monitoring_interrupts
    169.25 ± 36%     -90.8%      15.50 ± 71%  interrupts.CPU5.RES:Rescheduling_interrupts
      3285 ± 45%     +92.3%       6318 ± 25%  interrupts.CPU57.NMI:Non-maskable_interrupts
      3285 ± 45%     +92.3%       6318 ± 25%  interrupts.CPU57.PMI:Performance_monitoring_interrupts
      7323 ± 12%     -51.2%       3572 ± 34%  interrupts.CPU6.NMI:Non-maskable_interrupts
      7323 ± 12%     -51.2%       3572 ± 34%  interrupts.CPU6.PMI:Performance_monitoring_interrupts
     32.50 ± 78%    +580.0%     221.00 ±125%  interrupts.CPU63.TLB:TLB_shootdowns
      7323 ± 12%     -41.5%       4286 ± 33%  interrupts.CPU7.NMI:Non-maskable_interrupts
      7323 ± 12%     -41.5%       4286 ± 33%  interrupts.CPU7.PMI:Performance_monitoring_interrupts
    175.50 ± 27%     -80.3%      34.50 ± 37%  interrupts.CPU72.RES:Rescheduling_interrupts
     93.25 ± 45%     -57.1%      40.00 ±115%  interrupts.CPU72.TLB:TLB_shootdowns
      7868           -45.2%       4311 ± 32%  interrupts.CPU73.NMI:Non-maskable_interrupts
      7868           -45.2%       4311 ± 32%  interrupts.CPU73.PMI:Performance_monitoring_interrupts
      7330 ± 12%     -41.4%       4297 ± 33%  interrupts.CPU75.NMI:Non-maskable_interrupts
      7330 ± 12%     -41.4%       4297 ± 33%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
    163.50 ± 41%     -84.9%      24.75 ±127%  interrupts.CPU77.RES:Rescheduling_interrupts
      7324 ± 12%     -41.4%       4294 ± 33%  interrupts.CPU78.NMI:Non-maskable_interrupts
      7324 ± 12%     -41.4%       4294 ± 33%  interrupts.CPU78.PMI:Performance_monitoring_interrupts
    161.25 ± 45%     -91.5%      13.75 ±109%  interrupts.CPU80.RES:Rescheduling_interrupts
      7325 ± 12%     -41.5%       4287 ± 33%  interrupts.CPU81.NMI:Non-maskable_interrupts
      7325 ± 12%     -41.5%       4287 ± 33%  interrupts.CPU81.PMI:Performance_monitoring_interrupts
     95.00 ± 50%     -59.7%      38.25 ±117%  interrupts.CPU92.TLB:TLB_shootdowns
      8991 ±108%    +161.3%      23491 ± 19%  softirqs.CPU2.SCHED
     67870 ±  5%      +8.4%      73546 ±  2%  softirqs.CPU2.TIMER
     23244 ± 25%     -88.7%       2626        softirqs.CPU24.SCHED
     83405 ± 17%     -23.4%      63886 ±  2%  softirqs.CPU24.TIMER
     23963 ± 12%     -88.4%       2784 ±  2%  softirqs.CPU25.SCHED
     83623 ± 19%     -23.5%      63968 ±  2%  softirqs.CPU25.TIMER
      4276 ±  5%     +97.6%       8448 ± 13%  softirqs.CPU26.RCU
     14129 ± 74%     -81.4%       2631 ±  4%  softirqs.CPU26.SCHED
     17203 ± 53%     -70.0%       5163 ± 89%  softirqs.CPU27.SCHED
     70966 ±  5%     -10.4%      63583 ±  5%  softirqs.CPU27.TIMER
     19121 ± 47%     -74.6%       4863 ± 88%  softirqs.CPU28.SCHED
     72354 ±  6%     -10.4%      64858 ±  2%  softirqs.CPU29.TIMER
      9275 ±101%    +151.3%      23309 ± 19%  softirqs.CPU3.SCHED
     19928 ± 46%     -84.7%       3042 ±  7%  softirqs.CPU30.SCHED
     72106 ±  7%     -11.8%      63632 ±  2%  softirqs.CPU30.TIMER
     19845 ± 45%     -84.7%       3030 ±  6%  softirqs.CPU31.SCHED
     72345 ±  6%     -10.8%      64523        softirqs.CPU31.TIMER
     19559 ± 47%     -84.2%       3094 ±  8%  softirqs.CPU32.SCHED
     19689 ± 47%     -83.0%       3352 ±  2%  softirqs.CPU33.SCHED
     71873 ±  7%      -9.4%      65131        softirqs.CPU33.TIMER
     16286 ± 48%     -63.6%       5928 ± 76%  softirqs.CPU34.SCHED
     11784 ± 76%    +118.7%      25776        softirqs.CPU4.SCHED
     70606 ±  5%      -9.8%      63713        softirqs.CPU48.TIMER
     71122 ±  4%     -10.2%      63890 ±  5%  softirqs.CPU49.TIMER
      8863 ±108%    +190.0%      25702        softirqs.CPU5.SCHED
     20026 ± 49%     -87.1%       2587 ±  5%  softirqs.CPU50.SCHED
     70832 ±  4%     -10.7%      63286        softirqs.CPU50.TIMER
     18874 ± 50%     -86.1%       2631 ±  4%  softirqs.CPU51.SCHED
     71694 ±  5%     -13.7%      61847 ±  3%  softirqs.CPU51.TIMER
     17403 ± 56%     -85.3%       2560        softirqs.CPU52.SCHED
     71831 ±  8%     -11.0%      63942 ±  3%  softirqs.CPU52.TIMER
     20860 ± 49%     -87.1%       2689 ±  2%  softirqs.CPU53.SCHED
     81014 ± 19%     -23.0%      62345 ±  2%  softirqs.CPU53.TIMER
     20180 ± 50%     -87.7%       2480 ±  9%  softirqs.CPU54.SCHED
     71917 ±  5%     -12.3%      63071        softirqs.CPU54.TIMER
     74057 ± 12%     -16.4%      61946 ±  2%  softirqs.CPU55.TIMER
     20135 ± 50%     -86.8%       2667 ±  4%  softirqs.CPU56.SCHED
     73377 ±  7%     -13.4%      63523 ±  3%  softirqs.CPU56.TIMER
     23019 ± 19%     -64.3%       8226 ±118%  softirqs.CPU57.SCHED
     75540 ±  5%     -14.6%      64485 ±  4%  softirqs.CPU57.TIMER
     20267 ± 49%     -59.4%       8236 ±118%  softirqs.CPU58.SCHED
     72755 ±  7%     -11.1%      64699 ±  3%  softirqs.CPU58.TIMER
     72871 ±  7%     -10.9%      64896 ±  4%  softirqs.CPU59.TIMER
      8781 ±108%    +192.7%      25703        softirqs.CPU6.SCHED
     72683 ±  7%     -10.9%      64778 ±  4%  softirqs.CPU60.TIMER
     72665 ±  8%     -11.1%      64612 ±  4%  softirqs.CPU61.TIMER
     72308 ±  5%     -10.1%      64991 ±  6%  softirqs.CPU65.TIMER
     20301 ± 49%     -58.5%       8419 ±118%  softirqs.CPU66.SCHED
     11380 ± 79%    +123.7%      25453        softirqs.CPU7.SCHED
      4027 ±  5%    +111.8%       8530 ± 32%  softirqs.CPU71.RCU
      5823 ± 96%    +357.6%      26649        softirqs.CPU72.SCHED
      2461 ± 12%    +952.7%      25914        softirqs.CPU73.SCHED
      8475 ±117%    +176.7%      23452 ± 20%  softirqs.CPU75.SCHED
      8462 ±116%    +178.9%      23601 ± 19%  softirqs.CPU76.SCHED
      8459 ±117%    +211.7%      26366 ±  2%  softirqs.CPU77.SCHED
      8511 ±117%    +205.5%      26002 ±  2%  softirqs.CPU79.SCHED
      8854 ±105%    +186.2%      25341 ±  2%  softirqs.CPU8.SCHED
      8450 ±116%    +215.1%      26629 ±  2%  softirqs.CPU80.SCHED
      8496 ±117%    +206.5%      26038        softirqs.CPU81.SCHED
      4144 ±  6%     +83.5%       7603 ± 21%  softirqs.CPU82.RCU
      8429 ±117%    +179.7%      23575 ± 18%  softirqs.CPU82.SCHED
      8393 ±117%    +138.6%      20028 ± 30%  softirqs.CPU84.SCHED
      8422 ±116%    +140.8%      20281 ± 28%  softirqs.CPU92.SCHED
      4021 ±  7%     +93.4%       7778 ± 29%  softirqs.CPU95.RCU
    415214           +63.4%     678631 ±  6%  softirqs.RCU
     38.06 ±  7%     -38.1        0.00        perf-profile.calltrace.cycles-pp.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
     36.28 ±  7%     -36.3        0.00        perf-profile.calltrace.cycles-pp.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw
     36.07 ±  7%     -36.1        0.00        perf-profile.calltrace.cycles-pp.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply
     63.15 ±  7%     -31.9       31.29 ± 12%  perf-profile.calltrace.cycles-pp.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.new_sync_write
     11.15 ±  9%     -11.1        0.00        perf-profile.calltrace.cycles-pp.__ext4_journal_stop.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
     10.95 ±  9%     -11.0        0.00        perf-profile.calltrace.cycles-pp.jbd2_journal_stop.__ext4_journal_stop.ext4_iomap_begin.iomap_apply.dax_iomap_rw
      8.81 ±  7%      -8.8        0.00        perf-profile.calltrace.cycles-pp.stop_this_handle.jbd2_journal_stop.__ext4_journal_stop.ext4_iomap_begin.iomap_apply
      8.49 ±  6%      -8.5        0.00        perf-profile.calltrace.cycles-pp.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin
      5.93 ±  6%      -5.9        0.00        perf-profile.calltrace.cycles-pp._raw_read_lock.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin
      0.99 ±  9%      +0.4        1.44 ± 19%  perf-profile.calltrace.cycles-pp.ext4_write_checks.ext4_file_write_iter.new_sync_write.vfs_write.ksys_write
      0.00            +1.0        0.96 ± 17%  perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
      0.00            +1.1        1.10 ± 20%  perf-profile.calltrace.cycles-pp.__check_block_validity.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw
      0.00            +2.2        2.19 ± 17%  perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
      1.94 ± 16%      +6.6        8.49 ± 13%  perf-profile.calltrace.cycles-pp.__copy_user_nocache.__copy_user_flushcache._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply
      1.95 ± 16%      +6.6        8.54 ± 13%  perf-profile.calltrace.cycles-pp.__copy_user_flushcache._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply.dax_iomap_rw
      1.99 ± 16%      +6.7        8.70 ± 13%  perf-profile.calltrace.cycles-pp._copy_from_iter_flushcache.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter
      7.86 ± 11%     +12.8       20.70 ± 13%  perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin.iomap_apply
      1.73 ± 15%     +13.7       15.42 ± 27%  perf-profile.calltrace.cycles-pp.__srcu_read_unlock.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter
     12.86 ±  7%     +14.8       27.69 ± 13%  perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin.iomap_apply.dax_iomap_rw
     13.14 ±  7%     +15.7       28.81 ± 13%  perf-profile.calltrace.cycles-pp.ext4_set_iomap.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
      3.87 ± 14%     +20.9       24.76 ± 20%  perf-profile.calltrace.cycles-pp.dax_iomap_actor.iomap_apply.dax_iomap_rw.ext4_file_write_iter.new_sync_write
     38.74 ±  7%     -38.1        0.65 ±  8%  perf-profile.children.cycles-pp.__ext4_journal_start_sb
     36.93 ±  7%     -36.3        0.61 ±  7%  perf-profile.children.cycles-pp.jbd2__journal_start
     36.73 ±  7%     -36.1        0.60 ±  7%  perf-profile.children.cycles-pp.start_this_handle
     63.15 ±  7%     -31.9       31.30 ± 12%  perf-profile.children.cycles-pp.ext4_iomap_begin
     11.21 ±  9%     -11.2        0.01 ±173%  perf-profile.children.cycles-pp.__ext4_journal_stop
     11.01 ±  9%     -11.0        0.01 ±173%  perf-profile.children.cycles-pp.jbd2_journal_stop
      8.83 ±  7%      -8.8        0.00        perf-profile.children.cycles-pp.stop_this_handle
      8.64 ±  7%      -8.5        0.14 ±  8%  perf-profile.children.cycles-pp.add_transaction_credits
      0.00            +0.1        0.05 ±  8%  perf-profile.children.cycles-pp.timestamp_truncate
      0.00            +0.1        0.06 ± 15%  perf-profile.children.cycles-pp.pmem_dax_direct_access
      0.00            +0.1        0.06 ± 14%  perf-profile.children.cycles-pp.fsnotify_parent
      0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp.file_modified
      0.00            +0.1        0.07 ± 12%  perf-profile.children.cycles-pp.aa_file_perm
      0.00            +0.1        0.07 ± 12%  perf-profile.children.cycles-pp.apparmor_file_permission
      0.00            +0.1        0.07 ± 15%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.00            +0.1        0.08 ± 10%  perf-profile.children.cycles-pp.__pmem_direct_access
      0.00            +0.1        0.09 ±  9%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.00            +0.1        0.09 ±  7%  perf-profile.children.cycles-pp.__might_sleep
      0.00            +0.1        0.09 ± 13%  perf-profile.children.cycles-pp._cond_resched
      0.00            +0.1        0.10 ± 12%  perf-profile.children.cycles-pp.___might_sleep
      0.00            +0.1        0.12 ± 12%  perf-profile.children.cycles-pp.fsnotify
      0.04 ± 57%      +0.1        0.18 ±  7%  perf-profile.children.cycles-pp.__fdget_pos
      0.00            +0.1        0.14 ±  7%  perf-profile.children.cycles-pp.__fget_light
      0.00            +0.2        0.15 ± 10%  perf-profile.children.cycles-pp.up_write
      0.01 ±173%      +0.2        0.17 ±  6%  perf-profile.children.cycles-pp.current_time
      0.00            +0.2        0.16 ± 11%  perf-profile.children.cycles-pp.dax_direct_access
      0.06 ±  7%      +0.2        0.23 ± 11%  perf-profile.children.cycles-pp.__sb_start_write
      0.00            +0.2        0.18 ± 72%  perf-profile.children.cycles-pp.generic_write_checks
      0.04 ± 57%      +0.2        0.22 ±  8%  perf-profile.children.cycles-pp.__srcu_read_lock
      0.06 ±  7%      +0.2        0.26 ± 11%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.06            +0.2        0.26 ± 14%  perf-profile.children.cycles-pp.common_file_perm
      0.05 ±  9%      +0.2        0.28 ± 11%  perf-profile.children.cycles-pp.down_write
      0.00            +0.2        0.23 ± 60%  perf-profile.children.cycles-pp.ext4_generic_write_checks
      0.09 ±  5%      +0.3        0.34 ± 13%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.09 ±  5%      +0.3        0.37 ± 14%  perf-profile.children.cycles-pp.security_file_permission
      0.10 ±  8%      +0.4        0.54 ± 25%  perf-profile.children.cycles-pp.ext4_inode_block_valid
      0.99 ±  9%      +0.4        1.44 ± 19%  perf-profile.children.cycles-pp.ext4_write_checks
      0.04 ± 57%      +0.5        0.51 ± 31%  perf-profile.children.cycles-pp.percpu_counter_add_batch
      0.12 ±173%      +0.5        0.65 ± 42%  perf-profile.children.cycles-pp.start_kernel
      0.17 ± 11%      +0.8        0.96 ± 17%  perf-profile.children.cycles-pp.ext4_es_lookup_extent
      0.19 ± 14%      +0.9        1.11 ± 20%  perf-profile.children.cycles-pp.__check_block_validity
      0.39 ± 12%      +1.8        2.20 ± 17%  perf-profile.children.cycles-pp.ext4_map_blocks
      1.94 ± 16%      +6.6        8.50 ± 13%  perf-profile.children.cycles-pp.__copy_user_nocache
      1.95 ± 16%      +6.6        8.54 ± 13%  perf-profile.children.cycles-pp.__copy_user_flushcache
      1.99 ± 16%      +6.7        8.70 ± 13%  perf-profile.children.cycles-pp._copy_from_iter_flushcache
     13.96 ±  9%      +7.1       21.04 ± 13%  perf-profile.children.cycles-pp._raw_read_lock
      1.73 ± 15%     +13.7       15.43 ± 27%  perf-profile.children.cycles-pp.__srcu_read_unlock
     12.87 ±  7%     +14.8       27.70 ± 13%  perf-profile.children.cycles-pp.jbd2_transaction_committed
     13.15 ±  7%     +15.7       28.82 ± 13%  perf-profile.children.cycles-pp.ext4_set_iomap
      3.88 ± 14%     +20.9       24.78 ± 20%  perf-profile.children.cycles-pp.dax_iomap_actor
     21.95 ±  7%     -21.6        0.35 ±  8%  perf-profile.self.cycles-pp.start_this_handle
      8.79 ±  7%      -8.8        0.00        perf-profile.self.cycles-pp.stop_this_handle
      8.60 ±  7%      -8.5        0.14 ±  8%  perf-profile.self.cycles-pp.add_transaction_credits
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.current_time
      0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.aa_file_perm
      0.00            +0.1        0.06 ± 20%  perf-profile.self.cycles-pp.apparmor_file_permission
      0.00            +0.1        0.07 ± 20%  perf-profile.self.cycles-pp.generic_write_checks
      0.00            +0.1        0.07 ± 15%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.00            +0.1        0.08 ±  6%  perf-profile.self.cycles-pp.__might_sleep
      0.00            +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.__pmem_direct_access
      0.00            +0.1        0.08 ± 13%  perf-profile.self.cycles-pp.__sb_start_write
      0.00            +0.1        0.09 ± 13%  perf-profile.self.cycles-pp.ksys_write
      0.00            +0.1        0.10 ± 12%  perf-profile.self.cycles-pp.___might_sleep
      0.00            +0.1        0.11 ± 16%  perf-profile.self.cycles-pp.dax_iomap_rw
      0.00            +0.1        0.11 ± 11%  perf-profile.self.cycles-pp.fsnotify
      0.00            +0.1        0.12 ± 67%  perf-profile.self.cycles-pp.file_update_time
      0.00            +0.1        0.13 ±  8%  perf-profile.self.cycles-pp.__fget_light
      0.00            +0.1        0.13 ±  9%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00            +0.1        0.14 ± 15%  perf-profile.self.cycles-pp.ext4_map_blocks
      0.00            +0.2        0.15 ± 12%  perf-profile.self.cycles-pp._copy_from_iter_flushcache
      0.04 ± 57%      +0.2        0.19 ± 15%  perf-profile.self.cycles-pp.common_file_perm
      0.00            +0.2        0.15 ± 10%  perf-profile.self.cycles-pp.up_write
      0.00            +0.2        0.17 ± 10%  perf-profile.self.cycles-pp.down_write
      0.04 ± 57%      +0.2        0.21 ± 10%  perf-profile.self.cycles-pp.dax_iomap_actor
      0.01 ±173%      +0.2        0.20 ± 11%  perf-profile.self.cycles-pp.vfs_write
      0.00            +0.2        0.18 ± 15%  perf-profile.self.cycles-pp.do_syscall_64
      0.08 ±  5%      +0.2        0.28 ±  8%  perf-profile.self.cycles-pp.ext4_iomap_begin
      0.06 ± 15%      +0.2        0.25 ± 11%  perf-profile.self.cycles-pp.ext4_es_lookup_extent
      0.06 ±  7%      +0.2        0.26 ± 11%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.01 ±173%      +0.2        0.22 ± 10%  perf-profile.self.cycles-pp.__srcu_read_lock
      0.09 ±  5%      +0.3        0.34 ± 13%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.00            +0.3        0.31 ± 80%  perf-profile.self.cycles-pp.new_sync_write
      0.11 ±  7%      +0.3        0.45 ±  9%  perf-profile.self.cycles-pp.iomap_apply
      0.04 ± 57%      +0.4        0.47 ± 32%  perf-profile.self.cycles-pp.percpu_counter_add_batch
      0.10 ±  8%      +0.4        0.53 ± 25%  perf-profile.self.cycles-pp.ext4_inode_block_valid
      0.25 ± 12%      +0.5        0.70 ± 25%  perf-profile.self.cycles-pp.ext4_file_write_iter
      0.09 ± 27%      +0.5        0.56 ± 21%  perf-profile.self.cycles-pp.__check_block_validity
      0.27 ± 18%      +0.8        1.11 ± 28%  perf-profile.self.cycles-pp.ext4_set_iomap
      4.99 ±  6%      +2.0        6.95 ± 14%  perf-profile.self.cycles-pp.jbd2_transaction_committed
      1.93 ± 16%      +6.5        8.46 ± 13%  perf-profile.self.cycles-pp.__copy_user_nocache
     13.90 ±  9%      +7.0       20.92 ± 13%  perf-profile.self.cycles-pp._raw_read_lock
      1.73 ± 15%     +13.6       15.35 ± 27%  perf-profile.self.cycles-pp.__srcu_read_unlock


                                                                                
                                  fio.write_bw_MBps                             
                                                                                
  60000 +-------------------------------------------------------------------+   
  55000 |-+    O                                                            |   
        |    O        O O                                                   |   
  50000 |-+                 O           O O    O   O    O                   |   
  45000 |-+        O      O      O O O      O        O                      |   
  40000 |-O      O             O                 O        O                 |   
  35000 |-+                                                                 |   
        |                                                                   |   
  30000 |-+                                                                 |   
  25000 |-+                                                                 |   
  20000 |-+                                                                 |   
  15000 |-+                                                                 |   
        |.+..+.+.+.+..+.+.+.+..+.+.+.  .+. .+..+.+.+.+..+.+.+.    .+.    .+.|   
  10000 |-+                          +.   +                   +..+   +.+.   |   
   5000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                    fio.write_iops                              
                                                                                
  1.6e+07 +-----------------------------------------------------------------+   
          |      O                                                          |   
  1.4e+07 |-+                                                               |   
          |   O        O O                     O        O                   |   
  1.2e+07 |-+                 O   O      O O        O                       |   
          | O        O      O   O   O  O     O    O   O   O                 |   
    1e+07 |-+      O                                                        |   
          |                                                                 |   
    8e+06 |-+                                                               |   
          |                                                                 |   
    6e+06 |-+                                                               |   
          |                                                                 |   
    4e+06 |-+                                                               |   
          |.+.+..+.+.+.+.+..+.+.+.+.+..+.+.+.+.+..+.+.+.+.+..+.+.+.+.+..+.+.|   
    2e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               fio.write_clat_mean_us                           
                                                                                
  20000 +-------------------------------------------------------------------+   
        |                                                            +.+..  |   
  18000 |-+                         .+..+.+.+..              .+..+. +       |   
  16000 |.+..+. .+.+..+. .+.+.. .+.+           +.+.+.+..+.+.+      +      +.|   
        |      +        +      +                                            |   
  14000 |-+                                                                 |   
  12000 |-+                                                                 |   
        |                                                                   |   
  10000 |-+                                                                 |   
   8000 |-+                                                                 |   
        |                                                                   |   
   6000 |-+                                                                 |   
   4000 |-O      O O      O    O   O O           O        O                 |   
        |    O O      O O   O    O      O O O  O   O O  O                   |   
   2000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.write_clat_90__us                           
                                                                                
  35000 +-------------------------------------------------------------------+   
        |                                                                   |   
  30000 |-+               +            .+. .+..                 .+   +      |   
        |.+..    +.  .+  : +    .+. .+.   +     .+.     +.+.+.+.  : : +  .+.|   
  25000 |-+  +. +  +.  + :  +..+   +           +   +. ..          : :  +.   |   
        |      +        +                            +             +        |   
  20000 |-+                                                                 |   
        |                                                                   |   
  15000 |-+                                                                 |   
        |                                                                   |   
  10000 |-+                                                                 |   
        | O      O O           O     O  O   O    O   O    O                 |   
   5000 |-+  O O      O O O O    O O      O    O   O    O                   |   
        |                                                                   |   
      0 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.write_clat_95__us                           
                                                                                
  40000 +-------------------------------------------------------------------+   
        |                                                                   |   
  35000 |-+                            .+. .+..                      +      |   
        |        +.       +.     +   +.   +             +.      .+   :+     |   
  30000 |.+..   :  +..+  :  +.. + + +          +.+.+   +  +.+.+.  : :  +..+.|   
        |    +. :      + :     +   +                + +           : :       |   
  25000 |-+    +        +                            +             +        |   
        |                                                                   |   
  20000 |-+                                                                 |   
        |                                                                   |   
  15000 |-+                                                                 |   
        |                                                                   |   
  10000 |-+      O O                 O  O   O    O   O    O                 |   
        | O  O        O O O O  O O O      O    O   O    O                   |   
   5000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.latency_4us_                                
                                                                                
  70 +----------------------------------------------------------------------+   
     |                  O                                                   |   
  60 |-+                         O                                          |   
     | O                  O  O           O   O  O O    O O                  |   
  50 |-+      O  O             O    O      O        O                       |   
     |    O O      O                  O                                     |   
  40 |-+              O                                                     |   
     |                                                                      |   
  30 |-+                                                                    |   
     |                                                                      |   
  20 |-+                                                                    |   
     |                                                                      |   
  10 |-+                                                                    |   
     |                                                                      |   
   0 +----------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.latency_50us_                               
                                                                                
  45 +----------------------------------------------------------------------+   
     |                                                               +      |   
  40 |-+                                                       .+    ::     |   
  35 |-+      +         +           +.+..+.+                 .+  :  : :  .+ |   
     | +      :+   +    ::     +. ..        :          +. .+.    :  :  +.  :|   
  30 |+++    :  + + +  : :    :  +          :         :  +        : :      :|   
  25 |-+ +   :   +   + :  +.. :              +..+.+   :           : :       |   
     |    +.:         +      +                     + :             :        |   
  20 |-+    +                                       +              +        |   
  15 |-+                                                                    |   
     |                                                                      |   
  10 |-+                                                                    |   
   5 |-+                                                                    |   
     |                                                                      |   
   0 +----------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                     fio.workload                               
                                                                                
    3e+09 +-----------------------------------------------------------------+   
          |                                                                 |   
          |   O        O O                                                  |   
  2.5e+09 |-+                 O            O   O    O   O                   |   
          |          O            O O    O   O        O                     |   
          | O      O        O   O      O          O       O                 |   
    2e+09 |-+                                                               |   
          |                                                                 |   
  1.5e+09 |-+                                                               |   
          |                                                                 |   
          |                                                                 |   
    1e+09 |-+                                                               |   
          |                                                                 |   
          |. .+..+.+.+. .+..+. .+. .+..                                     |   
    5e+08 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.time.user_time                              
                                                                                
  1100 +--------------------------------------------------------------------+   
       |               O                                                    |   
  1000 |-+  O        O                        O        O                    |   
   900 |-+                  O   O      O  O        O                        |   
       |           O     O         O O      O        O                      |   
   800 |-O      O             O                 O         O                 |   
   700 |-+                                                                  |   
       |                                                                    |   
   600 |-+                                                                  |   
   500 |-+                                                                  |   
       |                                                                    |   
   400 |-+  +                                                               |   
   300 |-+.. +                                                              |   
       |.+    +.+..+.+.+.+..+.+.+..+.        .+.+..+.+.+..+.+.  .+.+.    .+.|   
   200 +--------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               fio.time.system_time                             
                                                                                
  9400 +--------------------------------------------------------------------+   
  9300 |-+                          .+.+..+.+.               .+..   .+.+..  |   
       |.+..  +.+..+.+.+.+..+.+.+..+          +.+..+.+.+..+.+    +.+      +.|   
  9200 |-+   +                                                              |   
  9100 |-+  +                                                               |   
       |                                                                    |   
  9000 |-+                                                                  |   
  8900 |-+                                                                  |   
  8800 |-+                                                                  |   
       |        O                                                           |   
  8700 |-O         O     O    O    O O      O   O    O    O                 |   
  8600 |-+                  O   O      O  O        O                        |   
       |    O        O                        O        O                    |   
  8500 |-+             O                                                    |   
  8400 +--------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                         fio.time.voluntary_context_switches                    
                                                                                
  24500 +-------------------------------------------------------------------+   
        |                               +   +                               |   
  24000 |-+                            + : : +                              |   
        |: +   +      +.              +  : :  +                    +        |   
        |:  + + +   ..  +.+.    .+. .+    +    +.      .+.        + +       |   
  23500 |-+  +   +.+        +..+   +             +.+.+.   +.+.+..+   +.+..+.|   
        |                                                                   |   
  23000 |-+                                                                 |   
        |                                                                   |   
  22500 |-+                                                                 |   
        |                                 O        O                        |   
        |        O                 O                                        |   
  22000 |-+    O   O    O O    O     O           O        O                 |   
        | O  O        O     O    O      O   O  O     O  O                   |   
  21500 +-------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.8.0-rc4-00047-g4e8fc10115a69" of type "text/plain" (169438 bytes)

View attachment "job-script" of type "text/plain" (8467 bytes)

View attachment "job.yaml" of type "text/plain" (5817 bytes)

View attachment "reproduce" of type "text/plain" (923 bytes)
