Message-ID: <20220325071501.GA8478@xsang-OptiPlex-9020>
Date:   Fri, 25 Mar 2022 15:15:01 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Filipe Manana <fdmanana@...e.com>
Cc:     lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...ux.intel.com,
        fengwei.yin@...el.com, LKML <linux-kernel@...r.kernel.org>,
        linux-btrfs@...r.kernel.org
Subject: [btrfs]  a052d3d1b6:  fio.write_iops 3241.7% improvement



Greetings,

FYI, we noticed a 3241.7% improvement of fio.write_iops due to commit:


commit: a052d3d1b6c77f193f7051cd5d4b08138fd57332 ("btrfs: only reserve the needed data space amount during fallocate")
https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git misc-next

in testcase: fio-basic
on test machine: 96 threads 2 sockets Ice Lake with 256G memory
with the following parameters (an approximately equivalent fio invocation is sketched below, after the test description):

	runtime: 300s
	disk: 1HDD
	fs: btrfs
	nr_task: 100%
	test_size: 128G
	rw: randwrite
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance
	ucode: 0xb000280

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
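
For orientation, the parameters above correspond roughly to an fio invocation
like the one below (job name, mount point, and the per-job size split are
illustrative; the authoritative command line is generated from the attached
job.yaml / job-script):

        fio --name=randwrite-falloc --directory=/mnt/btrfs --ioengine=falloc \
            --rw=randwrite --bs=4k --size=128G --numjobs=96 \
            --runtime=300 --time_based --group_reporting

Note that ioengine=falloc makes fio issue fallocate() calls instead of regular
writes, which is why a change to fallocate's data space reservation shows up
directly in fio.write_iops.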





Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.
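
The reproduction also needs a kernel built from the commit under test; a
minimal sketch of fetching it, assuming the cgit URL quoted above also works
as a git fetch URL (substitute the matching pub/scm clone path on
git.kernel.org if it does not):

        # run from an existing Linux kernel source checkout
        git remote add fdmanana https://git.kernel.org/cgit/linux/kernel/git/fdmanana/linux.git
        git fetch fdmanana misc-next
        git checkout a052d3d1b6c77f193f7051cd5d4b08138fd57332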

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
  4k/gcc-9/performance/1HDD/btrfs/falloc/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/300s/randwrite/lkp-icl-2sp1/128G/fio-basic/0xb000280

commit: 
  3d83c164a0 ("btrfs: move common inode creation code into btrfs_create_new_inode()")
  a052d3d1b6 ("btrfs: only reserve the needed data space amount during fallocate")

3d83c164a02f65c3 a052d3d1b6c77f193f7051cd5d4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     99.68           -99.6        0.04 ± 49%  fio.latency_100us%
      0.07 ± 65%      +0.1        0.16 ± 16%  fio.latency_10us%
      0.02 ± 30%      +0.1        0.11 ± 36%  fio.latency_20us%
      0.04 ± 28%      -0.0        0.01        fio.latency_250us%
      0.01 ± 29%     +69.5       69.56 ±  3%  fio.latency_2us%
      0.02 ± 40%      +0.2        0.21 ± 64%  fio.latency_4us%
     28.60 ±  2%     -95.5%       1.28 ± 10%  fio.time.elapsed_time
     28.60 ±  2%     -95.5%       1.28 ± 10%  fio.time.elapsed_time.max
     16835 ±  3%     -96.6%     571.33 ± 13%  fio.time.involuntary_context_switches
     16349 ±  2%     -10.9%      14569        fio.time.minor_page_faults
      9381           -46.1%       5060 ± 10%  fio.time.percent_of_cpu_this_job_got
      2658 ±  2%     -98.8%      32.01        fio.time.system_time
     25.08 ±  5%     +28.4%      32.19 ±  4%  fio.time.user_time
      3616 ±  2%     -76.6%     846.00 ±  5%  fio.time.voluntary_context_switches
      4651 ±  2%   +3241.7%     155452 ± 12%  fio.write_bw_MBps
     85333 ±  2%     -98.5%       1314        fio.write_clat_90%_us
     87210           -98.4%       1397        fio.write_clat_95%_us
     90453           -98.2%       1664 ±  8%  fio.write_clat_99%_us
     80075 ±  2%     -98.5%       1198        fio.write_clat_mean_us
     44839 ± 22%     -53.9%      20667 ± 10%  fio.write_clat_stddev
   1190883 ±  2%   +3241.7%   39795840 ± 12%  fio.write_iops
    207365 ± 19%     -42.9%     118345 ±  4%  numa-numastat.node1.numa_hit
    530.96           -87.7%      65.31 ±223%  pmeter.Average_Active_Power
     80.02 ±  4%     -38.0%      49.62 ±  4%  uptime.boot
     12.79 ± 14%    +545.5%      82.54 ±  2%  iostat.cpu.idle
     86.29 ±  2%     -88.9%       9.55 ± 10%  iostat.cpu.system
      0.91 ±  4%    +771.1%       7.90 ± 13%  iostat.cpu.user
      7.25 ± 28%     +64.1       71.35 ±  8%  mpstat.cpu.all.idle%
      0.75 ±  5%      +0.3        1.05 ± 24%  mpstat.cpu.all.irq%
      0.01 ± 40%      +0.0        0.04 ± 36%  mpstat.cpu.all.soft%
     91.06 ±  2%     -77.0       14.08 ± 21%  mpstat.cpu.all.sys%
      0.93 ±  3%     +12.6       13.48 ± 23%  mpstat.cpu.all.usr%
     12.17 ± 16%    +574.0%      82.00 ±  2%  vmstat.cpu.id
     85.50           -89.1%       9.33 ± 13%  vmstat.cpu.sy
      1186 ±  2%    -100.0%       0.00        vmstat.io.bo
     83.83 ±  2%     -80.1%      16.67 ± 33%  vmstat.procs.r
      2723 ±  2%    +180.4%       7637 ±  6%  vmstat.system.cs
    184148           -25.1%     137836 ±  5%  vmstat.system.in
     10972 ±  8%     +33.5%      14652 ±  8%  numa-vmstat.node0.nr_kernel_stack
      2566 ± 36%     +99.6%       5123 ± 13%  numa-vmstat.node0.nr_page_table_pages
    305.83 ± 56%     -93.8%      18.83 ±218%  numa-vmstat.node1.nr_inactive_file
      1137 ± 74%     -58.5%     472.67 ± 86%  numa-vmstat.node1.nr_page_table_pages
      7671 ± 23%     -35.3%       4967 ± 13%  numa-vmstat.node1.nr_slab_reclaimable
     27154 ± 12%     -20.8%      21509 ±  9%  numa-vmstat.node1.nr_slab_unreclaimable
    305.83 ± 56%     -93.8%      18.83 ±218%  numa-vmstat.node1.nr_zone_inactive_file
     44723 ± 18%     -50.4%      22163 ± 46%  numa-meminfo.node0.AnonHugePages
     10975 ±  8%     +33.7%      14674 ±  8%  numa-meminfo.node0.KernelStack
     10286 ± 36%     +99.5%      20522 ± 13%  numa-meminfo.node0.PageTables
      1921 ± 25%     -46.9%       1021 ± 36%  numa-meminfo.node1.Active
      1226 ± 56%     -93.8%      75.67 ±218%  numa-meminfo.node1.Inactive(file)
     30687 ± 23%     -35.2%      19870 ± 13%  numa-meminfo.node1.KReclaimable
      4537 ± 74%     -58.3%       1893 ± 85%  numa-meminfo.node1.PageTables
     30687 ± 23%     -35.2%      19870 ± 13%  numa-meminfo.node1.SReclaimable
    108615 ± 12%     -20.8%      86038 ±  9%  numa-meminfo.node1.SUnreclaim
    139303 ± 13%     -24.0%     105909 ±  9%  numa-meminfo.node1.Slab
      3647           -37.6%       2275 ± 12%  meminfo.Active
      3419           -39.9%       2056 ± 13%  meminfo.Active(anon)
     58547 ±  3%     -41.6%      34178 ±  4%  meminfo.AnonHugePages
    330276           +34.8%     445049 ±  4%  meminfo.AnonPages
   3798478 ±  2%     -65.4%    1314197 ± 16%  meminfo.Committed_AS
    366737           +27.7%     468352 ±  4%  meminfo.Inactive
    364992           +28.2%     467743 ±  4%  meminfo.Inactive(anon)
     20396           +11.8%      22795 ±  2%  meminfo.KernelStack
     55354           -21.6%      43382        meminfo.Mapped
     14784 ±  3%     +52.0%      22466 ±  9%  meminfo.PageTables
     38638           -34.7%      25212 ±  2%  meminfo.Shmem
      2939 ±  2%     -77.0%     676.67 ± 13%  turbostat.Avg_MHz
     91.92 ±  2%     -66.3       25.58 ± 13%  turbostat.Busy%
      3196           -16.1%       2681        turbostat.Bzy_MHz
      6.41 ± 41%     +39.4       45.81 ± 34%  turbostat.C1E%
      1.70 ±128%     +26.1       27.81 ± 49%  turbostat.C6%
      6.70 ± 37%    +694.4%      53.20 ± 29%  turbostat.CPU%c1
      1.39 ±147%   +1428.2%      21.22 ± 67%  turbostat.CPU%c6
     59.33           -10.4%      53.17 ±  3%  turbostat.CoreTmp
   5954076 ±  3%     -89.3%     638209 ± 16%  turbostat.IRQ
     59.67           -10.6%      53.33 ±  2%  turbostat.PkgTmp
    347.98           -15.4%     294.40 ±  4%  turbostat.PkgWatt
     14.31 ± 29%      -5.8        8.52 ±142%  perf-profile.calltrace.cycles-pp._dl_catch_error
     10.95 ± 75%      -5.4        5.56 ±141%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._dl_catch_error
     10.95 ± 75%      -5.4        5.56 ±141%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._dl_catch_error
      5.22 ±100%      -5.2        0.00        perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      5.22 ±100%      -5.2        0.00        perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__x64_sys_exit_group
      7.00 ±111%      -4.6        2.44 ±147%  perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.mmput.do_exit.do_group_exit
      7.00 ±111%      -4.6        2.44 ±147%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.mmput.do_exit
      5.48 ±113%      -3.0        2.44 ±147%  perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.exit_mmap.mmput
     14.31 ± 29%      -6.4        7.96 ±144%  perf-profile.children.cycles-pp._dl_catch_error
      4.89 ±103%      -4.9        0.00        perf-profile.children.cycles-pp.release_pages
      7.00 ±111%      -4.6        2.44 ±147%  perf-profile.children.cycles-pp.unmap_vmas
      7.00 ±111%      -4.6        2.44 ±147%  perf-profile.children.cycles-pp.unmap_page_range
      5.98 ±103%      -4.3        1.67 ±223%  perf-profile.children.cycles-pp.walk_component
      5.48 ±113%      -3.0        2.44 ±147%  perf-profile.children.cycles-pp.zap_pte_range
      5.82 ±110%      -0.1        5.68 ±162%  perf-profile.children.cycles-pp.format_decode
      5.82 ±110%      -2.2        3.60 ±144%  perf-profile.self.cycles-pp.format_decode
    852.67           -39.6%     514.67 ± 13%  proc-vmstat.nr_active_anon
     82485           +35.0%     111385 ±  4%  proc-vmstat.nr_anon_pages
      5711 ± 73%     -99.2%      44.17 ± 63%  proc-vmstat.nr_dirtied
     91158           +28.7%     117287 ±  4%  proc-vmstat.nr_inactive_anon
     20403           +11.5%      22754 ±  2%  proc-vmstat.nr_kernel_stack
     14082           -18.9%      11426        proc-vmstat.nr_mapped
      3707 ±  4%     +51.3%       5608 ±  9%  proc-vmstat.nr_page_table_pages
      9647           -32.2%       6544 ±  2%  proc-vmstat.nr_shmem
     27727            -4.2%      26564        proc-vmstat.nr_slab_reclaimable
    852.67           -39.6%     514.67 ± 13%  proc-vmstat.nr_zone_active_anon
     91158           +28.7%     117287 ±  4%  proc-vmstat.nr_zone_inactive_anon
    380393           -16.7%     316892        proc-vmstat.numa_hit
    293361 ±  2%     -21.7%     229776        proc-vmstat.numa_local
    380455           -16.7%     316873        proc-vmstat.pgalloc_normal
    258327           -29.5%     182208 ±  2%  proc-vmstat.pgfault
    278534 ±  3%     -42.0%     161619 ±  3%  proc-vmstat.pgfree
     12395 ±  3%     -37.0%       7812 ±  6%  proc-vmstat.pgreuse
      0.39 ±  6%      +1.1        1.44 ± 35%  perf-stat.i.branch-miss-rate%
  11226008 ±  7%    +440.8%   60709258 ± 21%  perf-stat.i.branch-misses
     42.60           -25.4       17.23 ± 22%  perf-stat.i.cache-miss-rate%
  29893686           -48.5%   15393933 ± 18%  perf-stat.i.cache-misses
      2142          +161.1%       5594 ± 17%  perf-stat.i.context-switches
     20.93           -90.3%       2.04 ± 20%  perf-stat.i.cpi
     96050            +2.4%      98325 ±  3%  perf-stat.i.cpu-clock
 2.914e+11           -79.6%  5.937e+10 ± 46%  perf-stat.i.cpu-cycles
    161.58 ±  2%    +172.4%     440.20 ± 25%  perf-stat.i.cpu-migrations
      9574 ±  2%     -63.2%       3520 ± 38%  perf-stat.i.cycles-between-cache-misses
      0.00 ± 84%      +0.0        0.05 ± 35%  perf-stat.i.dTLB-load-miss-rate%
    107745 ± 44%   +1472.2%    1693986 ± 34%  perf-stat.i.dTLB-load-misses
      0.00 ± 28%      +0.0        0.03 ± 38%  perf-stat.i.dTLB-store-miss-rate%
     48471 ± 12%   +1086.3%     575011 ± 20%  perf-stat.i.dTLB-store-misses
      0.06 ±  7%    +967.1%       0.62 ± 24%  perf-stat.i.ipc
    169.22 ±  3%   +2000.8%       3555 ± 39%  perf-stat.i.major-faults
      3.04           -79.9%       0.61 ± 47%  perf-stat.i.metric.GHz
      4997 ±  3%    +804.7%      45213 ± 28%  perf-stat.i.minor-faults
     95.51           -33.3       62.24 ± 11%  perf-stat.i.node-load-miss-rate%
   5112839           -76.9%    1180256 ± 48%  perf-stat.i.node-load-misses
    157058 ±  2%    +184.7%     447159 ± 27%  perf-stat.i.node-loads
     69.69           -41.4       28.28 ± 33%  perf-stat.i.node-store-miss-rate%
   6517770           -76.5%    1530670 ± 32%  perf-stat.i.node-store-misses
      5166 ±  3%    +843.9%      48766 ± 28%  perf-stat.i.page-faults
     96050            +2.4%      98326 ±  3%  perf-stat.i.task-clock
     42.43 ±  2%     -24.8       17.59 ± 23%  perf-stat.overall.cache-miss-rate%
     21.12           -92.9%       1.49 ± 41%  perf-stat.overall.cpi
      9748           -60.1%       3886 ± 41%  perf-stat.overall.cycles-between-cache-misses
      0.00 ± 46%      +0.0        0.03 ± 89%  perf-stat.overall.dTLB-load-miss-rate%
      0.05         +1526.5%       0.77 ± 31%  perf-stat.overall.ipc
     97.02           -27.6       69.44 ± 19%  perf-stat.overall.node-load-miss-rate%
     70.99           -40.6       30.34 ± 34%  perf-stat.overall.node-store-miss-rate%
     12059           -73.1%       3245 ± 81%  perf-stat.overall.path-length
  10857952 ±  7%    +231.5%   35995103 ± 25%  perf-stat.ps.branch-misses
  28909823           -68.0%    9254837 ± 27%  perf-stat.ps.cache-misses
  68161639 ±  2%     -22.7%   52690454 ± 13%  perf-stat.ps.cache-references
      2071           +54.7%       3204 ±  8%  perf-stat.ps.context-switches
     92860           -37.4%      58162 ± 16%  perf-stat.ps.cpu-clock
 2.818e+11           -86.4%  3.828e+10 ± 58%  perf-stat.ps.cpu-cycles
    156.29 ±  2%     +61.4%     252.29 ± 15%  perf-stat.ps.cpu-migrations
    104205 ± 44%    +820.5%     959180 ± 20%  perf-stat.ps.dTLB-load-misses
     46858 ± 12%    +603.2%     329524 ± 12%  perf-stat.ps.dTLB-store-misses
    162.82 ±  3%   +1085.3%       1929 ± 25%  perf-stat.ps.major-faults
      4827 ±  3%    +425.3%      25360 ± 15%  perf-stat.ps.minor-faults
   4943794           -84.6%     760468 ± 60%  perf-stat.ps.node-load-misses
    151872 ±  2%     +65.9%     251935 ± 14%  perf-stat.ps.node-loads
   6302973           -84.9%     950283 ± 44%  perf-stat.ps.node-store-misses
   2576891 ±  4%     -19.4%    2077892 ±  9%  perf-stat.ps.node-stores
      4990 ±  3%    +446.8%      27289 ± 15%  perf-stat.ps.page-faults
     92860           -37.4%      58162 ± 16%  perf-stat.ps.task-clock
 4.046e+11           -73.1%  1.089e+11 ± 81%  perf-stat.total.instructions
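
For reference, the %change column is the relative delta between the two commit
columns; the headline fio.write_iops figure, for example, follows directly from
the means reported above for 3d83c164a0 and a052d3d1b6:

        $ echo "scale=1; (39795840 - 1190883) * 100 / 1190883" | bc
        3241.7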




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.17.0-rc8-00144-ga052d3d1b6c7" of type "text/plain" (162136 bytes)

View attachment "job-script" of type "text/plain" (8458 bytes)

View attachment "job.yaml" of type "text/plain" (5712 bytes)

View attachment "reproduce" of type "text/plain" (707 bytes)
