Message-ID: <20161220190139.GC23441@yexl-desktop>
Date:   Wed, 21 Dec 2016 03:01:39 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Jan Kara <jack@...e.cz>
Cc:     Theodore Ts'o <tytso@....edu>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...org
Subject: [lkp-developer] [ext4] 96f8ba3dd6: fio.write_bw_MBps +510.6% improvement


Greetings,

FYI, we noticed a +510.6% improvement of fio.write_bw_MBps due to commit:


commit: 96f8ba3dd632aff684cc7c67d9f4af435be0341c ("ext4: avoid split extents for DAX writes")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: ext4
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: randwrite
	bs: 4k
	ioengine: sync
	test_size: 200G
	cpufreq_governor: performance
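
For readers who want to see how these parameters might translate into a fio job, here is a rough sketch. It is not the job file lkp actually generates (that is the attached job.yaml and job-script). It assumes nr_task: 50% means half of the 56 hardware threads, and the mount point and per-job size are hypothetical placeholders. The fio option names themselves (rw, bs, ioengine, runtime, time_based, numjobs, directory, size) are standard fio job-file options.

```python
NR_CPU = 56  # "56 threads" on the test machine

# Parameters copied from the report.
PARAMS = {
    "rw": "randwrite",
    "bs": "4k",
    "ioengine": "sync",
    "runtime": 200,      # seconds, from "runtime: 200s"
    "nr_task": "50%",
}


def nr_jobs(nr_task, nr_cpu):
    """Turn an nr_task value such as '50%' into a job count.

    Assumption: a percentage is taken relative to the number of
    hardware threads on the test machine.
    """
    if nr_task.endswith("%"):
        return nr_cpu * int(nr_task[:-1]) // 100
    return int(nr_task)


def fio_job_file(params, directory, size_per_job, nr_cpu=NR_CPU):
    """Render a fio job file as text.

    `directory` and `size_per_job` are hypothetical: the report does not
    say where the pmem filesystems are mounted or how test_size is split.
    """
    lines = [
        "[global]",
        f"directory={directory}",
        f"ioengine={params['ioengine']}",
        f"bs={params['bs']}",
        f"runtime={params['runtime']}",
        "time_based",            # keep running for the full runtime
        "group_reporting",
        "",
        "[randwrite-dax]",
        f"rw={params['rw']}",
        f"numjobs={nr_jobs(params['nr_task'], nr_cpu)}",
        f"size={size_per_job}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # /mnt/pmem0 and 1g are placeholders, not values from the report.
    print(fio_job_file(PARAMS, "/mnt/pmem0", "1g"))
```

The filesystem would have to be ext4 mounted with the dax option for such a job to exercise the code path touched by the commit.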

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

testcase/path_params/tbox_group/run: fio-basic/2pmem-ext4-dax-200s-50%-tb-randwrite-4k-sync-200G-performance/lkp-hsw-ep6

776722e85d3b0936  96f8ba3dd632aff684cc7c67d9
----------------  --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
    820.67 ±  0%    +510.6%       5011 ±  4%  fio.write_bw_MBps
    210091 ±  0%    +510.6%    1282918 ±  4%  fio.write_iops
      0.14 ±  0%     -92.9%       0.01 ±  0%  fio.latency_100ms%
     24.00 ± 10%     -96.9%       0.74 ± 21%  fio.latency_100us%
      0.01 ± 57%  +2.9e+05%      21.76 ± 12%  fio.latency_10us%
      1.22 ± 37%   +3122.9%      39.40 ±  2%  fio.latency_20us%
      0.32 ±  8%     -93.8%       0.02 ±  0%  fio.latency_250us%
     74.28 ±  3%     -48.8%      38.01 ±  9%  fio.latency_50us%
      5511 ±  5%    +117.5%      11986 ±  3%  fio.time.involuntary_context_switches
    977.75 ±  1%    +149.2%       2436 ±  0%  fio.time.percent_of_cpu_this_job_got
      1874 ±  1%    +149.6%       4679 ±  0%  fio.time.system_time
     89.35 ±  3%    +137.8%     212.46 ±  3%  fio.time.user_time
    164733 ±  2%     -58.0%      69111 ±  3%  fio.time.voluntary_context_switches
     58.00 ±  2%     -44.8%      32.00 ±  2%  fio.write_clat_90%_us
     65.50 ±  1%     -44.5%      36.33 ±  2%  fio.write_clat_95%_us
     85.25 ±  1%     -44.9%      47.00 ±  3%  fio.write_clat_99%_us
    131.52 ±  0%     -83.9%      21.12 ±  4%  fio.write_clat_mean_us
      2270 ±  0%     -84.7%     347.78 ±  0%  fio.write_clat_stddev
    133959 ±  4%     +52.6%     204457 ±  3%  softirqs.RCU
   1433931 ±  1%     +94.7%    2791395 ±  0%  softirqs.TIMER
      5511 ±  5%    +117.5%      11986 ±  3%  time.involuntary_context_switches
    977.75 ±  1%    +149.2%       2436 ±  0%  time.percent_of_cpu_this_job_got
      1874 ±  1%    +149.6%       4679 ±  0%  time.system_time
     89.35 ±  3%    +137.8%     212.46 ±  3%  time.user_time
    164733 ±  2%     -58.0%      69111 ±  3%  time.voluntary_context_switches
   2766132 ±  0%     -49.2%    1405817 ±  0%  vmstat.io.bo
    613430 ±  0%     -60.9%     239670 ±  0%  vmstat.memory.buff
   1671149 ±  0%     -39.4%    1012059 ±  0%  vmstat.memory.cache
     10.00 ±  7%    +140.0%      24.00 ±  0%  vmstat.procs.r
     58099 ±  0%      +4.8%      60882 ±  0%  vmstat.system.in
    762597 ±  0%     -47.5%     400049 ±  0%  meminfo.Active
         0            5e+03       5067 ± 95%  latency_stats.max.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
         0            5e+03       4996 ± 98%  latency_stats.max.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
         0            2e+05     156829 ± 58%  latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
         0            1e+04      12881 ± 59%  latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
    209560 ± 33%     -2e+05          0        latency_stats.sum.do_get_write_access.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_split_extent_at.ext4_split_extent.ext4_split_convert_extents.ext4_ext_map_blocks.ext4_map_blocks.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter
   2492341 ±  4%     -2e+06     412777 ± 58%  latency_stats.sum.wait_transaction_locked.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_begin.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
   3976384 ±  8%     -4e+06     338379 ± 57%  latency_stats.sum.wait_transaction_locked.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_end.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write
 1.549e+08           -1e+08   21221204 ± 57%  latency_stats.sum.jbd2_log_wait_commit.jbd2_log_do_checkpoint.__jbd2_log_wait_for_space.add_transaction_credits.start_this_handle.jbd2__journal_start.__ext4_journal_start_sb.ext4_iomap_end.iomap_apply.dax_iomap_rw.ext4_file_write_iter.__vfs_write
 1.549e+08           -1e+08   21221204 ± 57%  latency_stats.sum.max
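
In the table above, the left column is the parent commit, the right column is the patched commit, and the middle column is the relative change. As a sanity check, the headline numbers can be cross-checked against each other with a few lines of Python. This is only a sketch: the job count of 28 is an assumption (50% of 56 threads), the bandwidth unit is assumed to be MiB/s, and the latency check assumes each sync job keeps exactly one 4k write in flight, so that IOPS is roughly the number of jobs divided by mean completion latency.

```python
# Values copied from the comparison table (parent commit vs. patched commit).
BASE = {"bw_mbps": 820.67, "iops": 210091, "clat_mean_us": 131.52}
PATCHED = {"bw_mbps": 5011, "iops": 1282918, "clat_mean_us": 21.12}

BLOCK_SIZE = 4096  # bs: 4k
NR_JOBS = 28       # assumption: nr_task 50% of 56 hardware threads


def pct_change(old, new):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


def bw_from_iops(iops, block_size=BLOCK_SIZE):
    """Bandwidth in MiB/s implied by an IOPS figure at a fixed block size."""
    return iops * block_size / 2**20


def iops_from_latency(nr_jobs, latency_us):
    """Approximate IOPS when every job has exactly one I/O in flight."""
    return nr_jobs / (latency_us * 1e-6)


if __name__ == "__main__":
    print(f"bw change:   {pct_change(BASE['bw_mbps'], PATCHED['bw_mbps']):+.1f}%")
    print(f"iops change: {pct_change(BASE['iops'], PATCHED['iops']):+.1f}%")
    for name, run in (("parent", BASE), ("patched", PATCHED)):
        implied_bw = bw_from_iops(run["iops"])
        implied_iops = iops_from_latency(NR_JOBS, run["clat_mean_us"])
        print(f"{name}: bw implied by iops = {implied_bw:.2f} MiB/s, "
              f"iops implied by latency = {implied_iops:.0f}")
```

Under these assumptions the reported bandwidth, IOPS and mean completion latency agree with each other to within a few percent for both kernels, which suggests the throughput gain comes almost entirely from the lower per-write latency.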



                                 perf-stat.cpu-cycles

  1.4e+13 ++----------------------------------------------------------------+
          |                                                                 |
  1.2e+13 O+       O       O          OO    O  O  O  O O   O                |
          | OO OO O  OO OO  O OO O OO    OO  O  O       O O  OO OO OO OO OO O
    1e+13 ++                                                                |
          |                                                                 |
    8e+12 ++                                                                |
          |                                                                 |
    6e+12 ++                                                                |
          |                                                                 |
    4e+12 ++  .* .**.    *.     .*.    *.  .*                               |
          *.**  *    **.*  **.**   **.*  **  *  *  *.*.**.**                |
    2e+12 ++                                  : :: :                        |
          |                                   :: ::                         |
        0 ++-----------------------------------*--*O------------------------+


                               perf-stat.branch-misses

  1e+10 ++------------------------------------------------------------------+
  9e+09 ++  .*         .*      .*          *                                |
        *.**  *.**.*.**  *.*.**  *.    .* + *   *    .* .*.*                |
  8e+09 ++                         **.*  *  :   :  **  *                    |
  7e+09 ++                                  :   :  :                        |
        |                                    :  :  :                        |
  6e+09 ++                                   : :: :                         |
  5e+09 ++                                   : :: :                         |
  4e+09 ++O                                O :O: O:                         |
        O  O OO OO O OO OO O OO OO OO O OO  O: :O:: O OO O OO OO O OO OO OO O
  3e+09 ++                                   : : ::                         |
  2e+09 ++                                    :: ::                         |
        |                                     :  :                          |
  1e+09 ++                                    :  :                          |
      0 ++------------------------------------*--*-O------------------------+


                                 perf-stat.iTLB-loads

  3.5e+08 ++----------------------------------------------------------------+
          |                                                                 |
    3e+08 ++          O    O   O                             O  O           |
          |  O OO OO        O O  O     O    O  OO O  O O   O        O O  OO |
  2.5e+08 O+O        O  OO         O  O   O  O          O O   O  O O   O    O
          |                         O    O                                  |
    2e+08 ++                                                                |
          |                                                                 |
  1.5e+08 ++     .**.               *. *.* .*           *. *                |
          *.**.**    **.**.**.**.*.*  *   *  *  *  *.*.*  *                 |
    1e+08 ++                                 :  :  :                        |
          |                                   : :: :                        |
    5e+07 ++                                  :: ::                         |
          |                                    :  :                         |
        0 ++-----------------------------------*--*O------------------------+


                                 perf-stat.node-loads

  3.5e+09 ++----------------------------------------------------------------+
          |                                                                 |
    3e+09 *+*     *  **.    *.*                                             |
          |  *.  + :+   *   :  *.   *. *. *.**  *  *.    .*                 |
  2.5e+09 ++   **  *     *.*     *.*  *  *   :  :  : *.**  *                |
          |                                  :  :  :                        |
    2e+09 ++                                  : :: :                        |
          |                                   : :: :                        |
  1.5e+09 ++                                  : :: :                        |
          |                                   :: ::                         |
    1e+09 ++         O  OO O  O  O OO O     O :O :O    O   O OO OO OO     O O
          O OO OO OO  O     O  O       O OO  O::O::  O  O O           OO O  |
    5e+08 ++                                   :  :                         |
          |                                    :  :                         |
        0 ++-----------------------------------*--*O------------------------+


                            perf-stat.branch-miss-rate_

    3 ++--------------------------------------------------------------------+
      *. *.*.*   *. .**.*.* .**.*. *.    .*          *.**.*                 |
  2.5 ++*     *.*  *       *      *  *.**  *   *  *.*                       |
      |                                    :   :  :                         |
      |                                    :   :  :                         |
    2 ++                                    :  :  :                         |
      |                                     : :: :                          |
  1.5 ++                                    : :: :                          |
      |                                     : : ::                          |
    1 ++         O                          : : ::                          |
      O  O O OO O  O OO O OO OO O OO O OO  O: :OO:  O  OO O  O   OO OO O OO O
      | O                                 O  O: ::   O      O  O            |
  0.5 ++                                     :  :                           |
      |                                      :  :                           |
    0 ++-------------------------------------*--*-O-------------------------+


                                   perf-stat.ipc

  0.6 *+*--------*-*------**------------------------------------------------+
      |  *.*.**. :   **.*    **.*.**. .**.**   *  *.**.**.*                 |
  0.5 ++        *                    *     :   :  :                         |
      |                                    :   :  :                         |
      |                                    :   :  :                         |
  0.4 ++                                    : :: :                          |
      |                                     : :: :                          |
  0.3 ++                                  O :O:: :          O  O            |
      | O                 O          O      : : O:   O O                 O  |
  0.2 O+ O O OO OO O OO O  O OO O OO   OO  O: : ::  O   O O  O   OO OO O  O O
      |                                     : :O::                          |
      |                                      :  :                           |
  0.1 ++                                     :  :                           |
      |                                      :  :                           |
    0 ++-------------------------------------*--*-O-------------------------+


                                 fio.write_bw_MBps

  8000 ++-------------------------------------------------------------------+
       |                                  O   O                             |
  7000 ++                                                   O  O            |
  6000 ++O                                            O                     |
       |                  O          O           O                          |
  5000 ++ O    O         O   O O         O  O          O  O         O    OO |
       O    OO   OO OO O    O    OO O  O            O    O    O  OO   OO    O
  4000 ++                                      O                            |
       |                                                                    |
  3000 ++                                                                   |
  2000 ++                                                                   |
       |                                                                    |
  1000 ++                                                                   |
       *.**.**.*.**.**.*.**.**.*.**.**.*.**.*. *. *.*.**.**                 |
     0 ++-------------------------------------*--*O-------------------------+


                                    fio.write_iops

    2e+06 ++----------------------------------------------------------------+
  1.8e+06 ++                                O  O                            |
          |                                            O     O  O           |
  1.6e+06 ++O                                                               |
  1.4e+06 ++                O          O          O     O                   |
          |  O    O        O   O O    O   O  O             O       OO    OO O
  1.2e+06 O+   OO  O OO OO    O    OO    O           O    O   O  O    OO    |
    1e+06 ++                                    O                           |
   800000 ++                                                                |
          |                                                                 |
   600000 ++                                                                |
   400000 ++                                                                |
          *.        .*      *.                                              |
   200000 ++**.**.**  *.**.*  **.*.**.**.**.**. *. *.*.**.**                |
        0 ++-----------------------------------*--*O------------------------+




	[*] bisect-good sample
	[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.9.0-rc4-00044-g96f8ba3" of type "text/plain" (153623 bytes)

View attachment "job-script" of type "text/plain" (7093 bytes)

View attachment "job.yaml" of type "text/plain" (4701 bytes)

View attachment "reproduce" of type "text/plain" (658 bytes)
