Date:   Fri, 3 Feb 2017 09:52:59 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     linux-nvdimm@...ts.01.org, Jan Kara <jack@...e.cz>,
        Matthew Wilcox <mawilcox@...rosoft.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, Christoph Hellwig <hch@....de>,
        Jeff Moyer <jmoyer@...hat.com>, Ingo Molnar <mingo@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        "H. Peter Anvin" <hpa@...or.com>, linux-fsdevel@...r.kernel.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Ross Zwisler <ross.zwisler@...ux.intel.com>, lkp@...org
Subject: [lkp-robot] [x86, dax, pmem]  2e12109d1c: fio.write_bw_MBps -75% regression


Greetings,

FYI, we noticed a 75% regression in fio.write_bw_MBps due to commit:


commit: 2e12109d1c32c810088820478d21b5b7cd87a805 ("x86, dax, pmem: introduce 'copy_from_iter' dax operation")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/dax-pmem-move-cpu-cache-maintenance-to-libnvdimm/20170121-031649


in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: xfs
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: randwrite
	bs: 2M
	ioengine: sync
	test_size: 200G
	cpufreq_governor: performance
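
For readers unfamiliar with the job parameters, here is a minimal sketch of how some of them might translate into concrete numbers. It assumes that a percentage nr_task is taken relative to the machine's 56 CPU threads and that the size suffixes are binary (powers of 1024), as is fio's default. The helper names resolve_nr_task and parse_size are made up for illustration and are not part of lkp-tests or fio.

```python
# Parameters copied from the report above. How they are interpreted below
# is an assumption, not taken from the lkp-tests sources.
PARAMS = {
    "nr_task": "50%",
    "bs": "2M",
    "test_size": "200G",
    "runtime": "200s",
}
CPU_THREADS = 56  # from the "on test machine" line

# Binary suffixes (assumed, matching fio's default kb_base of 1024).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def resolve_nr_task(nr_task: str, cpu_threads: int) -> int:
    """Turn '50%' into a task count, assuming the percentage refers to CPU threads."""
    if nr_task.endswith("%"):
        return max(1, cpu_threads * int(nr_task[:-1]) // 100)
    return int(nr_task)


def parse_size(text: str) -> int:
    """Convert a size such as '2M' or '200G' to bytes."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


if __name__ == "__main__":
    tasks = resolve_nr_task(PARAMS["nr_task"], CPU_THREADS)
    block = parse_size(PARAMS["bs"])
    total = parse_size(PARAMS["test_size"])
    print(f"tasks: {tasks}")
    print(f"block size: {block} bytes")
    print(f"blocks in test_size: {total // block}")
```

Under these assumptions the job runs 28 writer tasks issuing 2 MiB random writes for 200 seconds.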

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

testcase/path_params/tbox_group/run: fio-basic/2pmem-xfs-dax-200s-50%-tb-randwrite-2M-sync-200G-performance/lkp-hsw-ep6

c42a4508649e40af  2e12109d1c32c810088820478d  
----------------  --------------------------  
         %stddev      change         %stddev
             \          |                \  
     68769 ±  3%       -75%      17370        fio.write_bw_MBps
     34384 ±  3%       -75%       8685        fio.write_iops
      0.70 ± 20%     14149%      99.39        fio.latency_4ms%
       745 ±  4%       327%       3182        fio.write_clat_mean_us
      1580 ± 14%       137%       3752 ±  5%  fio.write_clat_99%_us
      1405 ± 17%       136%       3320 ±  3%  fio.write_clat_90%_us
      1456 ± 16%       133%       3392        fio.write_clat_95%_us
       435 ± 27%       -60%        175 ± 21%  fio.write_clat_stddev
      0.01            -100%       0.00        fio.latency_250us%
     21.31 ± 36%      -100%       0.00        fio.latency_2ms%
      5122               8%       5530        fio.time.system_time
    487.59 ± 12%       -84%      79.88 ± 10%  fio.time.user_time
     59604                       60810        vmstat.system.in
      2351             -25%       1758        vmstat.system.cs
      1191              19%       1413        turbostat.Avg_MHz
       199               4%        206        turbostat.PkgWatt
     51.36                       50.78        turbostat.%Busy
       205             -42%        118        turbostat.RAMWatt
      0.00 ± 23%       224%       0.01 ±  4%  perf-stat.dTLB-load-miss-rate%
      0.09              66%       0.15        perf-stat.branch-miss-rate%
 1.431e+13 ±  3%        13%  1.617e+13        perf-stat.cpu-cycles
 2.893e+08             -19%  2.356e+08        perf-stat.branch-misses
    468564             -26%     346074        perf-stat.context-switches
 2.444e+08 ±  6%       -39%  1.503e+08        perf-stat.iTLB-loads
  1.26e+08 ± 27%       -42%   72844209 ± 17%  perf-stat.node-load-misses
 3.148e+11             -51%  1.546e+11        perf-stat.branch-instructions
  40098660 ± 22%       -56%   17714563 ± 10%  perf-stat.node-store-misses
    326873 ± 34%       -62%     122748        perf-stat.instructions-per-iTLB-miss
 1.741e+12             -67%    5.7e+11 ±  7%  perf-stat.dTLB-stores
 1.776e+12 ± 11%       -68%  5.699e+11 ± 12%  perf-stat.dTLB-loads
 5.629e+12             -69%  1.725e+12        perf-stat.instructions
      0.39 ±  3%       -73%       0.11        perf-stat.ipc
 1.385e+11 ±  3%       -75%  3.438e+10        perf-stat.cache-references
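
The "change" column and a few of the derived rows can be cross-checked by hand. The sketch below recomputes the percentage change, the IPC, and the bandwidth implied by IOPS times the 2M block size, using only numbers copied from the table above. The function names are illustrative and this is not the robot's own code.

```python
# Values copied from the table above: base commit (c42a4508649e40af)
# versus the commit under test (2e12109d1c32c810088820478d).
BASE = {
    "fio.write_bw_MBps": 68769,
    "fio.write_iops": 34384,
    "perf-stat.instructions": 5.629e12,
    "perf-stat.cpu-cycles": 1.431e13,
}
TEST = {
    "fio.write_bw_MBps": 17370,
    "fio.write_iops": 8685,
    "perf-stat.instructions": 1.725e12,
    "perf-stat.cpu-cycles": 1.617e13,
}
BLOCK_SIZE_M = 2  # "bs: 2M" from the job parameters


def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def ipc(sample: dict) -> float:
    """Instructions per cycle from the perf-stat counters."""
    return sample["perf-stat.instructions"] / sample["perf-stat.cpu-cycles"]


def implied_bw(sample: dict) -> int:
    """Bandwidth implied by IOPS times block size, in the table's units."""
    return sample["fio.write_iops"] * BLOCK_SIZE_M


if __name__ == "__main__":
    for name in BASE:
        print(f"{name}: {pct_change(BASE[name], TEST[name]):.0f}%")
    print(f"ipc: {ipc(BASE):.2f} -> {ipc(TEST):.2f}")
    print(f"implied bw: {implied_bw(BASE)} -> {implied_bw(TEST)}")
```

The bandwidth and IOPS rows move together because every write is one 2M block, and the IPC drop follows from far fewer instructions retired over slightly more cycles.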



                               perf-stat.instructions

  7e+12 ++------------------------------------------------------------------+
        |     *.                                *.        *.                |
  6e+12 ++*  :  *. .*   *. .*.*.*. .*.*.*   *  :  *   *  :  *          .*.  |
        |: + :    *  : :  *       *      : : + :   : : + :   :  *.   .*   *.*
  5e+12 ++  *        : :                 : :  *    : :  *    : :  *.*       |
        *             *                   *         *         ::            |
  4e+12 ++                                                    *             |
        |                                                                   |
  3e+12 ++                                                                  |
        |                                                                   |
  2e+12 ++                                                                  |
        O O O O O O O O   O O O O O O O O O O O O O O O O O O               |
  1e+12 ++                                                                  |
        |                                                                   |
      0 ++--------------O---------------------------------------------------+


                              perf-stat.cache-references

  1.8e+11 ++----------------------------------------------------------------+
          |     *.                               *.        *                |
  1.6e+11 ++   +  *   *       .*.*.   .*.*      +  *      + *               |
  1.4e+11 ++*.*    + + :  * .*     *.*    :  *.*    :  *.*   :  *.     .*.*.*
          |+        *  : : *              : +       : +      : :  *.*.*     |
  1.2e+11 *+            ::                 *         *        ::            |
    1e+11 ++            *                                     *             |
          |                                                                 |
    8e+10 ++                                                                |
    6e+10 ++                                                                |
          |                                                                 |
    4e+10 O+    O     O O  O O       O O O   O O O O   O O OO               |
    2e+10 ++O O   O O          O O O       O         O                      |
          |                                                                 |
        0 ++--------------O-------------------------------------------------+


                             perf-stat.branch-instructions

    4e+11 ++----------------------------------------------------------------+
          |     *.                               *.        *                |
  3.5e+11 ++   :  *   *   *  *. .*. .*.*.       :  *      : *               |
    3e+11 ++*. :   + + : : :+  *   *     *   *. :   :  *. :  :  *.   .*.*.*.*
          |+  *     *  : : *              + +  *    : +  *   : :  *.*       |
  2.5e+11 *+            *                  *         *        ::            |
          |                                                   *             |
    2e+11 ++                                                                |
          |                                            O                    |
  1.5e+11 O+O O O O O O O  O O O O O O O O O O O O O O   O OO               |
    1e+11 ++                                                                |
          |                                                                 |
    5e+10 ++                                                                |
          |                                                                 |
        0 ++--------------O-------------------------------------------------+


                                 perf-stat.dTLB-loads

  3.5e+12 ++----------------------------------------------------------------+
          |                                                                 |
    3e+12 ++                             *                                  |
          |                             ::                                  |
  2.5e+12 ++              *             : :                                 |
          | *   *.   .*   :*.*. .*.*. .*  :  *   *.    *   *    *.*         |
    2e+12 ++ + +  *.*  : :     *     *     :+ + +  *. + + + *  :   + .*.    |
          *   *        : :                 *   *     *   *   + :    *   *.*.|
  1.5e+12 ++            *                                     *             *
          |                                                                 |
    1e+12 ++                                                                |
          |                                            O                    |
    5e+11 O+O O O O O O O  O O O O O O O O O O O O O O   O OO               |
          |                                                                 |
        0 ++--------------O-------------------------------------------------+


                                 perf-stat.dTLB-stores

  2.5e+12 ++----------------------------------------------------------------+
          |                             .*                                  |
          |     *.                    .* :       *.        *                |
    2e+12 ++   :  *. .*   * .*.*.*.*.*    :     :  *      : *               |
          | *. :    *  :  :*              :  *. :   :  *. :  :  *.   .*. .*.*
          |+  *        : :                 :+  *    : +  *   : :  *.*   *   |
  1.5e+12 *+            ::                 *         *        ::            |
          |             *                                     *             |
    1e+12 ++                                                                |
          |                                                                 |
          |                                                                 |
    5e+11 O+O O O O O O O  O O O O O O O O O O O O O O O O OO               |
          |                                                                 |
          |                                                                 |
        0 ++--------------O-------------------------------------------------+


                             perf-stat.context-switches

  500000 ++------------*------------------------------------------*---*-----+
  450000 *+       .*. + + .*.     .*.*.   .*       .*.       .*. +  *   *.*.*
         | *.*.*.*   *   *   *.*.*     *.*  *.*.*.*   *.*.*.*   *           |
  400000 ++                                                                 |
  350000 O+O O O O O O O   O O O O O O O O OO O O O O O O O O               |
         |                                                                  |
  300000 ++                                                                 |
  250000 ++                                                                 |
  200000 ++                                                                 |
         |                                                                  |
  150000 ++                                                                 |
  100000 ++                                                                 |
         |                                                                  |
   50000 ++                                                                 |
       0 ++--------------O--------------------------------------------------+


                                   perf-stat.ipc

  0.45 ++--------------*-------*--------------------------------------------+
       | *   *.*.      :+ .*.*  + .*. .*    *   *.*   *   *.*   *      .*.  |
   0.4 ++ + +    *.*  :  *       *   *  +  : + +   : : + +   :  :+   .*   *.*
  0.35 ++  *        + :                  + :  *    : :  *    : :  *.*       |
       *             *                    *         *         ::            |
   0.3 ++                                                     *             |
  0.25 ++                                                                   |
       |                                                                    |
   0.2 ++                                                                   |
  0.15 ++                                                                   |
       |                                                                    |
   0.1 O+O O O O O O O   O O O O O O O O  O O O O O O O O O O               |
  0.05 ++                                                                   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                                  fio.write_bw_MBps

  80000 ++------------------------------------------------------------------+
        | *   *.*.      *. .*.*.*. .*.*.*   *   *.*   *   *.*               *
  70000 ++ + +    *.*   : *       *      : : + +   : : + +   :  *.   .*.*. +|
  60000 ++  *        : :                 : :  *    : :  *    : :  *.*     * |
        *            : :                  *         *         ::            |
  50000 ++            *                                       *             |
        |                                                                   |
  40000 ++                                                                  |
        |                                                                   |
  30000 ++                                                                  |
  20000 ++                                                                  |
        O O O O O O O O   O O O O O O O O O O O O O O O O O O               |
  10000 ++                                                                  |
        |                                                                   |
      0 ++--------------O---------------------------------------------------+


                                   fio.write_iops

  40000 ++------------------------------------------------------------------+
        | *   *.*.      *. .*.*.*. .*.*.*   *   *.*   *   *.*               *
  35000 ++ + +    *.*   : *       *      : : + +   : : + +   :  *.   .*.*. +|
  30000 ++  *        : :                 : :  *    : :  *    : :  *.*     * |
        *            : :                  *         *         ::            |
  25000 ++            *                                       *             |
        |                                                                   |
  20000 ++                                                                  |
        |                                                                   |
  15000 ++                                                                  |
  10000 ++                                                                  |
        O O O O O O O O   O O O O O O O O O O O O O O O O O O               |
   5000 ++                                                                  |
        |                                                                   |
      0 ++--------------O---------------------------------------------------+


                              fio.write_clat_mean_us

  4000 ++-------------------------------------------------------------------+
       |                                            O                       |
  3500 O+O O O O O         O O O O O   O  O O   O O                         |
  3000 ++          O O   O           O        O       O O O O               |
       |                                                                    |
  2500 ++                                                                   |
       |                                                                    |
  2000 ++                                                                   |
       |                                                                    |
  1500 ++                                                                   |
  1000 ++                                                     *             |
       *. .*.   .*.*.*. .*.     .*.      .*. .*.   .*. .*.   + + .*.*.*. .*.*
   500 ++*   *.*       *   *.*.*   *.*.*.   *   *.*   *   *.*   *       *   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                               fio.write_clat_90%_us

  4000 ++-------------------------------------------------------------------+
       | O   O O O           O O O        O         O                       |
  3500 O+  O       O O   O O       O O O    O O O O       O O               |
  3000 ++                                             O O                   |
       |                                                                    |
  2500 ++                                                                   |
       |                                                                    |
  2000 ++                                                                   |
       *        .*. .*                   .*        .*        .*.    *.    *.|
  1500 ++ .*.*.*   *  + .*. .*. .*. .*.*.  + .*.*.*  + .*.*.*   *. +  *  :  *
  1000 ++*             *   *   *   *        *         *           *    + :  |
       |                                                                *   |
   500 ++                                                                   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                               fio.write_clat_95%_us

  4500 ++-------------------------------------------------------------------+
       |                                            O                       |
  4000 ++O       O                                                          |
  3500 O+  O O O   O O   O O O O O O O O  O O O O O       O O               |
       |                                              O O                   |
  3000 ++                                                                   |
  2500 ++                                                                   |
       |                                                                    |
  2000 *+           .*                    *         *         *             |
  1500 ++ .*.   .*.*  + .*. .*. .*. .*. .. + .*.   + + .*.   + +   .*.*   *.*
       | *   *.*       *   *   *   *   *    *   *.*   *   *.*   *.*    + +  |
  1000 ++                                                               *   |
   500 ++                                                                   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                                 fio.latency_4ms%

  100 O+--O-------O-O----O-O---O---O-O-O---O-O-O-O---O-O-O--O---------------+
   90 ++O   O O O            O   O       O         O                        |
      |                                                                     |
   80 ++                                                                    |
   70 ++                                                                    |
      |                                                                     |
   60 ++                                                                    |
   50 ++                                                                    |
   40 ++                                                                    |
      |                                                                     |
   30 ++                                                                    |
   20 ++                                                                    |
      |                                                                     |
   10 ++                                                                    |
    0 *+*-*-*-*-*-*-*-O--*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*--*-*-*-*-*-*-*-*-*


                                 turbostat.Avg_MHz

  1600 ++-------------------------------------------------------------------+
       |                                                                    |
  1400 O+O O O O O O O   O O O O O O O O  O O O O O O O O O O               |
  1200 ++ .*.*.*. .*.   .*.*.*. .*.*.*.*..   .*.*.*.   .*.*.*. .*.         .*
       *.*       *   *.*       *          *.*       *.*       *   *.*.*.*.* |
  1000 ++                                                                   |
       |                                                                    |
   800 ++                                                                   |
       |                                                                    |
   600 ++                                                                   |
   400 ++                                                                   |
       |                                                                    |
   200 ++                                                                   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                                 turbostat.RAMWatt

  250 ++--------------------------------------------------------------------+
      |                                                                     |
      |.*. .*.*. .*   *.. .*.*.*.*.*.*.*. .*. .*.*. .*. .*..*. .*.     .*. .*
  200 *+  *     *  + +   *               *   *     *   *      *   *.*.*   * |
      |             *                                                       |
      |                                                                     |
  150 ++                                                                    |
      O O O O O O O O    O O O O O O O O O O O O O O O O O  O               |
  100 ++                                                                    |
      |                                                                     |
      |                                                                     |
   50 ++                                                                    |
      |                                                                     |
      |                                                                     |
    0 ++--------------O-----------------------------------------------------+


                                fio.time.user_time

  800 ++--------------------------------------------------------------------+
      |                                                         *.          |
  700 ++                       *                                : *         |
  600 ++                       ::                              :   :        |
      |                       : : .*                           :   :    *   |
  500 ++*.   .*.*.         *. :  *  + .*   *.   .*   *.    .*  :    :  + +  |
      |+  *.*     *.*.*.. +  *       *  + +  *.*  + +  *.*.  +:     *.*   *.*
  400 *+                 *               *         *          *             |
      |                                                                     |
  300 ++                                                                    |
  200 ++                                                                    |
      |                                                                     |
  100 ++                               O       O         O                  |
      O O O O O O O O    O O O O O O O   O O O   O O O O    O               |
    0 ++--------------O-----------------------------------------------------+


                               fio.time.system_time

  6000 ++-------------------------------------------------------------------+
       O O O O O O O O   O O O O O O O O  O O O O O O O O O O               |
  5000 *+*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*..*.*.*.*.*.*.*.*.*.*.*.   .*.*.*.*.*
       |                                                        *.*         |
       |                                                                    |
  4000 ++                                                                   |
       |                                                                    |
  3000 ++                                                                   |
       |                                                                    |
  2000 ++                                                                   |
       |                                                                    |
       |                                                                    |
  1000 ++                                                                   |
       |                                                                    |
     0 ++--------------O----------------------------------------------------+


                                fio.latency_250us%

  0.6 ++--------------------------------------------------------------------+
      |                                                       *             |
  0.5 ++                                                      :             |
      |   *                                  *         *      :             |
      |   :                                  :         :      :             |
  0.4 ++  :                                  :         :     : :            |
      |  : :      *                         : :       : :    : :            |
  0.3 ++ : :      :                         : :       : :    : :            |
      |  : :      ::                        : :       : :    : :            |
  0.2 ++ : :     : :                        : :       : :    : :            |
      |  : :     : :                        : :       : :    : :            |
      |  : :     : :                        : :       : :   :   :           |
  0.1 ++:   :    :  :                      :   :     :   :  :   :           |
      | :   :   :   :                      :   :     :   :  :   :           |
    0 *+*---*-*-*---*-*--*-*-*-*-*-*-*-*-*-*---*-*-*-*---*--*---*-*-*-*-*-*-*

	[*] bisect-good sample
	[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.10.0-rc4-00111-g2e12109" of type "text/plain" (155570 bytes)

View attachment "job-script" of type "text/plain" (7086 bytes)

View attachment "job.yaml" of type "text/plain" (4695 bytes)

View attachment "reproduce" of type "text/plain" (684 bytes)
