Message-ID: <20200803094257.GA23458@shao2-debian>
Date:   Mon, 3 Aug 2020 17:42:57 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     tglx@...utronix.de, mingo@...hat.com, vishal.l.verma@...el.com,
        x86@...nel.org, stable@...r.kernel.org,
        Borislav Petkov <bp@...en8.de>,
        Vivek Goyal <vgoyal@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Tony Luck <tony.luck@...el.com>,
        Erwin Tsaur <erwin.tsaur@...el.com>, linux-nvdimm@...ts.01.org,
        linux-kernel@...r.kernel.org, 0day robot <lkp@...el.com>,
        lkp@...ts.01.org
Subject: [x86/copy_mc] a0ac629ebe: fio.read_iops -43.3% regression

Greetings,

FYI, we noticed a -43.3% regression in fio.read_iops caused by the following commit:


commit: a0ac629ebe7b3d248cb93807782a00d9142fdb98 ("x86/copy_mc: Introduce copy_mc_generic()")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/Renovate-memcpy_mcsafe-with-copy_mc_to_-user-kernel/20200802-014046


in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: xfs
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: read
	bs: 2M
	ioengine: libaio
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio

In addition, the commit has a significant impact on the following test:

+------------------+----------------------------------------------------------------------+
| testcase: change | fio-basic: fio.read_iops -55.6% regression                           |
| test machine     | 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory |
| test parameters  | bs=2M                                                                |
|                  | cpufreq_governor=performance                                         |
|                  | disk=2pmem                                                           |
|                  | fs=xfs                                                               |
|                  | ioengine=sync                                                        |
|                  | mount_option=dax                                                     |
|                  | nr_task=50%                                                          |
|                  | runtime=200s                                                         |
|                  | rw=read                                                              |
|                  | test_size=200G                                                       |
|                  | time_based=tb                                                        |
|                  | ucode=0x5002f01                                                      |
+------------------+----------------------------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  2M/gcc-9/performance/2pmem/xfs/libaio/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01

commit: 
  7476b91d4d ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
  a0ac629ebe ("x86/copy_mc: Introduce copy_mc_generic()")

7476b91d4db369d8 a0ac629ebe7b3d248cb93807782 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     97.22           -96.0        1.19 ± 21%  fio.latency_100ms%
      0.14            -0.1        0.05        fio.latency_10ms%
      0.27 ± 13%      -0.1        0.14        fio.latency_20ms%
      0.04 ±  6%      -0.0        0.03 ± 12%  fio.latency_20us%
      1.00 ± 28%     +96.6       97.57        fio.latency_250ms%
      0.05            -0.0        0.05        fio.latency_4ms%
      0.02 ± 48%      +0.3        0.31 ± 15%  fio.latency_500ms%
      1.25 ± 47%      -0.6        0.63 ± 11%  fio.latency_50ms%
      0.01 ±  9%      +0.0        0.02 ± 24%  fio.latency_50us%
     44292           -43.3%      25124        fio.read_bw_MBps
  67895296           +76.8%  1.201e+08        fio.read_clat_90%_us
  68681728           +76.7%  1.214e+08        fio.read_clat_95%_us
  98304000 ± 19%     +80.3%  1.772e+08 ±  4%  fio.read_clat_99%_us
  66674508           +76.2%  1.175e+08        fio.read_clat_mean_us
   9950116 ± 12%     +80.3%   17935634        fio.read_clat_stddev
     22146           -43.3%      12562        fio.read_iops
   2152824           +76.8%    3805428        fio.read_slat_mean_us
    291719 ± 14%     +86.6%     544324        fio.read_slat_stddev
     12923            -2.5%      12594        fio.time.involuntary_context_switches
     77.65 ±  3%     -39.1%      47.29        fio.time.user_time
   4429275           -43.3%    2512537        fio.workload
      0.14 ±  3%      +0.0        0.16 ±  4%  mpstat.cpu.all.soft%
      0.47 ±  3%      -0.2        0.31        mpstat.cpu.all.usr%
     53185 ± 91%    +121.2%     117642 ± 40%  numa-vmstat.node0.numa_other
    122640 ± 39%     -52.6%      58092 ± 81%  numa-vmstat.node1.numa_other
     60096            +1.5%      61021        proc-vmstat.nr_slab_unreclaimable
     20103 ±  5%     -17.9%      16495 ± 12%  proc-vmstat.pgactivate
     49.00            -2.0%      48.00        vmstat.cpu.id
      1612            -1.6%       1587        vmstat.system.cs
      2713 ±  4%      +8.0%       2931 ±  4%  slabinfo.PING.active_objs
      2713 ±  4%      +8.0%       2931 ±  4%  slabinfo.PING.num_objs
      1164 ±  9%     +16.8%       1360 ±  6%  slabinfo.task_group.active_objs
      1164 ±  9%     +16.8%       1360 ±  6%  slabinfo.task_group.num_objs
    379.25 ± 85%    +279.7%       1439 ± 75%  sched_debug.cfs_rq:/.exec_clock.min
     29948 ±  5%     -15.5%      25309 ±  5%  sched_debug.cfs_rq:/.exec_clock.stddev
     21606 ±  7%     +25.1%      27034 ±  7%  sched_debug.cfs_rq:/.min_vruntime.min
     33321 ±  6%     -16.5%      27820 ±  6%  sched_debug.cfs_rq:/.min_vruntime.stddev
     13783 ±109%    +184.1%      39158 ± 20%  sched_debug.cfs_rq:/.spread0.avg
    -38497           -76.6%      -9012        sched_debug.cfs_rq:/.spread0.min
     33321 ±  6%     -16.5%      27820 ±  6%  sched_debug.cfs_rq:/.spread0.stddev
     12.22 ± 10%     +27.9%      15.62 ±  3%  sched_debug.cpu.clock.stddev
      3716 ±173%    -100.0%       1.50 ± 57%  softirqs.CPU10.NET_RX
     17411 ± 36%     -41.8%      10126 ± 19%  softirqs.CPU24.SCHED
      9179 ± 67%     +87.1%      17173 ± 23%  softirqs.CPU35.SCHED
      9611 ± 34%     -58.9%       3951 ± 10%  softirqs.CPU48.SCHED
     17177 ± 30%     -42.6%       9864 ± 37%  softirqs.CPU69.SCHED
     86644 ± 29%     -22.3%      67339 ±  5%  softirqs.CPU76.TIMER
      6339 ± 66%    +115.9%      13686 ± 31%  softirqs.CPU78.SCHED
     10156 ± 64%     +91.8%      19477 ± 25%  softirqs.CPU81.SCHED
      1239 ±172%    -100.0%       0.00        interrupts.62:PCI-MSI.31981595-edge.i40e-eth0-TxRx-26
     47482            +5.4%      50055 ±  4%  interrupts.CAL:Function_call_interrupts
    209.00 ± 23%     -50.4%     103.75 ±  8%  interrupts.CPU0.RES:Rescheduling_interrupts
    146.25 ± 16%     -27.4%     106.25 ± 16%  interrupts.CPU15.RES:Rescheduling_interrupts
    168.75 ± 81%     -64.6%      59.75 ± 33%  interrupts.CPU15.TLB:TLB_shootdowns
      7321 ±  5%     -52.7%       3461 ± 39%  interrupts.CPU20.NMI:Non-maskable_interrupts
      7321 ±  5%     -52.7%       3461 ± 39%  interrupts.CPU20.PMI:Performance_monitoring_interrupts
      6665 ± 14%     -61.2%       2586 ± 26%  interrupts.CPU21.NMI:Non-maskable_interrupts
      6665 ± 14%     -61.2%       2586 ± 26%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
     64.50 ± 23%     +41.9%      91.50 ± 22%  interrupts.CPU21.TLB:TLB_shootdowns
    100.00 ± 41%     +66.0%     166.00 ±  9%  interrupts.CPU24.RES:Rescheduling_interrupts
      1238 ±173%    -100.0%       0.00        interrupts.CPU26.62:PCI-MSI.31981595-edge.i40e-eth0-TxRx-26
    438.25 ±  4%     +16.1%     509.00 ± 18%  interrupts.CPU28.CAL:Function_call_interrupts
    145.50 ± 20%     -34.4%      95.50 ± 25%  interrupts.CPU35.RES:Rescheduling_interrupts
      7134 ± 11%     -28.3%       5118 ± 19%  interrupts.CPU41.NMI:Non-maskable_interrupts
      7134 ± 11%     -28.3%       5118 ± 19%  interrupts.CPU41.PMI:Performance_monitoring_interrupts
    107.75 ± 34%     -47.3%      56.75 ± 40%  interrupts.CPU93.RES:Rescheduling_interrupts
     63.18 ± 12%     -26.1       37.12 ± 15%  perf-profile.calltrace.cycles-pp.copy_mc_fragile.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
      0.00            +3.7        3.72 ± 52%  perf-profile.calltrace.cycles-pp.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
      0.00           +37.8       37.83 ± 12%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter
     63.34 ± 12%     -26.2       37.14 ± 15%  perf-profile.children.cycles-pp.copy_mc_fragile
      2.41 ±112%      -2.2        0.25 ±108%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      2.26 ±109%      -2.0        0.29 ± 89%  perf-profile.children.cycles-pp.asm_call_on_stack
      2.15 ±112%      -1.9        0.23 ±110%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.12 ±113%      -1.9        0.23 ±110%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.68 ±114%      -1.5        0.17 ±119%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      1.48 ±123%      -1.3        0.15 ±121%  perf-profile.children.cycles-pp.tick_sched_timer
      1.34 ±120%      -1.2        0.14 ±122%  perf-profile.children.cycles-pp.tick_sched_handle
      1.28 ±119%      -1.1        0.14 ±122%  perf-profile.children.cycles-pp.update_process_times
      0.70 ±107%      -0.6        0.10 ±120%  perf-profile.children.cycles-pp.scheduler_tick
      2.65 ±106%     +16.5       19.13 ± 12%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00           +22.6       22.58 ±  7%  perf-profile.children.cycles-pp.copy_mc_generic
     62.52 ± 12%     -25.5       37.00 ± 15%  perf-profile.self.cycles-pp.copy_mc_fragile
      0.00           +22.4       22.41 ±  6%  perf-profile.self.cycles-pp.copy_mc_generic
     42.43           +68.7%      71.58        perf-stat.i.MPKI
 5.949e+09           -42.2%   3.44e+09        perf-stat.i.branch-instructions
      0.07            +0.0        0.10 ±  5%  perf-stat.i.branch-miss-rate%
   3554006 ±  2%      -7.5%    3286479 ±  3%  perf-stat.i.branch-misses
     95.02            -2.4       92.63        perf-stat.i.cache-miss-rate%
 1.444e+09            -5.2%  1.369e+09        perf-stat.i.cache-misses
 1.513e+09            -2.8%  1.471e+09        perf-stat.i.cache-references
      3.81           +72.5%       6.58        perf-stat.i.cpi
    102.49            +4.5%     107.13        perf-stat.i.cycles-between-cache-misses
      0.00 ±  4%      +0.0        0.00 ± 41%  perf-stat.i.dTLB-load-miss-rate%
  6.03e+09           -42.0%  3.495e+09        perf-stat.i.dTLB-loads
      0.00 ±  5%      +0.0        0.00 ±  7%  perf-stat.i.dTLB-store-miss-rate%
 5.909e+09           -42.5%    3.4e+09        perf-stat.i.dTLB-stores
     47.00            +1.4       48.45        perf-stat.i.iTLB-load-miss-rate%
   2270674           -11.0%    2021114        perf-stat.i.iTLB-load-misses
   2563127           -16.0%    2151931        perf-stat.i.iTLB-loads
 3.548e+10           -42.4%  2.044e+10        perf-stat.i.instructions
     15634           -35.2%      10127        perf-stat.i.instructions-per-iTLB-miss
      0.26           -41.6%       0.15        perf-stat.i.ipc
    207.77           -37.5%     129.85        perf-stat.i.metric.M/sec
  78061415 ± 13%     +98.0%  1.546e+08 ± 20%  perf-stat.i.node-load-misses
  85582855 ± 11%     +58.1%  1.353e+08 ± 20%  perf-stat.i.node-loads
 3.817e+08            -2.8%  3.709e+08        perf-stat.i.node-stores
     42.66           +68.7%      71.96        perf-stat.overall.MPKI
      0.06            +0.0        0.09 ±  3%  perf-stat.overall.branch-miss-rate%
     95.45            -2.4       93.07        perf-stat.overall.cache-miss-rate%
      3.81           +73.0%       6.59        perf-stat.overall.cpi
     93.55            +5.2%      98.41        perf-stat.overall.cycles-between-cache-misses
      0.00 ±  5%      +0.0        0.00 ± 13%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ±  5%      +0.0        0.00 ±  2%  perf-stat.overall.dTLB-store-miss-rate%
     46.98            +1.5       48.43        perf-stat.overall.iTLB-load-miss-rate%
     15639           -35.2%      10127        perf-stat.overall.instructions-per-iTLB-miss
      0.26           -42.2%       0.15        perf-stat.overall.ipc
   1605743            +1.5%    1630326        perf-stat.overall.path-length
 5.919e+09           -42.2%  3.422e+09        perf-stat.ps.branch-instructions
   3519866 ±  2%      -7.8%    3245208 ±  3%  perf-stat.ps.branch-misses
 1.437e+09            -5.2%  1.362e+09        perf-stat.ps.cache-misses
 1.506e+09            -2.8%  1.463e+09        perf-stat.ps.cache-references
      1552            -1.4%       1530        perf-stat.ps.context-switches
     6e+09           -42.1%  3.477e+09        perf-stat.ps.dTLB-loads
  5.88e+09           -42.5%  3.382e+09        perf-stat.ps.dTLB-stores
   2257568           -11.0%    2008542        perf-stat.ps.iTLB-load-misses
   2547705           -16.1%    2138603        perf-stat.ps.iTLB-loads
  3.53e+10           -42.4%  2.034e+10        perf-stat.ps.instructions
  77685715 ± 13%     +97.9%  1.538e+08 ± 20%  perf-stat.ps.node-load-misses
  85143339 ± 11%     +58.1%  1.346e+08 ± 20%  perf-stat.ps.node-loads
 3.797e+08            -2.8%   3.69e+08        perf-stat.ps.node-stores
 7.112e+12           -42.4%  4.096e+12        perf-stat.total.instructions
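
For readers who want to sanity-check the table above, the following is a minimal sketch of how the %change column can be recomputed from the two value columns, using numbers copied from this report. The helper name pct_change is hypothetical and not part of lkp-tests. The MPKI cross-check is an assumption inferred from the numbers (the reported MPKI is close to cache-references per thousand instructions); it is not a statement about how lkp computes that metric.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# Values copied from the table: parent commit 7476b91d4d vs. a0ac629ebe.
iops_base, iops_new = 22146, 12562        # fio.read_iops
bw_base, bw_new = 44292, 25124            # fio.read_bw_MBps
refs_base, refs_new = 1.506e9, 1.463e9    # perf-stat.ps.cache-references
insn_base, insn_new = 3.53e10, 2.034e10   # perf-stat.ps.instructions

iops_delta = pct_change(iops_base, iops_new)
bw_delta = pct_change(bw_base, bw_new)

# With bs=2M, bandwidth divided by IOPS should be the same on both sides,
# so the bandwidth and IOPS regressions should be identical.
bw_per_io_base = bw_base / iops_base
bw_per_io_new = bw_new / iops_new

# Assumption: MPKI here tracks cache-references per 1000 instructions.
mpki_base = refs_base / insn_base * 1000
mpki_new = refs_new / insn_new * 1000

print(f"fio.read_iops change:    {iops_delta:.1f}%")
print(f"fio.read_bw_MBps change: {bw_delta:.1f}%")
print(f"MB per I/O: {bw_per_io_base:.1f} -> {bw_per_io_new:.1f}")
print(f"estimated MPKI: {mpki_base:.2f} -> {mpki_new:.2f}")
```

Both throughput metrics drop by the same fraction while the instruction count falls by a similar amount (-42.4%), which is consistent with the same copy work simply proceeding more slowly rather than extra work being done per I/O.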


                                                                                
                                  fio.read_bw_MBps                              
                                                                                
  46000 +-------------------------------------------------------------------+   
  44000 |..+.+..+.+..+..+.+..+..+.+.. .+..  .+.+..+.+..+                    |   
        |                            +    +.                                |   
  42000 |-+                                                                 |   
  40000 |-+                                                                 |   
  38000 |-+                                                                 |   
  36000 |-+                                                                 |   
        |                                                                   |   
  34000 |-+                                                                 |   
  32000 |-+                                                                 |   
  30000 |-+                                                                 |   
  28000 |-+                                                                 |   
        |                                                                   |   
  26000 |-+O O  O O  O  O O  O  O O  O O  O  O O  O O  O  O O  O  O O  O O  |   
  24000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                    fio.read_iops                               
                                                                                
  23000 +-------------------------------------------------------------------+   
  22000 |..+.+..+.+..+..+.+..+..+.+.. .+..  .+.+..+.+..+                    |   
        |                            +    +.                                |   
  21000 |-+                                                                 |   
  20000 |-+                                                                 |   
  19000 |-+                                                                 |   
  18000 |-+                                                                 |   
        |                                                                   |   
  17000 |-+                                                                 |   
  16000 |-+                                                                 |   
  15000 |-+                                                                 |   
  14000 |-+                                                                 |   
        |                                                                   |   
  13000 |-+O O  O O  O  O O  O  O O  O O  O  O O  O O  O  O O  O  O O  O O  |   
  12000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_mean_us                          
                                                                                
  1.2e+08 +-----------------------------------------------------------------+   
          |  O O  O O    O  O O  O O  O O  O  O O  O O  O O  O O  O O  O O  |   
  1.1e+08 |-+                                                               |   
          |                                                                 |   
          |                                                                 |   
    1e+08 |-+                                                               |   
          |                                                                 |   
    9e+07 |-+                                                               |   
          |                                                                 |   
    8e+07 |-+                                                               |   
          |                                                                 |   
          |                                                                 |   
    7e+07 |..+.+..      .+..+.    .+..+.+..+..+.+..+.+..                    |   
          |       +.+..+      +..+                      +                   |   
    6e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_90__us                           
                                                                                
  1.3e+08 +-----------------------------------------------------------------+   
          |                                                                 |   
  1.2e+08 |-+O O  O O  O O  O O  O O  O O  O  O O  O O  O O  O O  O O  O O  |   
          |                                                                 |   
  1.1e+08 |-+                                                               |   
          |                                                                 |   
    1e+08 |-+                                                               |   
          |                                                                 |   
    9e+07 |-+                                                               |   
          |                                                                 |   
    8e+07 |-+                                                               |   
          |                                                                 |   
    7e+07 |..+.+..+.+..+.+..+.+..+.+..+.+..+..+.+..+.+..+                   |   
          |                                                                 |   
    6e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_clat_95__us                           
                                                                                
  1.3e+08 +-----------------------------------------------------------------+   
          |            O O       O O  O                                     |   
  1.2e+08 |-+O O  O O       O O         O  O  O O  O O  O O  O O  O O  O O  |   
          |                                                                 |   
  1.1e+08 |-+                                                               |   
          |                                                                 |   
    1e+08 |-+                                                               |   
          |                                                                 |   
    9e+07 |-+                                                               |   
          |                                                                 |   
    8e+07 |-+                                                               |   
          |                          .+.  .+..                              |   
    7e+07 |..+.+..+.+..+.+..+.+..+.+.   +.    +.+..+.+..+                   |   
          |                                                                 |   
    6e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.read_slat_mean_us                          
                                                                                
    4e+06 +-----------------------------------------------------------------+   
  3.8e+06 |-+O O  O O  O O  O O  O O  O O  O  O O  O O  O O  O O  O O  O O  |   
          |                                                                 |   
  3.6e+06 |-+                                                               |   
  3.4e+06 |-+                                                               |   
          |                                                                 |   
  3.2e+06 |-+                                                               |   
    3e+06 |-+                                                               |   
  2.8e+06 |-+                                                               |   
          |                                                                 |   
  2.6e+06 |-+                                                               |   
  2.4e+06 |-+                                                               |   
          |                                                                 |   
  2.2e+06 |..+.+..+.+..+.+..+.+..+.+..+.+..+..+.+..+.+..+                   |   
    2e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.latency_10ms_                              
                                                                                
  0.15 +--------------------------------------------------------------------+   
  0.14 |..+.+..+..+.+..+..+.+..+..+.+..+..+.+..+.+..+..+                    |   
       |                                                                    |   
  0.13 |-+                                                                  |   
  0.12 |-+                                                                  |   
  0.11 |-+                                                                  |   
   0.1 |-+                                                                  |   
       |                                                                    |   
  0.09 |-+                                                                  |   
  0.08 |-+                                                                  |   
  0.07 |-+                                                                  |   
  0.06 |-+                                                                  |   
       |                                                                    |   
  0.05 |-+O O  O  O O  O  O O  O  O O  O  O O  O O  O  O O  O  O O  O  O O  |   
  0.04 +--------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 fio.latency_20ms_                              
                                                                                
  0.55 +--------------------------------------------------------------------+   
       |  +                            +    +                               |   
   0.5 |-+ :                          + +  ::                               |   
  0.45 |++ :      +                  +   + : :                              |   
       |    +     :                 +     +  :                              |   
   0.4 |-+   :   : :                :         :                             |   
  0.35 |-+   :   : :               :          :                             |   
       |      : :  :               :           :       +                    |   
   0.3 |-+    : :  :      +.       :           +      +                     |   
  0.25 |-+     :    :   ..  +..   :             +    +                      |   
       |       +    +..+       +..+              +..+                       |   
   0.2 |-+                                                                  |   
  0.15 |-+                                                                  |   
       |  O O  O  O O  O  O O  O  O O  O  O O  O O  O  O O  O  O O  O  O O  |   
   0.1 +--------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.latency_100ms_                              
                                                                                
  100 +---------------------------------------------------------------------+   
   90 |-+                   +       +.+..+..+    +.                         |   
      |                                                                     |   
   80 |-+                                                                   |   
   70 |-+                                                                   |   
      |                                                                     |   
   60 |-+                                                                   |   
   50 |-+                                                                   |   
   40 |-+                                                                   |   
      |                                                                     |   
   30 |-+                                                                   |   
   20 |-+                                                                   |   
      |                                                                     |   
   10 |-+                                                                   |   
    0 +---------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                fio.latency_250ms_                              
                                                                                
  100 +---------------------------------------------------------------------+   
   90 |-+                     O                                             |   
      |                                                                     |   
   80 |-+                                                                   |   
   70 |-+                                                                   |   
      |                                                                     |   
   60 |-+                                                                   |   
   50 |-+                                                                   |   
   40 |-+                                                                   |   
      |                                                                     |   
   30 |-+                                                                   |   
   20 |-+                                                                   |   
      |                                                                     |   
   10 |-+                                                                   |   
    0 +---------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                     fio.workload                               
                                                                                
  4.6e+06 +-----------------------------------------------------------------+   
  4.4e+06 |..+.+..+.+..+.+..+.+..+.+.. .+..  .+.+..+.+..+                   |   
          |                           +    +.                               |   
  4.2e+06 |-+                                                               |   
    4e+06 |-+                                                               |   
  3.8e+06 |-+                                                               |   
  3.6e+06 |-+                                                               |   
          |                                                                 |   
  3.4e+06 |-+                                                               |   
  3.2e+06 |-+                                                               |   
    3e+06 |-+                                                               |   
  2.8e+06 |-+                                                               |   
          |                                                                 |   
  2.6e+06 |-+O O  O O  O O  O O  O O  O O  O  O O  O O  O O  O O  O O  O O  |   
  2.4e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-csl-2sp6: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  2M/gcc-9/performance/2pmem/xfs/sync/x86_64-rhel-8.3/dax/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01

commit: 
  7476b91d4d ("x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()")
  a0ac629ebe ("x86/copy_mc: Introduce copy_mc_generic()")

7476b91d4db369d8 a0ac629ebe7b3d248cb93807782 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.61 ± 15%      -0.4        0.22 ± 94%  fio.latency_1000us%
      0.01 ± 11%      +1.3        1.27 ± 25%  fio.latency_10ms%
     96.06           -95.5        0.60 ± 80%  fio.latency_2ms%
      1.27 ± 33%     +96.2       97.48        fio.latency_4ms%
      1.29 ± 55%      -1.2        0.05 ± 54%  fio.latency_500us%
     75143           -55.6%      33381        fio.read_bw_MBps
   1372160          +118.5%    2998272        fio.read_clat_90%_us
   1409024          +116.9%    3055616        fio.read_clat_95%_us
   2142208 ± 19%    +120.3%    4718592 ± 17%  fio.read_clat_99%_us
   1272849          +125.4%    2869293        fio.read_clat_mean_us
    228201 ± 15%    +103.6%     464620 ± 14%  fio.read_clat_stddev
     37571           -55.6%      16690        fio.read_iops
     69.28 ±  2%     -40.3%      41.38 ±  3%  fio.time.user_time
   7514438           -55.6%    3338252        fio.workload
      0.11 ±  3%      +0.0        0.14 ±  5%  mpstat.cpu.all.soft%
      0.43 ±  3%      -0.1        0.28 ±  2%  mpstat.cpu.all.usr%
    115069            -2.3%     112454        proc-vmstat.nr_shmem
     20846 ±  6%     -27.8%      15052 ±  3%  proc-vmstat.pgactivate
    967.50 ± 27%     -50.0%     483.75 ± 78%  slabinfo.xfs_buf_item.active_objs
    967.50 ± 27%     -50.0%     483.75 ± 78%  slabinfo.xfs_buf_item.num_objs
    100.00            -2.0%      98.00        vmstat.io.bo
      1672            -3.3%       1616        vmstat.system.cs
 9.059e+09 ±  6%     -32.3%  6.131e+09 ± 54%  cpuidle.C1E.time
  19004364 ±  3%     -22.4%   14741281 ± 34%  cpuidle.C1E.usage
 4.034e+08 ±133%    +713.0%   3.28e+09 ±100%  cpuidle.C6.time
    570211 ±122%    +571.6%    3829822 ± 86%  cpuidle.C6.usage
     61.80 ±  9%     -17.6       44.19        perf-profile.calltrace.cycles-pp.copy_mc_fragile.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
      0.00            +7.8        7.81 ±  6%  perf-profile.calltrace.cycles-pp.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter.dax_iomap_actor
      0.00           +29.2       29.21 ±  5%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.copy_mc_generic.copy_mc_to_user.copyout_mc._copy_mc_to_iter
     61.92 ±  9%     -17.7       44.25        perf-profile.children.cycles-pp.copy_mc_fragile
      3.47 ±132%     +11.7       15.21 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00           +22.3       22.32        perf-profile.children.cycles-pp.copy_mc_generic
     61.16 ±  9%     -17.4       43.78        perf-profile.self.cycles-pp.copy_mc_fragile
      0.00           +22.1       22.09        perf-profile.self.cycles-pp.copy_mc_generic
    212.00 ± 38%    +288.6%     823.90 ± 67%  sched_debug.cfs_rq:/.exec_clock.min
     34013 ±  3%     -17.1%      28181 ±  2%  sched_debug.cfs_rq:/.exec_clock.stddev
     36118 ±  5%     -15.0%      30710 ±  2%  sched_debug.cfs_rq:/.min_vruntime.stddev
     36118 ±  5%     -15.0%      30707 ±  2%  sched_debug.cfs_rq:/.spread0.stddev
      9.52 ± 11%     +33.8%      12.73 ±  9%  sched_debug.cpu.clock.stddev
     17832 ± 13%     -47.5%       9368 ± 17%  sched_debug.cpu.sched_count.max
      2475 ±  9%     -34.4%       1624 ±  8%  sched_debug.cpu.sched_count.stddev
      8858 ± 13%     -48.3%       4577 ± 18%  sched_debug.cpu.sched_goidle.max
      1260 ±  9%     -33.2%     841.68 ±  8%  sched_debug.cpu.sched_goidle.stddev
      8285 ± 16%     -32.1%       5622 ±  7%  sched_debug.cpu.ttwu_count.max
      1169 ±  9%     -24.9%     878.40 ±  4%  sched_debug.cpu.ttwu_count.stddev
     26587 ±  8%     -21.9%      20773 ± 22%  softirqs.CPU1.SCHED
     19906 ± 37%     -55.7%       8824 ± 96%  softirqs.CPU10.SCHED
     21997 ± 34%     -82.2%       3910 ± 55%  softirqs.CPU20.SCHED
      5126 ± 70%    +166.6%      13666 ± 15%  softirqs.CPU30.SCHED
      5567 ± 56%    +165.3%      14772 ± 29%  softirqs.CPU31.SCHED
     10027 ± 35%    +101.3%      20182 ± 18%  softirqs.CPU33.SCHED
      4868 ± 50%    +112.6%      10349 ± 14%  softirqs.CPU44.SCHED
      6304 ± 60%    +154.5%      16043 ± 22%  softirqs.CPU46.SCHED
      4127 ± 76%    +198.6%      12326 ± 32%  softirqs.CPU49.SCHED
      6313 ± 62%     +98.5%      12530 ± 19%  softirqs.CPU51.SCHED
      8249 ± 58%    +148.7%      20515 ± 31%  softirqs.CPU57.SCHED
      6971 ±109%    +268.6%      25698 ±  8%  softirqs.CPU68.SCHED
     25116 ± 15%     -32.4%      16974 ± 12%  softirqs.CPU78.SCHED
     24757 ± 12%     -36.8%      15657 ± 27%  softirqs.CPU79.SCHED
     20231 ± 14%     -45.5%      11024 ± 24%  softirqs.CPU81.SCHED
     21830 ± 23%     -55.4%       9733 ± 67%  softirqs.CPU9.SCHED
     24043 ± 16%     -39.9%      14449 ± 23%  softirqs.CPU94.SCHED
     42.31           +68.3%      71.22        perf-stat.i.MPKI
 9.958e+09           -54.7%  4.511e+09        perf-stat.i.branch-instructions
      0.05 ±  2%      +0.0        0.08 ±  4%  perf-stat.i.branch-miss-rate%
   3682118 ±  2%      -8.2%    3381534        perf-stat.i.branch-misses
     67.34           +10.4       77.74        perf-stat.i.cache-miss-rate%
 1.709e+09           -12.2%  1.501e+09        perf-stat.i.cache-misses
 2.531e+09           -24.0%  1.923e+09        perf-stat.i.cache-references
      1639            -4.1%       1571        perf-stat.i.context-switches
      2.25          +121.4%       4.98        perf-stat.i.cpi
     99.03            -1.8%      97.24        perf-stat.i.cpu-migrations
     85.60           +14.2%      97.78        perf-stat.i.cycles-between-cache-misses
      0.00 ± 18%      +0.0        0.00 ± 44%  perf-stat.i.dTLB-load-miss-rate%
 9.996e+09           -54.5%  4.549e+09        perf-stat.i.dTLB-loads
      0.00 ±  7%      +0.0        0.00 ±  6%  perf-stat.i.dTLB-store-miss-rate%
 9.904e+09           -54.9%  4.466e+09        perf-stat.i.dTLB-stores
     44.79            +4.2       48.99        perf-stat.i.iTLB-load-miss-rate%
   2535885           -13.8%    2185118        perf-stat.i.iTLB-load-misses
   3134177           -27.3%    2278467        perf-stat.i.iTLB-loads
 5.952e+10           -54.9%  2.687e+10        perf-stat.i.instructions
     23480           -47.6%      12304        perf-stat.i.instructions-per-iTLB-miss
      0.45           -54.6%       0.20        perf-stat.i.ipc
    342.39           -51.0%     167.90        perf-stat.i.metric.M/sec
 1.165e+08 ± 30%     +72.8%  2.013e+08 ±  9%  perf-stat.i.node-load-misses
 1.257e+08 ± 26%     +41.1%  1.773e+08 ±  9%  perf-stat.i.node-loads
  2.42e+08           +19.6%  2.895e+08        perf-stat.i.node-stores
     42.53           +68.3%      71.58        perf-stat.overall.MPKI
      0.04 ±  2%      +0.0        0.07        perf-stat.overall.branch-miss-rate%
     67.52           +10.5       78.07        perf-stat.overall.cache-miss-rate%
      2.24          +122.4%       4.99        perf-stat.overall.cpi
     78.17           +14.3%      89.34        perf-stat.overall.cycles-between-cache-misses
      0.00 ± 25%      +0.0        0.00 ± 12%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ± 13%      +0.0        0.00 ± 10%  perf-stat.overall.dTLB-store-miss-rate%
     44.72            +4.2       48.96        perf-stat.overall.iTLB-load-miss-rate%
     23499           -47.6%      12306        perf-stat.overall.instructions-per-iTLB-miss
      0.45           -55.0%       0.20        perf-stat.overall.ipc
   1587395            +1.5%    1611895        perf-stat.overall.path-length
 9.912e+09           -54.7%  4.489e+09        perf-stat.ps.branch-instructions
   3650903 ±  2%      -8.4%    3345674        perf-stat.ps.branch-misses
 1.701e+09           -12.2%  1.494e+09        perf-stat.ps.cache-misses
  2.52e+09           -24.1%  1.914e+09        perf-stat.ps.cache-references
      1616            -3.7%       1556        perf-stat.ps.context-switches
  9.95e+09           -54.5%  4.526e+09        perf-stat.ps.dTLB-loads
 9.859e+09           -54.9%  4.445e+09        perf-stat.ps.dTLB-stores
   2521574           -13.8%    2172342        perf-stat.ps.iTLB-load-misses
   3116655           -27.3%    2264894        perf-stat.ps.iTLB-loads
 5.925e+10           -54.9%  2.673e+10        perf-stat.ps.instructions
 1.159e+08 ± 30%     +72.8%  2.003e+08 ±  9%  perf-stat.ps.node-load-misses
  1.25e+08 ± 26%     +41.1%  1.764e+08 ±  9%  perf-stat.ps.node-loads
 2.407e+08           +19.6%  2.878e+08        perf-stat.ps.node-stores
 1.193e+13           -54.9%  5.381e+12        perf-stat.total.instructions
      0.00       +2.7e+105%       2689 ±171%  interrupts.115:PCI-MSI.31981648-edge.i40e-eth0-TxRx-79
     62.75 ± 27%     +51.8%      95.25 ± 21%  interrupts.CPU1.RES:Rescheduling_interrupts
      6530 ± 17%     -44.1%       3647 ± 35%  interrupts.CPU17.NMI:Non-maskable_interrupts
      6530 ± 17%     -44.1%       3647 ± 35%  interrupts.CPU17.PMI:Performance_monitoring_interrupts
     62.00 ± 74%    +187.9%     178.50 ±  5%  interrupts.CPU20.RES:Rescheduling_interrupts
    365.00 ± 78%     -76.0%      87.50 ± 53%  interrupts.CPU25.TLB:TLB_shootdowns
    170.50 ± 15%     -26.8%     124.75 ± 10%  interrupts.CPU30.RES:Rescheduling_interrupts
      7605           -43.3%       4316 ± 32%  interrupts.CPU31.NMI:Non-maskable_interrupts
      7605           -43.3%       4316 ± 32%  interrupts.CPU31.PMI:Performance_monitoring_interrupts
    169.00 ± 12%     -37.1%     106.25 ± 23%  interrupts.CPU31.RES:Rescheduling_interrupts
      7145 ± 11%     -33.0%       4786 ± 18%  interrupts.CPU36.NMI:Non-maskable_interrupts
      7145 ± 11%     -33.0%       4786 ± 18%  interrupts.CPU36.PMI:Performance_monitoring_interrupts
    136.50 ± 27%     -44.7%      75.50 ± 60%  interrupts.CPU39.TLB:TLB_shootdowns
    149.25 ± 24%     -24.6%     112.50 ± 30%  interrupts.CPU4.RES:Rescheduling_interrupts
      7599           -46.6%       4061 ± 35%  interrupts.CPU41.NMI:Non-maskable_interrupts
      7599           -46.6%       4061 ± 35%  interrupts.CPU41.PMI:Performance_monitoring_interrupts
      6661 ± 24%     -52.1%       3191 ± 51%  interrupts.CPU44.NMI:Non-maskable_interrupts
      6661 ± 24%     -52.1%       3191 ± 51%  interrupts.CPU44.PMI:Performance_monitoring_interrupts
      7622           -43.5%       4307 ± 33%  interrupts.CPU46.NMI:Non-maskable_interrupts
      7622           -43.5%       4307 ± 33%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
      7613           -43.1%       4331 ± 31%  interrupts.CPU47.NMI:Non-maskable_interrupts
      7613           -43.1%       4331 ± 31%  interrupts.CPU47.PMI:Performance_monitoring_interrupts
      5823 ± 32%     -36.4%       3703 ± 34%  interrupts.CPU5.NMI:Non-maskable_interrupts
      5823 ± 32%     -36.4%       3703 ± 34%  interrupts.CPU5.PMI:Performance_monitoring_interrupts
     89.25 ± 48%     -61.1%      34.75 ± 31%  interrupts.CPU53.TLB:TLB_shootdowns
      5698 ± 33%     -42.5%       3277 ± 49%  interrupts.CPU55.NMI:Non-maskable_interrupts
      5698 ± 33%     -42.5%       3277 ± 49%  interrupts.CPU55.PMI:Performance_monitoring_interrupts
    172.00 ± 14%     -35.2%     111.50 ± 41%  interrupts.CPU56.RES:Rescheduling_interrupts
     64.00 ± 42%     -39.5%      38.75 ± 29%  interrupts.CPU56.TLB:TLB_shootdowns
    156.00 ± 17%     -36.2%      99.50 ± 21%  interrupts.CPU57.RES:Rescheduling_interrupts
    146.25 ± 28%     -48.9%      74.75 ± 67%  interrupts.CPU58.RES:Rescheduling_interrupts
      7627           -47.0%       4043 ± 31%  interrupts.CPU62.NMI:Non-maskable_interrupts
      7627           -47.0%       4043 ± 31%  interrupts.CPU62.PMI:Performance_monitoring_interrupts
    174.75 ± 12%     -29.9%     122.50 ± 30%  interrupts.CPU62.RES:Rescheduling_interrupts
     76.00 ± 29%     -48.4%      39.25 ± 29%  interrupts.CPU62.TLB:TLB_shootdowns
      7159 ± 11%     -50.2%       3564 ± 32%  interrupts.CPU63.NMI:Non-maskable_interrupts
      7159 ± 11%     -50.2%       3564 ± 32%  interrupts.CPU63.PMI:Performance_monitoring_interrupts
      7628           -62.9%       2831        interrupts.CPU66.NMI:Non-maskable_interrupts
      7628           -62.9%       2831        interrupts.CPU66.PMI:Performance_monitoring_interrupts
    174.50 ± 10%     -36.4%     111.00 ± 50%  interrupts.CPU66.RES:Rescheduling_interrupts
      4370 ± 18%     -34.7%       2853        interrupts.CPU69.NMI:Non-maskable_interrupts
      4370 ± 18%     -34.7%       2853        interrupts.CPU69.PMI:Performance_monitoring_interrupts
      6885 ± 18%     -45.8%       3731 ± 28%  interrupts.CPU74.NMI:Non-maskable_interrupts
      6885 ± 18%     -45.8%       3731 ± 28%  interrupts.CPU74.PMI:Performance_monitoring_interrupts
      5900 ± 18%     -57.5%       2510 ± 24%  interrupts.CPU77.NMI:Non-maskable_interrupts
      5900 ± 18%     -57.5%       2510 ± 24%  interrupts.CPU77.PMI:Performance_monitoring_interrupts
     62.00 ± 41%     +58.9%      98.50 ± 14%  interrupts.CPU78.RES:Rescheduling_interrupts
      0.00       +2.7e+105%       2689 ±171%  interrupts.CPU79.115:PCI-MSI.31981648-edge.i40e-eth0-TxRx-79
     49.75 ± 47%    +119.6%     109.25 ± 28%  interrupts.CPU79.RES:Rescheduling_interrupts
     61.50 ± 54%    +115.0%     132.25 ± 31%  interrupts.CPU8.RES:Rescheduling_interrupts
      5871 ± 19%     -38.8%       3594 ± 32%  interrupts.CPU80.NMI:Non-maskable_interrupts
      5871 ± 19%     -38.8%       3594 ± 32%  interrupts.CPU80.PMI:Performance_monitoring_interrupts
     60.50 ± 19%    +120.2%     133.25 ± 14%  interrupts.CPU81.RES:Rescheduling_interrupts
     36.00 ± 79%    +179.2%     100.50 ± 35%  interrupts.CPU86.RES:Rescheduling_interrupts
      6322 ± 21%     -60.6%       2490 ± 25%  interrupts.CPU88.NMI:Non-maskable_interrupts
      6322 ± 21%     -60.6%       2490 ± 25%  interrupts.CPU88.PMI:Performance_monitoring_interrupts
     32.50 ± 40%    +150.0%      81.25 ± 41%  interrupts.CPU92.RES:Rescheduling_interrupts
    124.00 ± 11%     -22.8%      95.75 ±  4%  interrupts.IWI:IRQ_work_interrupts
    538989 ±  8%     -28.0%     387910 ±  2%  interrupts.NMI:Non-maskable_interrupts
    538989 ±  8%     -28.0%     387910 ±  2%  interrupts.PMI:Performance_monitoring_interrupts
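
As a rough cross-check on how to read the comparison table above, the
sketch below recomputes a few entries from the values shown. It is not
part of lkp-tests: the helper names (pct_change, point_change, mpki) are
made up for illustration. It assumes that rows whose metric name ends
in "%" report an absolute difference in percentage points, that the
other rows report a relative change, and that MPKI is cache references
per thousand instructions. These assumptions are consistent with the
numbers in the table but are not stated in the report.

```python
def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, as used for metrics that are already percentages."""
    return new - base


def mpki(cache_references, instructions):
    """Cache references per thousand instructions (assumed definition)."""
    return cache_references / instructions * 1000.0


# Values copied from the table above: 7476b91d4d (base) vs a0ac629ebe (new).
read_iops = pct_change(37571, 16690)
read_bw = pct_change(75143, 33381)
latency_2ms = point_change(96.06, 0.60)

# perf-stat.ps.cache-references and perf-stat.ps.instructions for each commit.
mpki_base = mpki(2.52e9, 5.925e10)
mpki_new = mpki(1.914e9, 2.673e10)

# IPC is the reciprocal of CPI (perf-stat.overall.cpi).
ipc_base = 1 / 2.24
ipc_new = 1 / 4.99

print(f"fio.read_iops       {read_iops:+.1f}%")
print(f"fio.read_bw_MBps    {read_bw:+.1f}%")
print(f"fio.latency_2ms%    {latency_2ms:+.1f} points")
print(f"MPKI                {mpki_base:.2f} -> {mpki_new:.2f}")
print(f"ipc                 {ipc_base:.2f} -> {ipc_new:.2f}")
```

Because the inputs are the rounded figures printed in the table, the
recomputed MPKI for the new commit differs slightly in the last digits
from the 71.58 reported under perf-stat.overall.MPKI.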





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.8.0-rc5-00002-ga0ac629ebe7b3d" of type "text/plain" (158408 bytes)

View attachment "job-script" of type "text/plain" (8311 bytes)

View attachment "job.yaml" of type "text/plain" (5718 bytes)

View attachment "reproduce" of type "text/plain" (948 bytes)
