Message-ID: <20181102013240.GG24195@shao2-debian>
Date:   Fri, 2 Nov 2018 09:32:41 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Roman Gushchin <guro@...com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Rik van Riel <riel@...riel.com>, Josef Bacik <jbacik@...com>,
        Johannes Weiner <hannes@...xchg.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [mm]  172b06c32b:  fsmark.files_per_sec -2.0% regression

Greetings,

FYI, we noticed a -2.0% regression of fsmark.files_per_sec due to commit:


commit: 172b06c32b949759fe6313abec514bc4f15014f4 ("mm: slowly shrink slabs with a relatively small number of objects")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fsmark
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with the following parameters:

	iterations: 1x
	nr_threads: 64t
	disk: 1BRD_48G
	fs: btrfs
	fs2: nfsv4
	filesize: 4M
	test_size: 40G
	sync_method: fsyncBeforeClose
	ucode: 0x42d
	cpufreq_governor: performance

test-description: fsmark is a file system benchmark that tests synchronous write workloads, for example a mail server workload.
test-url: https://sourceforge.net/projects/fsmark/
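
The sync_method above, fsyncBeforeClose, means each file's data is flushed to stable storage before its descriptor is closed, which is what makes the workload synchronous. A minimal C sketch of that per-file pattern (illustrative only; the benchmark's actual loop is in the fs_mark sources and the attached "reproduce" file):

        /* Minimal sketch of the fsyncBeforeClose pattern that fsmark times:
         * create a file, write its contents, fsync(), then close(). */
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>

        static void write_one_file(const char *path, const char *buf, size_t len)
        {
                int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

                if (fd < 0 || write(fd, buf, len) != (ssize_t)len) {
                        perror(path);
                        exit(EXIT_FAILURE);
                }
                if (fsync(fd) < 0) {    /* flush before close: the step under test */
                        perror("fsync");
                        exit(EXIT_FAILURE);
                }
                close(fd);
        }

With 4M files and an fsync before every close, throughput is dominated by how quickly dirty data reaches the NFS server's backing btrfs filesystem.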

In addition, the commit has a significant impact on the following tests:

+------------------+-----------------------------------------------------------------------+
| testcase: change | vm-scalability:                                                       |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                          |
|                  | runtime=300s                                                          |
|                  | test=lru-file-mmap-read-rand                                          |
+------------------+-----------------------------------------------------------------------+
| testcase: change | vm-scalability:                                                       |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                          |
|                  | runtime=300s                                                          |
|                  | test=lru-file-readtwice                                               |
+------------------+-----------------------------------------------------------------------+
| testcase: change | fio-basic:                                                            |
| test machine     | 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory |
| test parameters  | bs=4k                                                                 |
|                  | cpufreq_governor=performance                                          |
|                  | disk=1SSD                                                             |
|                  | fs2=nfsv4                                                             |
|                  | fs=ext4                                                               |
|                  | ioengine=sync                                                         |
|                  | nr_task=8                                                             |
|                  | runtime=300s                                                          |
|                  | rw=write                                                              |
|                  | test_size=512g                                                        |
|                  | ucode=0x3d                                                            |
+------------------+-----------------------------------------------------------------------+
| testcase: change | vm-scalability: vm-scalability.median -1.5% regression                |
| test machine     | 80 threads Skylake with 64G memory                                    |
| test parameters  | cpufreq_governor=performance                                          |
|                  | runtime=300s                                                          |
|                  | test=lru-file-readonce                                                |
+------------------+-----------------------------------------------------------------------+


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # install dependencies; the job file is attached to this email
        bin/lkp run     job.yaml  # run the attached job
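
For context before the detailed numbers: the bisected commit changes how many objects the slab shrinker scans per pass, adding a floor so that caches with only a few freeable objects still receive some pressure. The self-contained model below paraphrases the do_shrink_slab() math for illustration only; it is not the kernel source, and the constants mirror the kernel's usual defaults (SHRINK_BATCH == 128, seeks == 2, priority == 12) as assumptions of this sketch:

        #include <stdio.h>

        #define SHRINK_BATCH 128        /* assumed default shrinker batch size */

        /* Model of the per-pass scan count ("delta") computed by
         * do_shrink_slab() in mm/vmscan.c, after the commit under test. */
        static unsigned long scan_delta(unsigned long freeable, int priority,
                                        unsigned long seeks,
                                        unsigned long batch_size)
        {
                unsigned long delta = (freeable >> priority) * 4 / seeks;
                unsigned long floor = freeable < batch_size ? freeable
                                                            : batch_size;

                /* the change under test: enforce a minimal amount of
                 * pressure so small caches are still scanned eventually */
                return delta > floor ? delta : floor;
        }

        int main(void)
        {
                /* A cache with 50 freeable objects at default priority 12
                 * previously got a delta of 0 (50 >> 12 == 0); with the
                 * floor it is scanned in full. */
                printf("old delta: %lu\n", (50UL >> 12) * 4 / 2);
                printf("new delta: %lu\n", scan_delta(50, 12, 2, SHRINK_BATCH));
                return 0;
        }

That extra pressure is consistent with the higher slabs_scanned counts and the shrunken dentry and inode slabs reported below.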

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
  gcc-7/performance/1BRD_48G/4M/nfsv4/btrfs/1x/x86_64-rhel-7.2/64t/debian-x86_64-2018-04-03.cgz/fsyncBeforeClose/ivb44/40G/fsmark/0x42d

commit: 
  3bf181bc5d ("kernel/sys.c: remove duplicated include")
  172b06c32b ("mm: slowly shrink slabs with a relatively small number of objects")

3bf181bc5d8bc86f 172b06c32b949759fe6313abec 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     dmesg.WARNING:at_ip__slab_free/0x
          0:4            1%           0:4     perf-profile.children.cycles-pp.schedule_timeout
         %stddev     %change         %stddev
             \          |                \  
    554.27            -2.0%     542.92        fsmark.files_per_sec
    186948 ±  6%     -53.5%      87005 ±  7%  numa-vmstat.node0
    187962 ±  3%     -57.0%      80754 ±  9%  numa-vmstat.node1
      3338 ±  4%     -52.4%       1588 ± 57%  numa-vmstat.node1.workingset_refault
    164.50 ±  4%   +1047.3%       1887 ±  5%  nfsstat.Client.nfs.v4.access
      0.25 ±173%   +1000.0%       2.75 ± 15%  nfsstat.Client.nfs.v4.access.percent
     58.00          +211.2%     180.50 ±  9%  nfsstat.Client.nfs.v4.create
      5450 ±  4%     +30.3%       7102 ±  3%  nfsstat.Server.nfs.v4.operations.access
      3.00           +33.3%       4.00        nfsstat.Server.nfs.v4.operations.access.percent
     58.00          +210.8%     180.25 ±  9%  nfsstat.Server.nfs.v4.operations.create
 2.664e+09            +2.5%  2.731e+09        perf-stat.cache-misses
 5.774e+09            +3.2%   5.96e+09 ±  2%  perf-stat.cache-references
   3588012            +3.5%    3713986        perf-stat.context-switches
 8.779e+08            +3.7%  9.104e+08 ±  2%  perf-stat.node-loads
     17.16 ±  3%      +1.3       18.47 ±  3%  perf-stat.node-store-miss-rate%
 3.888e+08 ±  5%     +12.0%  4.354e+08 ±  4%  perf-stat.node-store-misses
 1.876e+09 ±  2%      +2.4%  1.922e+09        perf-stat.node-stores
    108.71 ±  8%     -25.7%      80.80 ± 18%  sched_debug.cfs_rq:/.load_avg.avg
    274.38 ±  5%     -19.4%     221.05 ± 13%  sched_debug.cfs_rq:/.load_avg.stddev
      2200 ± 18%     -55.8%     972.90 ± 71%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
      9879 ±  8%     -42.7%       5658 ± 61%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
     16.59 ± 23%     -63.0%       6.14 ± 69%  sched_debug.cfs_rq:/.removed.util_avg.avg
    407.75 ± 16%     -44.9%     224.75 ± 57%  sched_debug.cfs_rq:/.removed.util_avg.max
     75.95 ± 13%     -52.7%      35.90 ± 60%  sched_debug.cfs_rq:/.removed.util_avg.stddev
      3728 ±  6%    +163.5%       9824 ± 93%  sched_debug.cfs_rq:/.runnable_weight.avg
     34.85 ± 10%     +73.1%      60.32 ± 21%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    119.21 ±  3%     +35.2%     161.20 ± 13%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
   5941092            +2.6%    6098104        proc-vmstat.nr_file_pages
   5668396            +2.7%    5824121        proc-vmstat.nr_inactive_file
     33843            -1.3%      33408        proc-vmstat.nr_slab_unreclaimable
   5668429            +2.7%    5824050        proc-vmstat.nr_zone_inactive_file
   2304515 ± 14%     +28.6%    2964651 ± 12%  proc-vmstat.numa_foreign
  29917905            -2.2%   29249794        proc-vmstat.numa_hit
  29908107            -2.2%   29240012        proc-vmstat.numa_local
   2304515 ± 14%     +28.6%    2964651 ± 12%  proc-vmstat.numa_miss
   2314313 ± 14%     +28.5%    2974432 ± 12%  proc-vmstat.numa_other
     35938 ±  9%     +36.9%      49187 ±  5%  proc-vmstat.pgactivate
    595040 ±  4%     +25.6%     747079 ±  3%  proc-vmstat.slabs_scanned
      0.21 ±  8%      -0.1        0.15 ± 19%  perf-profile.children.cycles-pp.nfs_get_lock_context
      0.36 ±  8%      -0.0        0.31 ±  4%  perf-profile.children.cycles-pp.radix_tree_tag_clear
      0.53 ±  5%      -0.0        0.49 ±  6%  perf-profile.children.cycles-pp.sched_ttwu_pending
      0.08 ±  8%      -0.0        0.05 ± 58%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.13 ±  3%      -0.0        0.11 ± 11%  perf-profile.children.cycles-pp.read_tsc
      0.10 ±  4%      -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.iov_iter_fault_in_readable
      0.08 ±  5%      -0.0        0.07 ± 10%  perf-profile.children.cycles-pp.replace_slot
      0.07 ±  7%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.__set_extent_bit
      0.07 ± 20%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.__switch_to
      0.15 ± 10%      +0.0        0.17 ±  4%  perf-profile.children.cycles-pp.btrfs_commit_transaction
      0.04 ± 58%      +0.0        0.07 ±  6%  perf-profile.children.cycles-pp.finish_task_switch
      0.24 ±  9%      +0.0        0.27 ± 10%  perf-profile.children.cycles-pp.find_busiest_group
      0.33 ±  8%      +0.0        0.37 ±  3%  perf-profile.children.cycles-pp.iov_iter_advance
      0.36 ±  5%      +0.0        0.40 ±  5%  perf-profile.children.cycles-pp.nfs_scan_commit
      0.00            +0.1        0.05 ±  9%  perf-profile.children.cycles-pp.link_path_walk
      0.23 ±  7%      +0.1        0.29 ±  9%  perf-profile.children.cycles-pp.mutex_spin_on_owner
      0.12 ±  4%      -0.0        0.10 ±  8%  perf-profile.self.cycles-pp.read_tsc
      0.08 ±  8%      -0.0        0.06 ± 11%  perf-profile.self.cycles-pp.replace_slot
      0.10            -0.0        0.08 ±  5%  perf-profile.self.cycles-pp.iov_iter_fault_in_readable
      0.06 ± 59%      +0.0        0.10 ±  5%  perf-profile.self.cycles-pp.__switch_to
      0.23 ±  6%      +0.1        0.29 ±  9%  perf-profile.self.cycles-pp.mutex_spin_on_owner
      1271 ±  8%     -25.1%     951.50 ± 11%  slabinfo.UNIX.active_objs
      1271 ±  8%     -25.1%     951.50 ± 11%  slabinfo.UNIX.num_objs
     28364 ±  3%     -11.5%      25088 ±  4%  slabinfo.anon_vma_chain.active_objs
     28478 ±  3%     -11.8%      25116 ±  4%  slabinfo.anon_vma_chain.num_objs
      5160           +13.0%       5833        slabinfo.btrfs_extent_buffer.active_objs
      5160           +13.0%       5833        slabinfo.btrfs_extent_buffer.num_objs
     63565           -13.7%      54861        slabinfo.dentry.active_objs
      1582           -12.2%       1388        slabinfo.dentry.active_slabs
     66458           -12.2%      58340        slabinfo.dentry.num_objs
      1582           -12.2%       1388        slabinfo.dentry.num_slabs
    821.75 ±  8%     -18.3%     671.25 ±  4%  slabinfo.file_lock_cache.active_objs
    821.75 ±  8%     -18.3%     671.25 ±  4%  slabinfo.file_lock_cache.num_objs
     10074           -18.1%       8251 ±  2%  slabinfo.kmalloc-96.active_objs
     10155            -9.6%       9177 ±  2%  slabinfo.kmalloc-96.num_objs
      1241 ±  4%     -12.0%       1092 ±  7%  slabinfo.nsproxy.active_objs
      1241 ±  4%     -12.0%       1092 ±  7%  slabinfo.nsproxy.num_objs
      6642 ±  3%     -17.7%       5468 ±  5%  slabinfo.proc_inode_cache.active_objs
      7105 ±  4%     -19.3%       5734 ±  4%  slabinfo.proc_inode_cache.num_objs
      2449 ±  4%     -16.9%       2035 ±  9%  slabinfo.sock_inode_cache.active_objs
      2449 ±  4%     -16.9%       2035 ±  9%  slabinfo.sock_inode_cache.num_objs
     22286 ±  3%     -11.1%      19819 ±  6%  slabinfo.vm_area_struct.active_objs
     22396 ±  3%     -11.4%      19832 ±  6%  slabinfo.vm_area_struct.num_objs


                                                                                
                           nfsstat.Client.nfs.v4.create                         
                                                                                
  240 +-+-------------------------------------------------------------------+   
  220 +-+                             O                                     |   
      |                                                                     O   
  200 +-+        O                                                          |   
  180 +-+                        O                                     O    |   
      O    O                O                               O    O          |   
  160 +-+             O                                                     |   
  140 +-+                                        O                          |   
  120 +-+                                                                   |   
      |                                                                     |   
  100 +-+                                                                   |   
   80 +-+                                   O         O                     |   
      |                                                                     |   
   60 +-+..+.....+....+.....+....+....+.....+....+....+.....+               |   
   40 +-+-------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-bdw-ep2: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/300s/lkp-bdw-ep2/lru-file-mmap-read-rand/vm-scalability

commit: 
  3bf181bc5d ("kernel/sys.c: remove duplicated include")
  172b06c32b ("mm: slowly shrink slabs with a relatively small number of objects")

3bf181bc5d8bc86f 172b06c32b949759fe6313abec 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     dmesg.WARNING:at_ip___perf_sw_event/0x
          1:4            1%           1:4     perf-profile.children.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
      0.08 ± 17%    +900.8%       0.82        vm-scalability.stddev
    375.98           -13.7%     324.35        vm-scalability.time.elapsed_time
    375.98           -13.7%     324.35        vm-scalability.time.elapsed_time.max
      6877           +16.3%       8001        vm-scalability.time.percent_of_cpu_this_job_got
     -6.04           +32.4%      -8.00        sched_debug.cpu.nr_uninterruptible.min
    110.40 ±  5%      -9.1%     100.31 ±  4%  sched_debug.cpu.sched_goidle.avg
     20.53 ±  2%     -12.8        7.71 ±  6%  mpstat.cpu.idle%
     78.48           +12.7       91.14        mpstat.cpu.sys%
      0.95            +0.2        1.11        mpstat.cpu.usr%
    922231 ±  7%     -16.5%     769963 ±  6%  softirqs.RCU
   1059821 ±  2%     -54.4%     483561 ±  4%  softirqs.SCHED
  13106616           -13.5%   11333791        softirqs.TIMER
   1374581 ±168%     -98.3%      23531 ± 13%  cpuidle.C1.usage
  15076756 ±146%    +609.5%   1.07e+08 ± 68%  cpuidle.C1E.time
    175945 ±149%    +497.9%    1051915 ± 66%  cpuidle.C1E.usage
   4370309 ±172%     -99.7%      11986 ± 17%  cpuidle.POLL.time
     68855 ±170%     -99.3%     484.75 ±  3%  cpuidle.POLL.usage
     31836 ±  5%     -16.8%      26475 ±  7%  meminfo.CmaFree
  84746786           +19.1%  1.009e+08        meminfo.Mapped
  17222404 ±  3%     -38.8%   10542322 ±  5%  meminfo.MemFree
      1458 ± 42%     -69.0%     452.25 ±160%  meminfo.Mlocked
   6445479           +19.0%    7671618        meminfo.PageTables
      7.00           +85.7%      13.00 ± 38%  vmstat.memory.buff
  15720060 ±  3%     -38.4%    9687227 ±  5%  vmstat.memory.free
     71.50           +16.4%      83.25        vmstat.procs.r
      3837           +10.2%       4229        vmstat.system.cs
    371229            +8.6%     402988        vmstat.system.in
     56.00          +224.1%     181.50 ± 39%  slabinfo.btrfs_extent_map.active_objs
     56.00          +224.1%     181.50 ± 39%  slabinfo.btrfs_extent_map.num_objs
      9569 ±  8%     -13.6%       8263 ±  6%  slabinfo.kmalloc-512.num_objs
      1258 ±  3%      +8.8%       1369 ±  4%  slabinfo.nsproxy.active_objs
      1258 ±  3%      +8.8%       1369 ±  4%  slabinfo.nsproxy.num_objs
     10548 ±  5%     -21.0%       8338 ± 14%  slabinfo.proc_inode_cache.active_objs
     11057 ±  5%     -23.2%       8487 ± 14%  slabinfo.proc_inode_cache.num_objs
    884481 ±  6%     +12.7%     996793        numa-meminfo.node0.Active
    730964 ±  2%     +10.0%     803893 ±  5%  numa-meminfo.node0.Active(file)
      7743 ± 12%     +22.1%       9452 ±  8%  numa-meminfo.node0.KernelStack
  41822458           +19.1%   49809098        numa-meminfo.node0.Mapped
   9000212           -39.3%    5463666 ±  5%  numa-meminfo.node0.MemFree
    814.25 ± 42%     -67.8%     262.50 ±160%  numa-meminfo.node0.Mlocked
   3207907           +19.7%    3838678        numa-meminfo.node0.PageTables
  42192684           +19.7%   50502818        numa-meminfo.node1.Mapped
   8659926 ±  8%     -36.4%    5511081 ±  4%  numa-meminfo.node1.MemFree
   3181250 ±  2%     +19.0%    3786464        numa-meminfo.node1.PageTables
      2244           +15.5%       2591        turbostat.Avg_MHz
     80.54           +12.3       92.81        turbostat.Busy%
   1369577 ±169%     -98.8%      16378 ± 21%  turbostat.C1
    173829 ±151%    +503.8%    1049549 ± 66%  turbostat.C1E
      8.53 ± 21%     -63.4%       3.12 ±  9%  turbostat.CPU%c1
     48.50 ±  4%     +19.1%      57.75 ±  2%  turbostat.CoreTmp
      5.97 ±  6%     -73.7%       1.57 ±  7%  turbostat.Pkg%pc2
     52.75 ±  3%     +18.0%      62.25        turbostat.PkgTmp
    209.42            +7.3%     224.61        turbostat.PkgWatt
     19.61            +6.4%      20.88        turbostat.RAMWatt
     28.01            +1.0       29.04        perf-stat.cache-miss-rate%
 1.423e+11            -2.9%  1.381e+11        perf-stat.cache-references
   1450633            -4.8%    1380444        perf-stat.context-switches
      0.04 ±  3%      -0.0        0.03        perf-stat.dTLB-store-miss-rate%
 6.336e+08 ±  4%     -11.0%  5.641e+08        perf-stat.dTLB-store-misses
 1.719e+12            -1.7%  1.689e+12        perf-stat.dTLB-stores
 2.679e+09            -2.5%  2.612e+09        perf-stat.iTLB-load-misses
      8281            +2.5%       8491        perf-stat.instructions-per-iTLB-miss
    917769           -12.1%     806944        perf-stat.minor-faults
     43.94            +0.7       44.63        perf-stat.node-store-miss-rate%
  3.44e+09            +2.1%  3.514e+09        perf-stat.node-store-misses
    182104 ±  2%     +10.3%     200837 ±  5%  numa-vmstat.node0.nr_active_file
   2280457 ±  2%     -39.5%    1379202 ±  5%  numa-vmstat.node0.nr_free_pages
    371.75           +19.4%     444.00 ±  2%  numa-vmstat.node0.nr_isolated_file
      7740 ± 12%     +22.1%       9448 ±  8%  numa-vmstat.node0.nr_kernel_stack
  10407177           +19.6%   12442787        numa-vmstat.node0.nr_mapped
    204.25 ± 42%     -68.3%      64.75 ±161%  numa-vmstat.node0.nr_mlock
    798441           +20.1%     958633        numa-vmstat.node0.nr_page_table_pages
    182117 ±  2%     +10.3%     200831 ±  5%  numa-vmstat.node0.nr_zone_active_file
 2.928e+08           -10.5%  2.621e+08        numa-vmstat.node0.numa_hit
 2.928e+08           -10.5%  2.621e+08        numa-vmstat.node0.numa_local
  83842853           -17.2%   69394085        numa-vmstat.node0.workingset_refault
      8185 ±  5%     -16.5%       6838 ±  7%  numa-vmstat.node1.nr_free_cma
   2193591 ±  8%     -36.8%    1386047 ±  3%  numa-vmstat.node1.nr_free_pages
    387.25           +15.1%     445.75 ±  2%  numa-vmstat.node1.nr_isolated_file
  10499865           +20.2%   12620945        numa-vmstat.node1.nr_mapped
    162.75 ± 44%     -70.8%      47.50 ±161%  numa-vmstat.node1.nr_mlock
    791500 ±  2%     +19.5%     945585        numa-vmstat.node1.nr_page_table_pages
  2.94e+08           -10.4%  2.633e+08        numa-vmstat.node1.numa_hit
 2.938e+08           -10.4%  2.632e+08        numa-vmstat.node1.numa_local
  84630360           -16.8%   70439466        numa-vmstat.node1.workingset_refault
      1.77 ±  2%      -0.4        1.38 ± 12%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd.kthread
      1.78 ±  2%      -0.4        1.39 ± 12%  perf-profile.calltrace.cycles-pp.shrink_node.kswapd.kthread.ret_from_fork
      1.78 ±  2%      -0.4        1.39 ± 12%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.kswapd.kthread.ret_from_fork
      1.78 ±  2%      -0.4        1.39 ± 12%  perf-profile.calltrace.cycles-pp.kswapd.kthread.ret_from_fork
      0.61 ±  6%      -0.3        0.28 ±100%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd
      1.08 ±  5%      -0.3        0.82 ± 12%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd
      0.98            -0.0        0.93 ±  2%  perf-profile.calltrace.cycles-pp.isolate_lru_pages.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
     90.34            +0.3       90.63        perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
     90.03            +0.3       90.33        perf-profile.calltrace.cycles-pp.filemap_fault.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault
     90.23            +0.3       90.53        perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
      1.78 ±  2%      -0.4        1.39 ± 12%  perf-profile.children.cycles-pp.kswapd
      1.04            -0.1        0.99 ±  2%  perf-profile.children.cycles-pp.isolate_lru_pages
      0.81            -0.0        0.76 ±  2%  perf-profile.children.cycles-pp.__isolate_lru_page
      0.15 ±  3%      -0.0        0.14 ±  5%  perf-profile.children.cycles-pp.__might_sleep
     90.05            +0.3       90.34        perf-profile.children.cycles-pp.filemap_fault
     90.34            +0.3       90.63        perf-profile.children.cycles-pp.__do_fault
     90.24            +0.3       90.53        perf-profile.children.cycles-pp.__xfs_filemap_fault
      0.81            -0.0        0.76 ±  2%  perf-profile.self.cycles-pp.__isolate_lru_page
      0.26            -0.0        0.24 ±  2%  perf-profile.self.cycles-pp.page_mapping
      2239            -6.3%       2098 ±  3%  proc-vmstat.allocstall_normal
     87129            -2.7%      84785        proc-vmstat.nr_active_anon
    373892            +7.5%     402117        proc-vmstat.nr_active_file
   2853538            -1.5%    2810660        proc-vmstat.nr_dirty_background_threshold
   5714059            -1.5%    5628198        proc-vmstat.nr_dirty_threshold
  24660939            +4.9%   25873984        proc-vmstat.nr_file_pages
      8008 ±  4%     -17.0%       6644 ±  8%  proc-vmstat.nr_free_cma
   4326996 ±  2%     -38.0%    2681477 ±  4%  proc-vmstat.nr_free_pages
  24021387            +4.9%   25208284        proc-vmstat.nr_inactive_file
    771.75           +16.4%     898.00        proc-vmstat.nr_isolated_file
     15610            +3.2%      16107        proc-vmstat.nr_kernel_stack
  21160671           +18.8%   25142084        proc-vmstat.nr_mapped
    365.25 ± 42%     -68.9%     113.75 ±161%  proc-vmstat.nr_mlock
   1608928           +18.8%    1910614        proc-vmstat.nr_page_table_pages
     40353            -6.4%      37773        proc-vmstat.nr_shmem
   2143633            +5.8%    2267588        proc-vmstat.nr_slab_reclaimable
     87131            -2.7%      84787        proc-vmstat.nr_zone_active_anon
    373892            +7.5%     402072        proc-vmstat.nr_zone_active_file
  24021338            +4.9%   25208211        proc-vmstat.nr_zone_inactive_file
  76784100 ±  4%      +8.7%   83481509 ±  4%  proc-vmstat.numa_foreign
  76784100 ±  4%      +8.7%   83481509 ±  4%  proc-vmstat.numa_miss
  76801213 ±  4%      +8.7%   83498734 ±  4%  proc-vmstat.numa_other
      2188           +24.9%       2733 ± 17%  proc-vmstat.pgpgin
     24609          +723.0%     202542 ± 26%  proc-vmstat.slabs_scanned
    243402            -6.3%     228035        proc-vmstat.workingset_activate
 1.665e+08           -16.0%  1.399e+08        proc-vmstat.workingset_refault



***************************************************************************************************
lkp-bdw-ep2: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/300s/lkp-bdw-ep2/lru-file-readtwice/vm-scalability

commit: 
  3bf181bc5d ("kernel/sys.c: remove duplicated include")
  172b06c32b ("mm: slowly shrink slabs with a relatively small number of objects")

3bf181bc5d8bc86f 172b06c32b949759fe6313abec 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:4          -25%            :4     dmesg.WARNING:at_ip_fsnotify/0x
          1:4          -25%            :4     dmesg.page_allocation_failure:order:#,mode:#(GFP_KERNEL|__GFP_COMP),nodemask=(null)
         %stddev     %change         %stddev
             \          |                \  
    332.32            -7.6%     307.22        vm-scalability.time.elapsed_time
    332.32            -7.6%     307.22        vm-scalability.time.elapsed_time.max
    106418            +0.6%     107079        vm-scalability.time.minor_page_faults
      7137            +8.3%       7731        vm-scalability.time.percent_of_cpu_this_job_got
      2087 ± 12%     -10.0%       1879        boot-time.idle
    354695            +2.7%     364398        interrupts.CAL:Function_call_interrupts
      9.96            -8.0        1.97 ± 40%  mpstat.cpu.idle%
    429.04            +4.1%     446.76        pmeter.Average_Active_Power
     45161            -4.1%      43310        pmeter.performance_per_watt
     12174 ± 12%     -25.1%       9113 ±  3%  softirqs.NET_RX
    601034           -47.7%     314172 ±  4%  softirqs.SCHED
     15.33 ±  3%     +10.9%      17.00        vmstat.memory.buff
   8576157 ±  3%     -47.5%    4502812 ± 11%  vmstat.memory.free
     23322 ±  2%     +10.0%      25653        vmstat.system.cs
     87.85 ±  5%     -15.1%      74.58 ±  4%  sched_debug.cfs_rq:/.load_avg.avg
      5.61 ± 76%     -72.3%       1.55 ± 70%  sched_debug.cfs_rq:/.removed.load_avg.avg
    489.37 ±  2%     -14.4%     418.87 ±  9%  sched_debug.cfs_rq:/.util_avg.min
    229709 ± 28%     -36.1%     146855 ± 17%  sched_debug.cpu.avg_idle.min
    -10.70           -18.4%      -8.73        sched_debug.cpu.nr_uninterruptible.min
  86559079 ±  3%     +10.8%   95947528 ±  3%  meminfo.Active
  86196115 ±  3%     +10.9%   95600785 ±  3%  meminfo.Active(file)
 1.067e+08           +10.5%  1.179e+08        meminfo.Cached
     41444 ±  8%     -37.4%      25936 ±  9%  meminfo.CmaFree
  22403260 ±  5%     -49.6%   11282311 ± 10%  meminfo.MemFree
     12931 ±  2%     +23.0%      15906        meminfo.PageTables
    169713           -14.5%     145059 ±  2%  meminfo.Shmem
      2527            +8.0%       2730        turbostat.Avg_MHz
      3.89 ± 26%     -72.5%       1.07 ± 25%  turbostat.CPU%c1
     57.33           +13.4%      65.00 ±  2%  turbostat.CoreTmp
      2.92 ± 13%     -90.9%       0.27 ± 41%  turbostat.Pkg%pc2
     62.00           +11.3%      69.00 ±  2%  turbostat.PkgTmp
    221.76            +4.8%     232.41        turbostat.PkgWatt
     22.55            +2.6%      23.14        turbostat.RAMWatt
  41905986 ±  2%     +13.4%   47529965        numa-meminfo.node0.Active
  41738396 ±  2%     +13.5%   47379816        numa-meminfo.node0.Active(file)
  52477550 ±  2%     +11.6%   58577116        numa-meminfo.node0.FilePages
      8508 ± 14%     +19.9%      10201 ± 15%  numa-meminfo.node0.KernelStack
  11971693 ± 10%     -50.8%    5884484 ± 15%  numa-meminfo.node0.MemFree
  53895558 ±  2%     +11.3%   59982766        numa-meminfo.node0.MemUsed
      5408 ± 21%     +57.9%       8541 ± 16%  numa-meminfo.node0.PageTables
  42248927 ±  4%     +12.4%   47492120 ±  4%  numa-meminfo.node1.Active
  42057118 ±  4%     +12.5%   47298241 ±  4%  numa-meminfo.node1.Active(file)
  52145205           +12.6%   58734118        numa-meminfo.node1.FilePages
  12566901 ±  7%     -51.7%    6069491 ±  8%  numa-meminfo.node1.MemFree
  53456806           +12.2%   59954211        numa-meminfo.node1.MemUsed
    146856 ±  2%      -7.2%     136327 ±  4%  perf-stat.cpu-migrations
      0.06 ± 10%      -0.0        0.05 ±  8%  perf-stat.dTLB-load-miss-rate%
 3.703e+09 ± 10%     -16.9%  3.075e+09 ±  8%  perf-stat.dTLB-load-misses
      0.01 ± 15%      -0.0        0.01 ±  7%  perf-stat.dTLB-store-miss-rate%
 1.386e+08 ± 16%     -29.7%   97420623 ±  7%  perf-stat.dTLB-store-misses
 1.621e+12            -1.3%    1.6e+12        perf-stat.dTLB-stores
     29.38 ±  5%      +3.6       33.00 ±  3%  perf-stat.node-load-miss-rate%
 7.545e+09 ±  4%      +9.8%  8.284e+09        perf-stat.node-load-misses
 1.815e+10 ±  3%      -7.2%  1.684e+10 ±  4%  perf-stat.node-loads
     44.16            +0.9       45.08        perf-stat.node-store-miss-rate%
 2.185e+09            +1.5%  2.218e+09        perf-stat.node-store-misses
 2.763e+09            -2.2%  2.703e+09        perf-stat.node-stores
  10390158 ±  2%     +14.3%   11876538        numa-vmstat.node0.nr_active_file
  13031472 ±  2%     +12.0%   14601129        numa-vmstat.node0.nr_file_pages
   3080338 ±  9%     -50.9%    1513457 ±  9%  numa-vmstat.node0.nr_free_pages
      8498 ± 14%     +20.0%      10197 ± 14%  numa-vmstat.node0.nr_kernel_stack
      1344 ± 21%     +58.9%       2135 ± 15%  numa-vmstat.node0.nr_page_table_pages
  10390258 ±  2%     +14.3%   11876727        numa-vmstat.node0.nr_zone_active_file
  2.25e+08 ±  6%     -26.7%   1.65e+08 ±  7%  numa-vmstat.node0.numa_hit
  2.25e+08 ±  6%     -26.7%   1.65e+08 ±  7%  numa-vmstat.node0.numa_local
  25217978 ± 35%     -43.9%   14136959 ± 30%  numa-vmstat.node0.numa_miss
  25230025 ± 35%     -43.9%   14150587 ± 30%  numa-vmstat.node0.numa_other
   4189684 ± 38%     -74.4%    1071151 ± 95%  numa-vmstat.node0.workingset_activate
   2317192 ± 19%     -34.1%    1527788 ± 15%  numa-vmstat.node0.workingset_nodereclaim
  22321015 ± 62%     -77.9%    4937981 ± 97%  numa-vmstat.node0.workingset_refault
  10477318 ±  4%     +13.8%   11920806 ±  3%  numa-vmstat.node1.nr_active_file
  12940319           +13.8%   14725234        numa-vmstat.node1.nr_file_pages
     11478 ±  8%     -38.3%       7080 ±  4%  numa-vmstat.node1.nr_free_cma
   3236779 ±  6%     -54.5%    1473557 ±  7%  numa-vmstat.node1.nr_free_pages
    576.00 ±  2%     +46.5%     843.67 ±  4%  numa-vmstat.node1.nr_isolated_file
  10477296 ±  4%     +13.8%   11920778 ±  3%  numa-vmstat.node1.nr_zone_active_file
  25309770 ± 35%     -43.9%   14196144 ± 30%  numa-vmstat.node1.numa_foreign
 2.291e+08 ±  3%     -31.2%  1.576e+08 ± 11%  numa-vmstat.node1.numa_hit
 2.289e+08 ±  3%     -31.2%  1.575e+08 ± 11%  numa-vmstat.node1.numa_local
   1523557 ± 10%     -48.7%     781744 ± 39%  numa-vmstat.node1.workingset_nodereclaim
     90427            -4.7%      86190        proc-vmstat.nr_active_anon
  21346726 ±  3%     +11.8%   23872297 ±  3%  proc-vmstat.nr_active_file
     67928            +3.4%      70261        proc-vmstat.nr_anon_pages
  26547872           +10.9%   29438725        proc-vmstat.nr_file_pages
     10585 ±  9%     -37.8%       6579 ±  7%  proc-vmstat.nr_free_cma
   5732507 ±  7%     -50.0%    2868488 ±  9%  proc-vmstat.nr_free_pages
      1313 ± 13%     +27.6%       1675 ±  7%  proc-vmstat.nr_isolated_file
     17385            +8.4%      18851        proc-vmstat.nr_kernel_stack
      3204 ±  3%     +24.0%       3974        proc-vmstat.nr_page_table_pages
     42275 ±  2%     -15.2%      35864 ±  3%  proc-vmstat.nr_shmem
    475820 ±  3%      -7.8%     438760 ±  6%  proc-vmstat.nr_slab_reclaimable
     90429            -4.7%      86195        proc-vmstat.nr_zone_active_anon
  21346841 ±  3%     +11.8%   23872490 ±  3%  proc-vmstat.nr_zone_active_file
 7.064e+08            -3.3%  6.832e+08        proc-vmstat.numa_hit
 7.064e+08            -3.3%  6.832e+08        proc-vmstat.numa_local
      2237 ± 42%     -74.8%     563.67 ± 32%  proc-vmstat.numa_pages_migrated
     12679 ± 26%     -56.1%       5568 ± 35%  proc-vmstat.numa_pte_updates
  7.76e+08            -2.7%  7.548e+08        proc-vmstat.pgalloc_normal
 7.851e+08            -2.7%  7.641e+08        proc-vmstat.pgfree
    286939 ± 71%  +10680.7%   30934078 ±  2%  proc-vmstat.pginodesteal
      2302 ± 38%     -96.5%      80.67 ± 23%  proc-vmstat.pgmigrate_fail
     66.67 ± 31%   +1009.5%     739.67 ± 30%  proc-vmstat.pgrotated
 6.529e+08            -4.9%   6.21e+08        proc-vmstat.pgscan_direct
 6.529e+08            -4.9%   6.21e+08        proc-vmstat.pgsteal_direct
   7996387 ±  2%      +8.0%    8637231 ±  2%  proc-vmstat.slabs_scanned
   5880794 ±  7%     -34.7%    3840169 ±  9%  proc-vmstat.workingset_activate
   3779699 ±  6%     -37.4%    2367951 ± 10%  proc-vmstat.workingset_nodereclaim
  30732651 ± 32%     -58.3%   12802727 ± 22%  proc-vmstat.workingset_refault
      1.80 ± 57%      -1.2        0.62 ± 72%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__get_free_pages.pgd_alloc.mm_init.__do_execve_file
      1.80 ± 57%      -1.2        0.62 ± 72%  perf-profile.calltrace.cycles-pp.mm_init.__do_execve_file.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.80 ± 57%      -1.2        0.62 ± 72%  perf-profile.calltrace.cycles-pp.pgd_alloc.mm_init.__do_execve_file.__x64_sys_execve.do_syscall_64
      1.80 ± 57%      -1.2        0.62 ± 72%  perf-profile.calltrace.cycles-pp.__get_free_pages.pgd_alloc.mm_init.__do_execve_file.__x64_sys_execve
     18.82            -0.9       17.91 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node
     17.50 ±  2%      -0.9       16.63 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
     20.83            -0.6       20.22        perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
      2.92 ±  2%      -0.1        2.82        perf-profile.calltrace.cycles-pp.iomap_readpage_actor.iomap_readpages_actor.iomap_apply.iomap_readpages.read_pages
      2.14 ±  2%      -0.1        2.05        perf-profile.calltrace.cycles-pp.memset_erms.iomap_readpage_actor.iomap_readpages_actor.iomap_apply.iomap_readpages
      0.68            -0.0        0.67        perf-profile.calltrace.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.iomap_readpages_actor.iomap_apply.iomap_readpages
     38.44            +1.3       39.75        perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
     38.51            +1.4       39.96        perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
     16.65 ±  3%      +2.1       18.79 ±  3%  perf-profile.calltrace.cycles-pp.shrink_active_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
     16.10 ±  5%      +2.2       18.34 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_active_list.shrink_node_memcg.shrink_node
     15.69 ±  4%      +2.2       17.94 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_active_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
     34.48 ±  4%      +2.4       36.91 ±  3%  perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead
     34.50 ±  4%      +2.4       36.93 ±  3%  perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead
     37.52 ±  2%      +2.5       39.98 ±  3%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read
     36.63 ±  2%      +2.7       39.30 ±  3%  perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
      1.93 ± 48%      -1.0        0.90 ± 18%  perf-profile.children.cycles-pp.__x64_sys_execve
      1.93 ± 48%      -1.0        0.90 ± 18%  perf-profile.children.cycles-pp.__do_execve_file
      2.92 ±  2%      -0.1        2.83        perf-profile.children.cycles-pp.iomap_readpage_actor
      2.14 ±  2%      -0.1        2.06        perf-profile.children.cycles-pp.memset_erms
      0.59 ±  2%      -0.0        0.55 ±  2%  perf-profile.children.cycles-pp.__delete_from_page_cache
      0.09            -0.0        0.06        perf-profile.children.cycles-pp.replace_slot
      0.45 ±  2%      -0.0        0.42        perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.24 ±  3%      -0.0        0.22 ±  2%  perf-profile.children.cycles-pp.__radix_tree_replace
      0.18 ±  5%      -0.0        0.16        perf-profile.children.cycles-pp.xfs_ilock
      0.08 ±  6%      +0.0        0.09        perf-profile.children.cycles-pp.workingset_update_node
      0.13 ± 10%      +0.0        0.17 ±  8%  perf-profile.children.cycles-pp.__vfs_write
      0.02 ±141%      +0.0        0.06 ± 13%  perf-profile.children.cycles-pp.cmd_record
      0.66            +0.1        0.72 ±  2%  perf-profile.children.cycles-pp.vfs_write
      0.85            +0.1        0.91 ±  3%  perf-profile.children.cycles-pp.ksys_write
      0.00            +0.1        0.06 ± 16%  perf-profile.children.cycles-pp.__softirqentry_text_start
      0.07 ± 35%      +0.1        0.18 ± 43%  perf-profile.children.cycles-pp.do_shrink_slab
      0.07 ± 35%      +0.1        0.18 ± 45%  perf-profile.children.cycles-pp.shrink_slab
      0.00            +0.1        0.13 ± 40%  perf-profile.children.cycles-pp.list_lru_walk_one_irq
      0.00            +0.1        0.13 ± 40%  perf-profile.children.cycles-pp.__list_lru_walk_one
      0.00            +0.1        0.13 ± 40%  perf-profile.children.cycles-pp.shadow_lru_isolate
     40.90            +1.6       42.51 ±  2%  perf-profile.children.cycles-pp.shrink_node_memcg
     36.35 ±  2%      +1.6       37.99 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irq
     40.99            +1.7       42.71 ±  2%  perf-profile.children.cycles-pp.shrink_node
     38.69            +1.8       40.46 ±  2%  perf-profile.children.cycles-pp.try_to_free_pages
     38.67            +1.8       40.44 ±  2%  perf-profile.children.cycles-pp.do_try_to_free_pages
     41.80            +1.8       43.60 ±  3%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
     40.90            +2.0       42.91 ±  4%  perf-profile.children.cycles-pp.__alloc_pages_slowpath
     17.79 ±  4%      +2.3       20.05 ±  3%  perf-profile.children.cycles-pp.shrink_active_list
      2.13 ±  2%      -0.1        2.05        perf-profile.self.cycles-pp.memset_erms
      0.09 ±  5%      -0.0        0.05 ±  8%  perf-profile.self.cycles-pp.replace_slot
      0.35 ±  2%      -0.0        0.33        perf-profile.self.cycles-pp.free_pcppages_bulk
      0.07            +0.0        0.08        perf-profile.self.cycles-pp.workingset_update_node
      0.06 ±  7%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.shrink_active_list



***************************************************************************************************
lkp-hsw-ep2: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs2/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/ucode:
  4k/gcc-7/performance/1SSD/nfsv4/ext4/sync/x86_64-rhel-7.2/8/debian-x86_64-2018-04-03.cgz/300s/write/lkp-hsw-ep2/512g/fio-basic/0x3d

commit: 
  3bf181bc5d ("kernel/sys.c: remove duplicated include")
  172b06c32b ("mm: slowly shrink slabs with a relatively small number of objects")

3bf181bc5d8bc86f 172b06c32b949759fe6313abec 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :2           50%           1:2     dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
           :2           50%           1:2     dmesg.WARNING:stack_recursion



***************************************************************************************************
lkp-skl-2sp2: 80 threads Skylake with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/300s/lkp-skl-2sp2/lru-file-readonce/vm-scalability

commit: 
  3bf181bc5d ("kernel/sys.c: remove duplicated include")
  172b06c32b ("mm: slowly shrink slabs with a relatively small number of objects")

3bf181bc5d8bc86f 172b06c32b949759fe6313abec 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:4          -25%            :4     dmesg.WARNING:at_ip_fsnotify/0x
         %stddev     %change         %stddev
             \          |                \  
    273720            -1.5%     269667        vm-scalability.median
  21908859            -1.5%   21588006        vm-scalability.throughput
    211.81            -2.2%     207.25        vm-scalability.time.elapsed_time
    211.81            -2.2%     207.25        vm-scalability.time.elapsed_time.max
    491347 ±  8%      -9.6%     444313        vm-scalability.time.involuntary_context_switches
      7205            +4.1%       7501        vm-scalability.time.percent_of_cpu_this_job_got
     14908            +1.8%      15179        vm-scalability.time.system_time
      8.67 ±  4%      -2.8        5.91 ±  9%  mpstat.cpu.idle%
    381265 ±  2%     -15.6%     321682 ±  4%  softirqs.SCHED
    133688 ± 15%     +24.8%     166796        vmstat.system.in
      2835            +3.0%       2919        turbostat.Avg_MHz
    385790 ± 98%     -97.6%       9385 ± 10%  turbostat.C1
     25.08            +6.3%      26.67 ±  7%  boot-time.boot
     17.28            +2.8%      17.77 ±  2%  boot-time.dhcp
      1665            +6.9%       1780 ±  9%  boot-time.idle
  11349385 ±  7%     +20.0%   13620869 ±  4%  numa-numastat.node0.numa_miss
  11349447 ±  7%     +20.1%   13627432 ±  4%  numa-numastat.node0.other_node
  11349385 ±  7%     +20.0%   13620869 ±  4%  numa-numastat.node1.numa_foreign
  23245346 ±103%     -98.1%     446685 ±  8%  cpuidle.C1.time
    391056 ± 97%     -96.4%      13942 ±  5%  cpuidle.C1.usage
     91033 ± 76%     -93.8%       5676 ± 10%  cpuidle.POLL.time
      1628 ± 63%     -85.8%     232.00 ± 14%  cpuidle.POLL.usage
    903.25 ±  2%     -10.5%     808.25        slabinfo.inode_cache.active_slabs
     48725 ±  2%     -10.4%      43665        slabinfo.inode_cache.num_objs
    903.25 ±  2%     -10.5%     808.25        slabinfo.inode_cache.num_slabs
      7889 ±  2%     -22.2%       6134 ±  5%  slabinfo.proc_inode_cache.active_objs
      8314           -22.9%       6410 ±  6%  slabinfo.proc_inode_cache.num_objs
      6.06            +1.6%       6.16        perf-stat.cpi
  4.78e+13            +1.5%  4.851e+13        perf-stat.cpu-cycles
     18299 ± 17%     -14.2%      15696 ±  3%  perf-stat.cpu-migrations
      0.11 ± 13%      -0.0        0.07 ±  8%  perf-stat.dTLB-store-miss-rate%
 1.252e+09 ± 12%     -41.6%  7.319e+08 ±  8%  perf-stat.dTLB-store-misses
      0.16            -1.5%       0.16        perf-stat.ipc
    579179            -2.6%     564157        perf-stat.minor-faults
 5.966e+08 ± 17%     -12.0%  5.249e+08        perf-stat.node-load-misses
    579181            -2.6%     564130        perf-stat.page-faults
      4174 ±  7%     +19.0%       4967 ±  4%  proc-vmstat.kswapd_low_wmark_hit_quickly
     12972            -1.6%      12764        proc-vmstat.nr_inactive_anon
      2332            +2.3%       2387        proc-vmstat.nr_page_table_pages
     22134            -3.4%      21380        proc-vmstat.nr_shmem
    322945            -7.0%     300191        proc-vmstat.nr_slab_reclaimable
     12972            -1.6%      12764        proc-vmstat.nr_zone_inactive_anon
      4029 ± 55%     -41.9%       2341 ± 52%  proc-vmstat.numa_hint_faults
      4178 ±  7%     +18.9%       4970 ±  4%  proc-vmstat.pageoutrun
      2203            +2.8%       2265        proc-vmstat.pgpgin
  89860749 ±  3%      -8.8%   81938878 ±  3%  proc-vmstat.pgscan_kswapd
  89860664 ±  3%      -8.8%   81938795 ±  3%  proc-vmstat.pgsteal_kswapd
  18891910            -3.2%   18286025        proc-vmstat.slabs_scanned
   6953958            -2.3%    6796141        proc-vmstat.workingset_nodereclaim
    199.50 ±  6%     -18.0%     163.58 ± 13%  sched_debug.cfs_rq:/.exec_clock.stddev
    932.31 ±  5%     -11.2%     827.75 ±  5%  sched_debug.cfs_rq:/.load_avg.max
     13.22 ±  9%     +17.7%      15.56 ±  6%  sched_debug.cfs_rq:/.nr_spread_over.avg
     21.75 ± 11%     -20.1%      17.38 ± 15%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    631.62 ±  8%     -34.5%     413.50 ± 19%  sched_debug.cfs_rq:/.runnable_load_avg.max
     82.41 ± 14%     -35.4%      53.24 ± 24%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
    107.73 ± 81%     -66.8%      35.81 ± 15%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    893.38 ± 18%     -24.8%     671.69 ±  9%  sched_debug.cfs_rq:/.util_est_enqueued.max
    163.51 ± 12%     -27.2%     119.09 ± 11%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
     21.81 ± 11%     -20.5%      17.33 ± 14%  sched_debug.cpu.cpu_load[0].avg
    632.25 ±  8%     -34.7%     412.81 ± 19%  sched_debug.cpu.cpu_load[0].max
     82.46 ± 14%     -35.6%      53.08 ± 24%  sched_debug.cpu.cpu_load[0].stddev
    630.31 ±  8%     -31.2%     433.38 ± 21%  sched_debug.cpu.cpu_load[1].max
     82.02 ± 14%     -33.0%      54.94 ± 26%  sched_debug.cpu.cpu_load[1].stddev
    630.00 ±  8%     -27.9%     454.25 ± 21%  sched_debug.cpu.cpu_load[2].max
     81.88 ± 14%     -31.0%      56.52 ± 27%  sched_debug.cpu.cpu_load[2].stddev
     22.57 ±  8%     -20.9%      17.85 ± 15%  sched_debug.cpu.cpu_load[3].avg
    626.38 ±  9%     -28.4%     448.62 ± 20%  sched_debug.cpu.cpu_load[3].max
     81.57 ± 13%     -32.3%      55.20 ± 26%  sched_debug.cpu.cpu_load[3].stddev
     22.72 ±  7%     -21.9%      17.74 ± 12%  sched_debug.cpu.cpu_load[4].avg
    648.12 ±  9%     -30.5%     450.62 ± 18%  sched_debug.cpu.cpu_load[4].max
     83.79 ± 13%     -34.6%      54.79 ± 24%  sched_debug.cpu.cpu_load[4].stddev
      4542 ±  3%      -8.8%       4142        sched_debug.cpu.nr_switches.min
      4252 ±  2%     -12.0%       3741        sched_debug.cpu.sched_count.min
      0.19 ±173%   +4900.0%       9.38 ± 84%  sched_debug.cpu.sched_goidle.min
      1938           -11.2%       1720        sched_debug.cpu.ttwu_count.min
      1829           -11.5%       1618        sched_debug.cpu.ttwu_local.min
     12.73 ±  2%      -1.4       11.37 ±  2%  perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.iomap_readpages_actor.iomap_apply.iomap_readpages.read_pages
      9.92 ±  3%      -1.3        8.58 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru
      9.99 ±  2%      -1.3        8.66 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.iomap_readpages_actor
     11.19 ±  2%      -1.3        9.86 ±  2%  perf-profile.calltrace.cycles-pp.__lru_cache_add.add_to_page_cache_lru.iomap_readpages_actor.iomap_apply.iomap_readpages
     11.10 ±  2%      -1.3        9.78 ±  2%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.iomap_readpages_actor.iomap_apply
     24.77            -1.1       23.69        perf-profile.calltrace.cycles-pp.iomap_readpages_actor.iomap_apply.iomap_readpages.read_pages.__do_page_cache_readahead
     24.88            -1.1       23.80        perf-profile.calltrace.cycles-pp.iomap_readpages.read_pages.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
     24.90            -1.1       23.83        perf-profile.calltrace.cycles-pp.read_pages.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read
     24.86            -1.1       23.79        perf-profile.calltrace.cycles-pp.iomap_apply.iomap_readpages.read_pages.__do_page_cache_readahead.ondemand_readahead
      8.91 ±  2%      -0.6        8.29        perf-profile.calltrace.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead
      8.90 ±  2%      -0.6        8.28        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.__do_page_cache_readahead
     10.56            -0.6        9.96        perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
     40.64            -0.2       40.42        perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
      1.33 ±  2%      -0.1        1.24 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_list.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node
      1.18 ±  3%      -0.1        1.09 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.shrink_page_list.shrink_inactive_list.shrink_node_memcg
      0.87            -0.0        0.83        perf-profile.calltrace.cycles-pp.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_node_memcg
      1.45            -0.0        1.42        perf-profile.calltrace.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.iomap_readpages_actor.iomap_apply.iomap_readpages
      1.71            +0.0        1.73        perf-profile.calltrace.cycles-pp.__entry_SYSCALL_64_trampoline
      6.59            +0.2        6.78        perf-profile.calltrace.cycles-pp.memset_erms.iomap_readpage_actor.iomap_readpages_actor.iomap_apply.iomap_readpages
      9.56            +0.2        9.77        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read
      9.63            +0.2        9.85        perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter
      9.84            +0.2       10.07        perf-profile.calltrace.cycles-pp.copy_page_to_iter.generic_file_read_iter.xfs_file_buffered_aio_read.xfs_file_read_iter.__vfs_read
     11.79            +0.3       12.06        perf-profile.calltrace.cycles-pp.iomap_readpage_actor.iomap_readpages_actor.iomap_apply.iomap_readpages.read_pages
     54.10            +0.7       54.83        perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter.xfs_file_buffered_aio_read
     43.36            +1.3       44.68        perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.generic_file_read_iter
     40.97            +1.4       42.35        perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
     40.99            +1.4       42.37        perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead
     40.98            +1.4       42.36        perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead
      0.00            +1.7        1.68 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one_irq
      0.00            +1.7        1.70 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one_irq.do_shrink_slab
      0.00            +1.9        1.86 ±  4%  perf-profile.calltrace.cycles-pp.shadow_lru_isolate.__list_lru_walk_one.list_lru_walk_one_irq.do_shrink_slab.shrink_slab
      0.00            +1.9        1.88 ±  4%  perf-profile.calltrace.cycles-pp.__list_lru_walk_one.list_lru_walk_one_irq.do_shrink_slab.shrink_slab.shrink_node
      0.00            +1.9        1.88 ±  4%  perf-profile.calltrace.cycles-pp.list_lru_walk_one_irq.do_shrink_slab.shrink_slab.shrink_node.do_try_to_free_pages
      0.00            +1.9        1.90 ±  4%  perf-profile.calltrace.cycles-pp.shrink_slab.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
      0.00            +1.9        1.90 ±  4%  perf-profile.calltrace.cycles-pp.do_shrink_slab.shrink_slab.shrink_node.do_try_to_free_pages.try_to_free_pages
     12.74 ±  2%      -1.4       11.37 ±  2%  perf-profile.children.cycles-pp.add_to_page_cache_lru
     11.20 ±  2%      -1.3        9.87 ±  2%  perf-profile.children.cycles-pp.__lru_cache_add
     11.24 ±  2%      -1.3        9.92 ±  2%  perf-profile.children.cycles-pp.pagevec_lru_move_fn
     11.46 ±  2%      -1.3       10.19 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     24.77            -1.1       23.69        perf-profile.children.cycles-pp.iomap_readpages_actor
     24.88            -1.1       23.80        perf-profile.children.cycles-pp.iomap_readpages
     24.86            -1.1       23.79        perf-profile.children.cycles-pp.iomap_apply
     24.90            -1.1       23.84        perf-profile.children.cycles-pp.read_pages
     10.06 ±  2%      -0.7        9.32 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
     11.52            -0.6       10.89        perf-profile.children.cycles-pp.get_page_from_freelist
     57.25            -0.5       56.75        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.21            -0.1        0.14 ±  3%  perf-profile.children.cycles-pp.replace_slot
      0.50            -0.0        0.45        perf-profile.children.cycles-pp.__radix_tree_replace
      1.46            -0.0        1.42        perf-profile.children.cycles-pp.__add_to_page_cache_locked
      0.99            -0.0        0.96        perf-profile.children.cycles-pp.__delete_from_page_cache
      0.23 ±  4%      -0.0        0.21 ±  4%  perf-profile.children.cycles-pp.touch_atime
      0.18 ±  6%      -0.0        0.17 ±  5%  perf-profile.children.cycles-pp.atime_needs_update
      0.64            +0.0        0.67 ±  2%  perf-profile.children.cycles-pp.security_file_permission
      1.80            +0.0        1.83        perf-profile.children.cycles-pp.__entry_SYSCALL_64_trampoline
      0.20 ±  9%      +0.0        0.23 ±  3%  perf-profile.children.cycles-pp.scheduler_tick
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.list_lru_add
      0.28 ± 13%      +0.1        0.34 ±  4%  perf-profile.children.cycles-pp.tick_sched_handle
      0.27 ± 13%      +0.1        0.33 ±  4%  perf-profile.children.cycles-pp.update_process_times
      0.30 ± 14%      +0.1        0.37 ±  5%  perf-profile.children.cycles-pp.tick_sched_timer
      0.61 ±  8%      +0.1        0.69 ±  3%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      0.73 ±  9%      +0.1        0.83 ±  3%  perf-profile.children.cycles-pp.apic_timer_interrupt
      6.62            +0.2        6.80        perf-profile.children.cycles-pp.memset_erms
      9.64            +0.2        9.85        perf-profile.children.cycles-pp.copyout
      9.62            +0.2        9.84        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      9.85            +0.2       10.07        perf-profile.children.cycles-pp.copy_page_to_iter
     11.86            +0.3       12.14        perf-profile.children.cycles-pp.iomap_readpage_actor
     54.38            +0.8       55.13        perf-profile.children.cycles-pp.__alloc_pages_nodemask
     43.60            +1.4       44.96        perf-profile.children.cycles-pp.__alloc_pages_slowpath
     41.23            +1.4       42.64        perf-profile.children.cycles-pp.try_to_free_pages
     41.22            +1.4       42.62        perf-profile.children.cycles-pp.do_try_to_free_pages
     42.96            +1.5       44.43        perf-profile.children.cycles-pp.shrink_node
     36.13            +1.5       37.66        perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.29            +1.6        1.87 ±  4%  perf-profile.children.cycles-pp.shadow_lru_isolate
      0.33            +1.6        1.92 ±  4%  perf-profile.children.cycles-pp.do_shrink_slab
      0.30            +1.6        1.89 ±  4%  perf-profile.children.cycles-pp.__list_lru_walk_one
      0.33            +1.6        1.92 ±  4%  perf-profile.children.cycles-pp.shrink_slab
      0.30            +1.6        1.90 ±  4%  perf-profile.children.cycles-pp.list_lru_walk_one_irq
     57.24            -0.5       56.75        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.20 ±  2%      -0.1        0.13 ±  3%  perf-profile.self.cycles-pp.replace_slot
      0.38            +0.0        0.40        perf-profile.self.cycles-pp.selinux_file_permission
      0.42 ±  3%      +0.1        0.49 ±  3%  perf-profile.self.cycles-pp.generic_file_read_iter
      6.54            +0.2        6.71        perf-profile.self.cycles-pp.memset_erms
      9.51            +0.2        9.71        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.19.0-rc4-00143-g172b06c3" of type "text/plain" (167709 bytes)

View attachment "job-script" of type "text/plain" (7897 bytes)

View attachment "job.yaml" of type "text/plain" (5176 bytes)

View attachment "reproduce" of type "text/plain" (2168 bytes)
