Message-ID: <20181211095906.GT23332@shao2-debian>
Date:   Tue, 11 Dec 2018 17:59:06 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Yunlong Song <yunlong.song@...wei.com>
Cc:     Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <yuchao0@...wei.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-f2fs-devel@...ts.sourceforge.net, lkp@...org
Subject: [LKP] [f2fs] 089842de57:  aim7.jobs-per-min 15.4% improvement

Greetings,

FYI, we noticed a 15.4% improvement in aim7.jobs-per-min due to commit:


commit: 089842de5750f434aa016eb23f3d3a3a151083bd ("f2fs: remove codes of unused wio_mutex")
https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test

in testcase: aim7
on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
with the following parameters:

	disk: 4BRD_12G
	md: RAID1
	fs: f2fs
	test: disk_rw
	load: 3000
	cpufreq_governor: performance

test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/

In addition, the commit has a significant impact on the following tests:

+------------------+-----------------------------------------------------------------------+
| testcase: change | aim7: aim7.jobs-per-min 8.8% improvement                              |
| test machine     | 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory |
| test parameters  | cpufreq_governor=performance                                          |
|                  | disk=4BRD_12G                                                         |
|                  | fs=f2fs                                                               |
|                  | load=3000                                                             |
|                  | md=RAID1                                                              |
|                  | test=disk_rr                                                          |
+------------------+-----------------------------------------------------------------------+


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
  gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rw/aim7

commit: 
  d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
  089842de57 ("f2fs: remove codes of unused wio_mutex")

d6c66cd19ef322fe 089842de5750f434aa016eb23f 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     96213           +15.4%     110996        aim7.jobs-per-min
    191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time
    191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time.max
   1090253 ±  2%     -17.5%     899165        aim7.time.involuntary_context_switches
    176713            -7.5%     163478        aim7.time.minor_page_faults
      6882           -14.6%       5875        aim7.time.system_time
    127.97            +4.7%     134.00        aim7.time.user_time
    760923            +7.1%     814632        aim7.time.voluntary_context_switches
     78499 ±  2%     -11.2%      69691        interrupts.CAL:Function_call_interrupts
   3183861 ±  4%     -16.7%    2651390 ±  4%  softirqs.TIMER
    191.54 ± 13%     +45.4%     278.59 ± 12%  iostat.md0.w/s
      6118 ±  3%     +16.5%       7126 ±  2%  iostat.md0.wkB/s
    151257 ±  2%     -10.1%     135958 ±  2%  meminfo.AnonHugePages
     46754 ±  3%     +14.0%      53307 ±  3%  meminfo.max_used_kB
      0.03 ± 62%      -0.0        0.01 ± 78%  mpstat.cpu.soft%
      1.73 ±  3%      +0.4        2.13 ±  3%  mpstat.cpu.usr%
  16062961 ±  2%     -12.1%   14124403 ±  2%  turbostat.IRQ
      0.76 ± 37%     -71.8%       0.22 ± 83%  turbostat.Pkg%pc6
      9435 ±  7%     -18.1%       7730 ±  4%  turbostat.SMI
      6113 ±  3%     +16.5%       7120 ±  2%  vmstat.io.bo
     11293 ±  2%     +12.3%      12688 ±  2%  vmstat.system.cs
     81879 ±  2%      +2.5%      83951        vmstat.system.in
      2584            -4.4%       2469 ±  2%  proc-vmstat.nr_active_file
      2584            -4.4%       2469 ±  2%  proc-vmstat.nr_zone_active_file
     28564 ±  4%     -23.6%      21817 ± 12%  proc-vmstat.numa_hint_faults
     10958 ±  5%     -43.9%       6147 ± 26%  proc-vmstat.numa_hint_faults_local
    660531 ±  3%     -10.7%     590059 ±  2%  proc-vmstat.pgfault
      1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.active_objs
      1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.num_objs
     10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.active_objs
     10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.num_objs
     18395           +12.3%      20656 ±  8%  slabinfo.kmalloc-32.active_objs
     18502 ±  2%     +12.3%      20787 ±  8%  slabinfo.kmalloc-32.num_objs
 1.291e+12           -12.3%  1.131e+12        perf-stat.branch-instructions
      0.66            +0.1        0.76 ±  3%  perf-stat.branch-miss-rate%
 1.118e+10 ±  4%      -7.5%  1.034e+10        perf-stat.cache-misses
 2.772e+10 ±  8%      -6.6%  2.589e+10        perf-stat.cache-references
   2214958            -3.6%    2136237        perf-stat.context-switches
      3.95 ±  2%      -5.8%       3.72        perf-stat.cpi
  2.24e+13           -16.4%  1.873e+13        perf-stat.cpu-cycles
 1.542e+12           -10.4%  1.382e+12        perf-stat.dTLB-loads
      0.18 ±  6%      +0.0        0.19 ±  4%  perf-stat.dTLB-store-miss-rate%
 5.667e+12           -11.3%  5.029e+12        perf-stat.instructions
      5534           -13.1%       4809 ±  6%  perf-stat.instructions-per-iTLB-miss
      0.25 ±  2%      +6.1%       0.27        perf-stat.ipc
    647970 ±  2%     -10.7%     578955 ±  2%  perf-stat.minor-faults
 2.783e+09 ± 18%     -17.8%  2.288e+09 ±  4%  perf-stat.node-loads
 5.706e+09 ±  2%      -5.2%  5.407e+09        perf-stat.node-store-misses
 7.693e+09            -4.4%  7.352e+09        perf-stat.node-stores
    647979 ±  2%     -10.7%     578955 ±  2%  perf-stat.page-faults
     70960 ± 16%     -26.6%      52062        sched_debug.cfs_rq:/.exec_clock.avg
     70628 ± 16%     -26.7%      51787        sched_debug.cfs_rq:/.exec_clock.min
     22499 ±  3%     -10.5%      20133 ±  3%  sched_debug.cfs_rq:/.load.avg
      7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cfs_rq:/.load.min
    362.19 ± 12%     +58.3%     573.50 ± 25%  sched_debug.cfs_rq:/.load_avg.max
   3092960 ± 16%     -28.5%    2211400        sched_debug.cfs_rq:/.min_vruntime.avg
   3244162 ± 15%     -27.0%    2367437 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
   2984299 ± 16%     -28.9%    2121271        sched_debug.cfs_rq:/.min_vruntime.min
      0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cfs_rq:/.nr_running.min
      0.12 ± 13%    +114.6%       0.26 ±  9%  sched_debug.cfs_rq:/.nr_running.stddev
      8.44 ± 23%     -36.8%       5.33 ± 15%  sched_debug.cfs_rq:/.nr_spread_over.max
      1.49 ± 21%     -29.6%       1.05 ±  7%  sched_debug.cfs_rq:/.nr_spread_over.stddev
     16.53 ± 20%     -38.8%      10.12 ± 23%  sched_debug.cfs_rq:/.runnable_load_avg.avg
     15259 ±  7%     -33.3%      10176 ± 22%  sched_debug.cfs_rq:/.runnable_weight.avg
    796.65 ± 93%     -74.8%     200.68 ± 17%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    669258 ±  3%     -13.3%     580068        sched_debug.cpu.avg_idle.avg
    116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock.avg
    116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock.max
    115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock.min
    116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock_task.avg
    116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock_task.max
    115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock_task.min
     15.41 ±  4%     -32.0%      10.48 ± 24%  sched_debug.cpu.cpu_load[0].avg
     15.71 ±  6%     -26.6%      11.53 ± 22%  sched_debug.cpu.cpu_load[1].avg
     16.20 ±  8%     -22.9%      12.49 ± 21%  sched_debug.cpu.cpu_load[2].avg
     16.92 ±  7%     -21.2%      13.33 ± 21%  sched_debug.cpu.cpu_load[3].avg
      2650 ±  6%     -15.6%       2238 ±  3%  sched_debug.cpu.curr->pid.avg
      1422 ±  8%     -68.5%     447.42 ± 57%  sched_debug.cpu.curr->pid.min
      7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cpu.load.min
     86066 ± 14%     -26.3%      63437        sched_debug.cpu.nr_load_updates.min
      3.97 ± 88%     -70.9%       1.15 ± 10%  sched_debug.cpu.nr_running.avg
      0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cpu.nr_running.min
      1126 ± 16%     -27.6%     816.02 ±  9%  sched_debug.cpu.sched_count.stddev
      1468 ± 16%     +31.1%       1925 ±  5%  sched_debug.cpu.sched_goidle.avg
      1115 ± 16%     +37.8%       1538 ±  4%  sched_debug.cpu.sched_goidle.min
      3979 ± 13%     -27.4%       2888 ±  5%  sched_debug.cpu.ttwu_local.max
    348.96 ±  8%     -26.3%     257.16 ± 13%  sched_debug.cpu.ttwu_local.stddev
    115966 ± 12%     -21.3%      91214        sched_debug.cpu_clk
    113505 ± 12%     -21.8%      88773        sched_debug.ktime
    116416 ± 12%     -21.3%      91663        sched_debug.sched_clk
      0.26 ±100%      +0.3        0.57 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.29 ±100%      +0.4        0.66 ±  5%  perf-profile.calltrace.cycles-pp.find_get_entry.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter
      0.67 ± 65%      +0.4        1.11        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      0.69 ± 65%      +0.5        1.14        perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
      1.07 ± 57%      +0.5        1.61 ±  5%  perf-profile.calltrace.cycles-pp.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
      0.79 ± 64%      +0.5        1.33        perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
      0.73 ± 63%      +0.6        1.32 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      0.81 ± 63%      +0.6        1.43 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      0.06 ± 58%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.down_write_trylock
      0.06 ± 58%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.__x64_sys_write
      0.07 ± 58%      +0.0        0.11 ±  3%  perf-profile.children.cycles-pp.account_page_dirtied
      0.04 ± 57%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.account_page_cleaned
      0.06 ± 58%      +0.0        0.10 ±  7%  perf-profile.children.cycles-pp.free_pcppages_bulk
      0.10 ± 58%      +0.1        0.15 ±  6%  perf-profile.children.cycles-pp.page_mapping
      0.09 ± 57%      +0.1        0.14 ±  7%  perf-profile.children.cycles-pp.__lru_cache_add
      0.10 ± 57%      +0.1        0.15 ±  9%  perf-profile.children.cycles-pp.__might_sleep
      0.12 ± 58%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.set_page_dirty
      0.08 ± 64%      +0.1        0.15 ± 10%  perf-profile.children.cycles-pp.dquot_claim_space_nodirty
      0.06 ± 61%      +0.1        0.13 ±  5%  perf-profile.children.cycles-pp.percpu_counter_add_batch
      0.18 ± 57%      +0.1        0.27 ±  2%  perf-profile.children.cycles-pp.iov_iter_fault_in_readable
      0.17 ± 57%      +0.1        0.26 ±  2%  perf-profile.children.cycles-pp.__set_page_dirty_nobuffers
      0.09 ± 57%      +0.1        0.18 ± 27%  perf-profile.children.cycles-pp.free_unref_page_list
      0.16 ± 58%      +0.1        0.30 ± 18%  perf-profile.children.cycles-pp.__pagevec_release
      0.30 ± 57%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.add_to_page_cache_lru
      0.17 ± 58%      +0.1        0.31 ± 16%  perf-profile.children.cycles-pp.release_pages
      0.29 ± 58%      +0.2        0.45 ±  7%  perf-profile.children.cycles-pp.selinux_file_permission
      0.38 ± 57%      +0.2        0.58 ±  6%  perf-profile.children.cycles-pp.security_file_permission
      0.78 ± 57%      +0.3        1.12        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.80 ± 57%      +0.3        1.15        perf-profile.children.cycles-pp.copyin
      0.92 ± 57%      +0.4        1.34        perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
      0.98 ± 54%      +0.5        1.43 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.98 ± 53%      +0.5        1.50 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.64 ± 57%      +0.8        2.45 ±  5%  perf-profile.children.cycles-pp.pagecache_get_page
      0.04 ± 57%      +0.0        0.06        perf-profile.self.cycles-pp.__pagevec_lru_add_fn
      0.04 ± 58%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp.release_pages
      0.05 ± 58%      +0.0        0.08 ± 15%  perf-profile.self.cycles-pp._cond_resched
      0.04 ± 58%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.ksys_write
      0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.self.cycles-pp.down_write_trylock
      0.09 ± 58%      +0.1        0.14 ±  9%  perf-profile.self.cycles-pp.page_mapping
      0.01 ±173%      +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.__fdget_pos
      0.11 ± 57%      +0.1        0.17 ±  7%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.05 ± 59%      +0.1        0.12 ±  5%  perf-profile.self.cycles-pp.percpu_counter_add_batch
      0.12 ± 58%      +0.1        0.19 ±  4%  perf-profile.self.cycles-pp.iov_iter_copy_from_user_atomic
      0.17 ± 57%      +0.1        0.24 ±  4%  perf-profile.self.cycles-pp.generic_perform_write
      0.17 ± 58%      +0.1        0.26 ±  2%  perf-profile.self.cycles-pp.iov_iter_fault_in_readable
      0.19 ± 57%      +0.1        0.30 ±  2%  perf-profile.self.cycles-pp.f2fs_set_data_page_dirty
      0.18 ± 58%      +0.1        0.30 ±  4%  perf-profile.self.cycles-pp.pagecache_get_page
      0.27 ± 57%      +0.1        0.41 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
      0.40 ± 57%      +0.2        0.62 ±  5%  perf-profile.self.cycles-pp.find_get_entry
      0.77 ± 57%      +0.3        1.11        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      0.96 ± 54%      +0.5        1.43 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.98 ± 53%      +0.5        1.50 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.72 ± 59%      +0.5        1.26 ± 10%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
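
The %change column in the table above can be checked by hand. The sketch below is not part of the LKP tooling; it assumes %change is the relative difference between the parent commit (left column) and the tested commit (right column), and that ipc is perf-stat.instructions divided by perf-stat.cpu-cycles. The helper name pct_change is hypothetical. Rows whose metric name ends in "%" (for example mpstat.cpu.usr%, 1.73 to 2.13, shown as +0.4) appear to report an absolute difference in percentage points instead, so they are left out here.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# (metric, parent d6c66cd19e, commit 089842de57), copied from the table above.
rows = [
    ("aim7.jobs-per-min", 96213, 110996),
    ("aim7.time.elapsed_time", 191.50, 162.52),
    ("aim7.time.system_time", 6882, 5875),
]

for metric, base, new in rows:
    print(f"{metric:26s} {pct_change(base, new):+.1f}%")

# Instructions per cycle, recomputed from the raw perf-stat counters
# (perf-stat.instructions / perf-stat.cpu-cycles) for each commit.
ipc_base = 5.667e12 / 2.24e13
ipc_new = 5.029e12 / 1.873e13
print(f"ipc {ipc_base:.2f} -> {ipc_new:.2f} ({pct_change(ipc_base, ipc_new):+.1f}%)")
```

Under these assumptions the recomputed values round to the figures reported in the table (+15.4%, -15.1%, -14.6%, and an ipc of 0.25 versus 0.27).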


                                                                                
                                  aim7.jobs-per-min                             
                                                                                
  114000 +-+----------------------------------------------------------------+   
  112000 +-+     O                                                          |   
         O    O       O    O    O    O                    O  O O            |   
  110000 +-+       O    O     O    O    O              O          O         |   
  108000 +-+                                                                |   
         |                                 O O  O O  O                      |   
  106000 +-+O                                                               |   
  104000 +-+                                                                |   
  102000 +-+                                                                |   
         |                                                                  |   
  100000 +-+                                                                |   
   98000 +-+                                                                |   
         |.. .+..+.+..    .+.. .+.. .+..+..+.+.. .+..+.+..+..+.+..  +..     |   
   96000 +-++          .+.    +    +            +                  +   +.+..|   
   94000 +-+----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               aim7.time.system_time                            
                                                                                
  7200 +-+------------------------------------------------------------------+   
       |                                                                    |   
  7000 +-+         .+..     +..                                 .+..        |   
       | .+.     .+    +.. +     .+.     .+.  .+.     .+.     .+      .+.+..|   
  6800 +-+  +..+.         +    +.   +..+.   +.   +..+.   +..+.      +.      |   
       |                                                                    |   
  6600 +-+                                                                  |   
       |                                                                    |   
  6400 +-+                                                                  |   
       |  O                                                                 |   
  6200 +-+                                                                  |   
       |                                  O O  O O  O                       |   
  6000 +-+                  O     O                    O                    |   
       O    O     O O  O  O         O  O                    O  O O          |   
  5800 +-+-----O---------------O-------------------------O------------------+   
                                                                                
                                                                                                                                                                
                              aim7.time.elapsed_time                            
                                                                                
  205 +-+-------------------------------------------------------------------+   
      |                                                                  :: |   
  200 +-+                                                               : : |   
  195 +-+                                                               :  :|   
      |           .+..                                           +..   :   :|   
  190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |   
  185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |   
      |                                                                     |   
  180 +-+                                                                   |   
  175 +-+                                                                   |   
      |  O                                                                  |   
  170 +-+                                   O    O  O                       |   
  165 +-+                        O       O    O                             |   
      O    O     O O  O  O  O O     O O               O  O  O  O O          |   
  160 +-+-----O-------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                            aim7.time.elapsed_time.max                          
                                                                                
  205 +-+-------------------------------------------------------------------+   
      |                                                                  :: |   
  200 +-+                                                               : : |   
  195 +-+                                                               :  :|   
      |           .+..                                           +..   :   :|   
  190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |   
  185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |   
      |                                                                     |   
  180 +-+                                                                   |   
  175 +-+                                                                   |   
      |  O                                                                  |   
  170 +-+                                   O    O  O                       |   
  165 +-+                        O       O    O                             |   
      O    O     O O  O  O  O O     O O               O  O  O  O O          |   
  160 +-+-----O-------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                        aim7.time.involuntary_context_switches                  
                                                                                
  1.15e+06 +-+--------------------------------------------------------------+   
           |                   +..                                        + |   
   1.1e+06 +-++     .+.. .+.. +    .+..    .+.  .+     .+..    .+.       : +|   
           |.  +  .+    +    +    +     .+.   +.  +  .+    +.+.   +..+   :  |   
           |    +.                     +           +.                 + :   |   
  1.05e+06 +-+                                                         +    |   
           |                                                                |   
     1e+06 +-+                                                              |   
           |                                                                |   
    950000 +-+                                                              |   
           |                                          O                     |   
           O  O O    O         O    O    O              O         O         |   
    900000 +-+          O O  O         O    O O  O O       O O  O           |   
           |       O              O                                         |   
    850000 +-+--------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-ivb-ep01: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
  gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rr/aim7

commit: 
  d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
  089842de57 ("f2fs: remove codes of unused wio_mutex")

d6c66cd19ef322fe 089842de5750f434aa016eb23f 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
           :4           25%           1:4     kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#]
           :4           25%           1:4     kmsg.IP-Config:Reopening_network_devices
         %stddev     %change         %stddev
             \          |                \  
    102582            +8.8%     111626        aim7.jobs-per-min
    176.57            -8.5%     161.64        aim7.time.elapsed_time
    176.57            -8.5%     161.64        aim7.time.elapsed_time.max
   1060618           -12.5%     927723        aim7.time.involuntary_context_switches
      6408            -8.9%       5839        aim7.time.system_time
    785554            +4.5%     820987        aim7.time.voluntary_context_switches
   1077477            -9.5%     975130 ±  2%  softirqs.RCU
    184.77 ±  6%     +41.2%     260.90 ± 11%  iostat.md0.w/s
      6609 ±  2%      +9.6%       7246        iostat.md0.wkB/s
      0.00 ± 94%      +0.0        0.02 ± 28%  mpstat.cpu.soft%
      1.89 ±  4%      +0.3        2.15 ±  3%  mpstat.cpu.usr%
      6546 ± 19%     -49.1%       3328 ± 63%  numa-numastat.node0.other_node
      1470 ± 86%    +222.9%       4749 ± 45%  numa-numastat.node1.other_node
    959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.active_objs
    959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.num_objs
     38.35            +3.2%      39.57 ±  2%  turbostat.RAMWatt
      8800 ±  2%     -10.7%       7855 ±  3%  turbostat.SMI
    103925 ± 27%     -59.5%      42134 ± 61%  numa-meminfo.node0.AnonHugePages
     14267 ± 61%     -54.9%       6430 ± 76%  numa-meminfo.node0.Inactive(anon)
     52220 ± 18%    +104.0%     106522 ± 40%  numa-meminfo.node1.AnonHugePages
      6614 ±  2%      +9.6%       7248        vmstat.io.bo
    316.00 ±  2%     -15.4%     267.25 ±  8%  vmstat.procs.r
     12256 ±  2%      +6.9%      13098        vmstat.system.cs
      2852 ±  3%     +12.5%       3208 ±  3%  numa-vmstat.node0.nr_active_file
      3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_inactive_anon
      2852 ±  3%     +12.4%       3207 ±  3%  numa-vmstat.node0.nr_zone_active_file
      3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_zone_inactive_anon
     95337            +2.3%      97499        proc-vmstat.nr_active_anon
      5746 ±  2%      +4.3%       5990        proc-vmstat.nr_active_file
     89732            +2.0%      91532        proc-vmstat.nr_anon_pages
     95337            +2.3%      97499        proc-vmstat.nr_zone_active_anon
      5746 ±  2%      +4.3%       5990        proc-vmstat.nr_zone_active_file
     10407 ±  4%     -49.3%       5274 ± 52%  proc-vmstat.numa_hint_faults_local
    615058            -6.0%     578344 ±  2%  proc-vmstat.pgfault
 1.187e+12            -8.7%  1.084e+12        perf-stat.branch-instructions
      0.65 ±  3%      +0.0        0.70 ±  2%  perf-stat.branch-miss-rate%
   2219706            -2.5%    2164425        perf-stat.context-switches
 2.071e+13           -10.0%  1.864e+13        perf-stat.cpu-cycles
    641874            -2.7%     624703        perf-stat.cpu-migrations
 1.408e+12            -7.3%  1.305e+12        perf-stat.dTLB-loads
  39182891 ±  4%    +796.4%  3.512e+08 ±150%  perf-stat.iTLB-loads
 5.184e+12            -8.0%   4.77e+12        perf-stat.instructions
      5035 ±  2%     -14.1%       4325 ± 13%  perf-stat.instructions-per-iTLB-miss
    604219            -6.2%     566725        perf-stat.minor-faults
 4.962e+09            -2.7%  4.827e+09        perf-stat.node-stores
    604097            -6.2%     566730        perf-stat.page-faults
    110.81 ± 13%     +25.7%     139.25 ±  8%  sched_debug.cfs_rq:/.load_avg.stddev
     12.76 ± 74%    +114.6%      27.39 ± 38%  sched_debug.cfs_rq:/.removed.load_avg.avg
     54.23 ± 62%     +66.2%      90.10 ± 17%  sched_debug.cfs_rq:/.removed.load_avg.stddev
    585.18 ± 74%    +115.8%       1262 ± 38%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
      2489 ± 62%     +66.9%       4153 ± 17%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
     11909 ± 10%     +44.7%      17229 ± 18%  sched_debug.cfs_rq:/.runnable_weight.avg
      1401 ±  2%     +36.5%       1913 ±  5%  sched_debug.cpu.sched_goidle.avg
      2350 ±  2%     +21.9%       2863 ±  5%  sched_debug.cpu.sched_goidle.max
      1082 ±  5%     +39.2%       1506 ±  4%  sched_debug.cpu.sched_goidle.min
      7327           +14.7%       8401 ±  2%  sched_debug.cpu.ttwu_count.avg
      5719 ±  3%     +18.3%       6767 ±  2%  sched_debug.cpu.ttwu_count.min
      1518 ±  3%     +15.6%       1755 ±  3%  sched_debug.cpu.ttwu_local.min
     88.70            -1.0       87.65        perf-profile.calltrace.cycles-pp.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write
     54.51            -1.0       53.48        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write
     54.55            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
     56.32            -1.0       55.30        perf-profile.calltrace.cycles-pp.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
     54.54            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter
     88.93            -1.0       87.96        perf-profile.calltrace.cycles-pp.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write
     89.94            -0.8       89.14        perf-profile.calltrace.cycles-pp.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write.do_syscall_64
     90.01            -0.8       89.26        perf-profile.calltrace.cycles-pp.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     90.72            -0.7       90.00        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     90.59            -0.7       89.87        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     13.32            -0.3       13.01        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block
     13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block
     13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block.f2fs_write_begin
     13.26            -0.3       12.94        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks
      1.30 ±  2%      +0.1        1.40 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      2.20 ±  6%      +0.2        2.40 ±  3%  perf-profile.calltrace.cycles-pp.generic_file_read_iter.__vfs_read.vfs_read.ksys_read.do_syscall_64
      2.28 ±  5%      +0.2        2.52 ±  5%  perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.85 ±  4%      +0.3        3.16 ±  5%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.97 ±  4%      +0.3        3.31 ±  5%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
     88.74            -1.0       87.70        perf-profile.children.cycles-pp.generic_perform_write
     56.33            -1.0       55.31        perf-profile.children.cycles-pp.f2fs_write_end
     88.95            -1.0       87.98        perf-profile.children.cycles-pp.__generic_file_write_iter
     89.95            -0.8       89.15        perf-profile.children.cycles-pp.f2fs_file_write_iter
     90.03            -0.8       89.28        perf-profile.children.cycles-pp.__vfs_write
     90.73            -0.7       90.02        perf-profile.children.cycles-pp.ksys_write
     90.60            -0.7       89.89        perf-profile.children.cycles-pp.vfs_write
      0.22 ±  5%      -0.1        0.17 ± 19%  perf-profile.children.cycles-pp.f2fs_invalidate_page
      0.08 ± 10%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.page_mapping
      0.09            +0.0        0.11 ±  7%  perf-profile.children.cycles-pp.__cancel_dirty_page
      0.06 ±  6%      +0.0        0.09 ± 28%  perf-profile.children.cycles-pp.read_node_page
      0.10 ±  4%      +0.0        0.14 ± 14%  perf-profile.children.cycles-pp.current_time
      0.07 ± 12%      +0.0        0.11 ±  9%  perf-profile.children.cycles-pp.percpu_counter_add_batch
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.__x64_sys_write
      0.38 ±  3%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.selinux_file_permission
      0.55 ±  4%      +0.1        0.61 ±  4%  perf-profile.children.cycles-pp.security_file_permission
      1.30            +0.1        1.40 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      2.21 ±  6%      +0.2        2.41 ±  3%  perf-profile.children.cycles-pp.generic_file_read_iter
      2.29 ±  6%      +0.2        2.53 ±  5%  perf-profile.children.cycles-pp.__vfs_read
      2.86 ±  4%      +0.3        3.18 ±  5%  perf-profile.children.cycles-pp.vfs_read
      2.99 ±  4%      +0.3        3.32 ±  5%  perf-profile.children.cycles-pp.ksys_read
      0.37            -0.1        0.24 ± 23%  perf-profile.self.cycles-pp.__get_node_page
      0.21 ±  3%      -0.1        0.15 ± 16%  perf-profile.self.cycles-pp.f2fs_invalidate_page
      0.07 ±  5%      +0.0        0.09 ± 11%  perf-profile.self.cycles-pp.page_mapping
      0.06 ± 11%      +0.0        0.08 ±  8%  perf-profile.self.cycles-pp.vfs_read
      0.07 ±  7%      +0.0        0.10 ± 21%  perf-profile.self.cycles-pp.__generic_file_write_iter
      0.06 ± 14%      +0.0        0.10 ± 10%  perf-profile.self.cycles-pp.percpu_counter_add_batch
      0.20 ± 11%      +0.0        0.25 ± 12%  perf-profile.self.cycles-pp.selinux_file_permission
      0.05 ±  8%      +0.1        0.11 ± 52%  perf-profile.self.cycles-pp.__vfs_read
      0.33 ±  9%      +0.1        0.41 ±  9%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
      1.30            +0.1        1.40 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
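
For readers who want to post-process the perf-profile rows above, here is a minimal parsing sketch. It assumes each row has the layout: base value, optional "± N%" relative stddev, absolute change, new value, optional "± N%", metric name. That layout is inferred from the rows in this report. The names `ROW` and `parse_row` are hypothetical and not part of any LKP tool, and rows elsewhere in the report that express the change as a percentage are not handled.

```python
import re

# Assumed row layout (inferred from the rows in this report):
#   base [± N%]  change  new [± N%]  metric
ROW = re.compile(
    r"^\s*(?P<base>\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<delta>[+-]\d+(?:\.\d+)?)"
    r"\s+(?P<new>\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return a dict for one comparison row, or None if the line does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m["base"]),
        "base_stddev_pct": int(m["base_sd"]) if m["base_sd"] else None,
        "delta": float(m["delta"]),
        "new": float(m["new"]),
        "new_stddev_pct": int(m["new_sd"]) if m["new_sd"] else None,
        "metric": m["metric"],
    }


if __name__ == "__main__":
    sample = ("      1.30 ±  2%      +0.1        1.40 ±  2%  "
              "perf-profile.calltrace.cycles-pp.entry_SYSCALL_64")
    print(parse_row(sample))
```

The change column is rounded to one decimal place in the report, so a consistency check of new minus base against it should allow a small tolerance.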

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.20.0-rc4-00010-g089842d" of type "text/plain" (168529 bytes)

View attachment "job-script" of type "text/plain" (7812 bytes)

View attachment "job.yaml" of type "text/plain" (5428 bytes)

View attachment "reproduce" of type "text/plain" (1033 bytes)
