Open Source and information security mailing list archives
 
Message-ID: <9b0ce6db-b979-9fbb-08de-56c0339cce64@huawei.com>
Date:   Tue, 11 Dec 2018 18:12:25 +0800
From:   Chao Yu <yuchao0@...wei.com>
To:     kernel test robot <rong.a.chen@...el.com>,
        Yunlong Song <yunlong.song@...wei.com>
CC:     Jaegeuk Kim <jaegeuk@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        <linux-f2fs-devel@...ts.sourceforge.net>, <lkp@...org>
Subject: Re: [LKP] [f2fs] 089842de57: aim7.jobs-per-min 15.4% improvement

Hi all,

The commit only removes code that is currently unused, so why would it
improve performance? Could you retest to confirm the result?

Thanks,

On 2018/12/11 17:59, kernel test robot wrote:
> Greeting,
> 
> FYI, we noticed a 15.4% improvement of aim7.jobs-per-min due to commit:
> 
> 
> commit: 089842de5750f434aa016eb23f3d3a3a151083bd ("f2fs: remove codes of unused wio_mutex")
> https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test
> 
> in testcase: aim7
> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
> with following parameters:
> 
> 	disk: 4BRD_12G
> 	md: RAID1
> 	fs: f2fs
> 	test: disk_rw
> 	load: 3000
> 	cpufreq_governor: performance
> 
> test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+-----------------------------------------------------------------------+
> | testcase: change | aim7: aim7.jobs-per-min 8.8% improvement                              |
> | test machine     | 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory |
> | test parameters  | cpufreq_governor=performance                                          |
> |                  | disk=4BRD_12G                                                         |
> |                  | fs=f2fs                                                               |
> |                  | load=3000                                                             |
> |                  | md=RAID1                                                              |
> |                  | test=disk_rr                                                          |
> +------------------+-----------------------------------------------------------------------+
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> =========================================================================================
> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
>   gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rw/aim7
> 
> commit: 
>   d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
>   089842de57 ("f2fs: remove codes of unused wio_mutex")
> 
> d6c66cd19ef322fe 089842de5750f434aa016eb23f 
> ---------------- -------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>      96213           +15.4%     110996        aim7.jobs-per-min
>     191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time
>     191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time.max
>    1090253 ±  2%     -17.5%     899165        aim7.time.involuntary_context_switches
>     176713            -7.5%     163478        aim7.time.minor_page_faults
>       6882           -14.6%       5875        aim7.time.system_time
>     127.97            +4.7%     134.00        aim7.time.user_time
>     760923            +7.1%     814632        aim7.time.voluntary_context_switches
>      78499 ±  2%     -11.2%      69691        interrupts.CAL:Function_call_interrupts
>    3183861 ±  4%     -16.7%    2651390 ±  4%  softirqs.TIMER
>     191.54 ± 13%     +45.4%     278.59 ± 12%  iostat.md0.w/s
>       6118 ±  3%     +16.5%       7126 ±  2%  iostat.md0.wkB/s
>     151257 ±  2%     -10.1%     135958 ±  2%  meminfo.AnonHugePages
>      46754 ±  3%     +14.0%      53307 ±  3%  meminfo.max_used_kB
>       0.03 ± 62%      -0.0        0.01 ± 78%  mpstat.cpu.soft%
>       1.73 ±  3%      +0.4        2.13 ±  3%  mpstat.cpu.usr%
>   16062961 ±  2%     -12.1%   14124403 ±  2%  turbostat.IRQ
>       0.76 ± 37%     -71.8%       0.22 ± 83%  turbostat.Pkg%pc6
>       9435 ±  7%     -18.1%       7730 ±  4%  turbostat.SMI
>       6113 ±  3%     +16.5%       7120 ±  2%  vmstat.io.bo
>      11293 ±  2%     +12.3%      12688 ±  2%  vmstat.system.cs
>      81879 ±  2%      +2.5%      83951        vmstat.system.in
>       2584            -4.4%       2469 ±  2%  proc-vmstat.nr_active_file
>       2584            -4.4%       2469 ±  2%  proc-vmstat.nr_zone_active_file
>      28564 ±  4%     -23.6%      21817 ± 12%  proc-vmstat.numa_hint_faults
>      10958 ±  5%     -43.9%       6147 ± 26%  proc-vmstat.numa_hint_faults_local
>     660531 ±  3%     -10.7%     590059 ±  2%  proc-vmstat.pgfault
>       1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.active_objs
>       1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.num_objs
>      10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.active_objs
>      10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.num_objs
>      18395           +12.3%      20656 ±  8%  slabinfo.kmalloc-32.active_objs
>      18502 ±  2%     +12.3%      20787 ±  8%  slabinfo.kmalloc-32.num_objs
>  1.291e+12           -12.3%  1.131e+12        perf-stat.branch-instructions
>       0.66            +0.1        0.76 ±  3%  perf-stat.branch-miss-rate%
>  1.118e+10 ±  4%      -7.5%  1.034e+10        perf-stat.cache-misses
>  2.772e+10 ±  8%      -6.6%  2.589e+10        perf-stat.cache-references
>    2214958            -3.6%    2136237        perf-stat.context-switches
>       3.95 ±  2%      -5.8%       3.72        perf-stat.cpi
>   2.24e+13           -16.4%  1.873e+13        perf-stat.cpu-cycles
>  1.542e+12           -10.4%  1.382e+12        perf-stat.dTLB-loads
>       0.18 ±  6%      +0.0        0.19 ±  4%  perf-stat.dTLB-store-miss-rate%
>  5.667e+12           -11.3%  5.029e+12        perf-stat.instructions
>       5534           -13.1%       4809 ±  6%  perf-stat.instructions-per-iTLB-miss
>       0.25 ±  2%      +6.1%       0.27        perf-stat.ipc
>     647970 ±  2%     -10.7%     578955 ±  2%  perf-stat.minor-faults
>  2.783e+09 ± 18%     -17.8%  2.288e+09 ±  4%  perf-stat.node-loads
>  5.706e+09 ±  2%      -5.2%  5.407e+09        perf-stat.node-store-misses
>  7.693e+09            -4.4%  7.352e+09        perf-stat.node-stores
>     647979 ±  2%     -10.7%     578955 ±  2%  perf-stat.page-faults
>      70960 ± 16%     -26.6%      52062        sched_debug.cfs_rq:/.exec_clock.avg
>      70628 ± 16%     -26.7%      51787        sched_debug.cfs_rq:/.exec_clock.min
>      22499 ±  3%     -10.5%      20133 ±  3%  sched_debug.cfs_rq:/.load.avg
>       7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cfs_rq:/.load.min
>     362.19 ± 12%     +58.3%     573.50 ± 25%  sched_debug.cfs_rq:/.load_avg.max
>    3092960 ± 16%     -28.5%    2211400        sched_debug.cfs_rq:/.min_vruntime.avg
>    3244162 ± 15%     -27.0%    2367437 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
>    2984299 ± 16%     -28.9%    2121271        sched_debug.cfs_rq:/.min_vruntime.min
>       0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cfs_rq:/.nr_running.min
>       0.12 ± 13%    +114.6%       0.26 ±  9%  sched_debug.cfs_rq:/.nr_running.stddev
>       8.44 ± 23%     -36.8%       5.33 ± 15%  sched_debug.cfs_rq:/.nr_spread_over.max
>       1.49 ± 21%     -29.6%       1.05 ±  7%  sched_debug.cfs_rq:/.nr_spread_over.stddev
>      16.53 ± 20%     -38.8%      10.12 ± 23%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>      15259 ±  7%     -33.3%      10176 ± 22%  sched_debug.cfs_rq:/.runnable_weight.avg
>     796.65 ± 93%     -74.8%     200.68 ± 17%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>     669258 ±  3%     -13.3%     580068        sched_debug.cpu.avg_idle.avg
>     116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock.avg
>     116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock.max
>     115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock.min
>     116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock_task.avg
>     116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock_task.max
>     115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock_task.min
>      15.41 ±  4%     -32.0%      10.48 ± 24%  sched_debug.cpu.cpu_load[0].avg
>      15.71 ±  6%     -26.6%      11.53 ± 22%  sched_debug.cpu.cpu_load[1].avg
>      16.20 ±  8%     -22.9%      12.49 ± 21%  sched_debug.cpu.cpu_load[2].avg
>      16.92 ±  7%     -21.2%      13.33 ± 21%  sched_debug.cpu.cpu_load[3].avg
>       2650 ±  6%     -15.6%       2238 ±  3%  sched_debug.cpu.curr->pid.avg
>       1422 ±  8%     -68.5%     447.42 ± 57%  sched_debug.cpu.curr->pid.min
>       7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cpu.load.min
>      86066 ± 14%     -26.3%      63437        sched_debug.cpu.nr_load_updates.min
>       3.97 ± 88%     -70.9%       1.15 ± 10%  sched_debug.cpu.nr_running.avg
>       0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cpu.nr_running.min
>       1126 ± 16%     -27.6%     816.02 ±  9%  sched_debug.cpu.sched_count.stddev
>       1468 ± 16%     +31.1%       1925 ±  5%  sched_debug.cpu.sched_goidle.avg
>       1115 ± 16%     +37.8%       1538 ±  4%  sched_debug.cpu.sched_goidle.min
>       3979 ± 13%     -27.4%       2888 ±  5%  sched_debug.cpu.ttwu_local.max
>     348.96 ±  8%     -26.3%     257.16 ± 13%  sched_debug.cpu.ttwu_local.stddev
>     115966 ± 12%     -21.3%      91214        sched_debug.cpu_clk
>     113505 ± 12%     -21.8%      88773        sched_debug.ktime
>     116416 ± 12%     -21.3%      91663        sched_debug.sched_clk
>       0.26 ±100%      +0.3        0.57 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.29 ±100%      +0.4        0.66 ±  5%  perf-profile.calltrace.cycles-pp.find_get_entry.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter
>       0.67 ± 65%      +0.4        1.11        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
>       0.69 ± 65%      +0.5        1.14        perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>       1.07 ± 57%      +0.5        1.61 ±  5%  perf-profile.calltrace.cycles-pp.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>       0.79 ± 64%      +0.5        1.33        perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
>       0.73 ± 63%      +0.6        1.32 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
>       0.81 ± 63%      +0.6        1.43 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
>       0.06 ± 58%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.__pagevec_lru_add_fn
>       0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.down_write_trylock
>       0.06 ± 58%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.__x64_sys_write
>       0.07 ± 58%      +0.0        0.11 ±  3%  perf-profile.children.cycles-pp.account_page_dirtied
>       0.04 ± 57%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.account_page_cleaned
>       0.06 ± 58%      +0.0        0.10 ±  7%  perf-profile.children.cycles-pp.free_pcppages_bulk
>       0.10 ± 58%      +0.1        0.15 ±  6%  perf-profile.children.cycles-pp.page_mapping
>       0.09 ± 57%      +0.1        0.14 ±  7%  perf-profile.children.cycles-pp.__lru_cache_add
>       0.10 ± 57%      +0.1        0.15 ±  9%  perf-profile.children.cycles-pp.__might_sleep
>       0.12 ± 58%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.set_page_dirty
>       0.08 ± 64%      +0.1        0.15 ± 10%  perf-profile.children.cycles-pp.dquot_claim_space_nodirty
>       0.06 ± 61%      +0.1        0.13 ±  5%  perf-profile.children.cycles-pp.percpu_counter_add_batch
>       0.18 ± 57%      +0.1        0.27 ±  2%  perf-profile.children.cycles-pp.iov_iter_fault_in_readable
>       0.17 ± 57%      +0.1        0.26 ±  2%  perf-profile.children.cycles-pp.__set_page_dirty_nobuffers
>       0.09 ± 57%      +0.1        0.18 ± 27%  perf-profile.children.cycles-pp.free_unref_page_list
>       0.16 ± 58%      +0.1        0.30 ± 18%  perf-profile.children.cycles-pp.__pagevec_release
>       0.30 ± 57%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.add_to_page_cache_lru
>       0.17 ± 58%      +0.1        0.31 ± 16%  perf-profile.children.cycles-pp.release_pages
>       0.29 ± 58%      +0.2        0.45 ±  7%  perf-profile.children.cycles-pp.selinux_file_permission
>       0.38 ± 57%      +0.2        0.58 ±  6%  perf-profile.children.cycles-pp.security_file_permission
>       0.78 ± 57%      +0.3        1.12        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       0.80 ± 57%      +0.3        1.15        perf-profile.children.cycles-pp.copyin
>       0.92 ± 57%      +0.4        1.34        perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
>       0.98 ± 54%      +0.5        1.43 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64
>       0.98 ± 53%      +0.5        1.50 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       1.64 ± 57%      +0.8        2.45 ±  5%  perf-profile.children.cycles-pp.pagecache_get_page
>       0.04 ± 57%      +0.0        0.06        perf-profile.self.cycles-pp.__pagevec_lru_add_fn
>       0.04 ± 58%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp.release_pages
>       0.05 ± 58%      +0.0        0.08 ± 15%  perf-profile.self.cycles-pp._cond_resched
>       0.04 ± 58%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.ksys_write
>       0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.self.cycles-pp.down_write_trylock
>       0.09 ± 58%      +0.1        0.14 ±  9%  perf-profile.self.cycles-pp.page_mapping
>       0.01 ±173%      +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.__fdget_pos
>       0.11 ± 57%      +0.1        0.17 ±  7%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.05 ± 59%      +0.1        0.12 ±  5%  perf-profile.self.cycles-pp.percpu_counter_add_batch
>       0.12 ± 58%      +0.1        0.19 ±  4%  perf-profile.self.cycles-pp.iov_iter_copy_from_user_atomic
>       0.17 ± 57%      +0.1        0.24 ±  4%  perf-profile.self.cycles-pp.generic_perform_write
>       0.17 ± 58%      +0.1        0.26 ±  2%  perf-profile.self.cycles-pp.iov_iter_fault_in_readable
>       0.19 ± 57%      +0.1        0.30 ±  2%  perf-profile.self.cycles-pp.f2fs_set_data_page_dirty
>       0.18 ± 58%      +0.1        0.30 ±  4%  perf-profile.self.cycles-pp.pagecache_get_page
>       0.27 ± 57%      +0.1        0.41 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
>       0.40 ± 57%      +0.2        0.62 ±  5%  perf-profile.self.cycles-pp.find_get_entry
>       0.77 ± 57%      +0.3        1.11        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>       0.96 ± 54%      +0.5        1.43 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64
>       0.98 ± 53%      +0.5        1.50 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       0.72 ± 59%      +0.5        1.26 ± 10%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
> 
> 
>                                                                                 
>                                   aim7.jobs-per-min                             
>                                                                                 
>   114000 +-+----------------------------------------------------------------+   
>   112000 +-+     O                                                          |   
>          O    O       O    O    O    O                    O  O O            |   
>   110000 +-+       O    O     O    O    O              O          O         |   
>   108000 +-+                                                                |   
>          |                                 O O  O O  O                      |   
>   106000 +-+O                                                               |   
>   104000 +-+                                                                |   
>   102000 +-+                                                                |   
>          |                                                                  |   
>   100000 +-+                                                                |   
>    98000 +-+                                                                |   
>          |.. .+..+.+..    .+.. .+.. .+..+..+.+.. .+..+.+..+..+.+..  +..     |   
>    96000 +-++          .+.    +    +            +                  +   +.+..|   
>    94000 +-+----------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                                aim7.time.system_time                            
>                                                                                 
>   7200 +-+------------------------------------------------------------------+   
>        |                                                                    |   
>   7000 +-+         .+..     +..                                 .+..        |   
>        | .+.     .+    +.. +     .+.     .+.  .+.     .+.     .+      .+.+..|   
>   6800 +-+  +..+.         +    +.   +..+.   +.   +..+.   +..+.      +.      |   
>        |                                                                    |   
>   6600 +-+                                                                  |   
>        |                                                                    |   
>   6400 +-+                                                                  |   
>        |  O                                                                 |   
>   6200 +-+                                                                  |   
>        |                                  O O  O O  O                       |   
>   6000 +-+                  O     O                    O                    |   
>        O    O     O O  O  O         O  O                    O  O O          |   
>   5800 +-+-----O---------------O-------------------------O------------------+   
>                                                                                 
>                                                                                                                                                                 
>                               aim7.time.elapsed_time                            
>                                                                                 
>   205 +-+-------------------------------------------------------------------+   
>       |                                                                  :: |   
>   200 +-+                                                               : : |   
>   195 +-+                                                               :  :|   
>       |           .+..                                           +..   :   :|   
>   190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |   
>   185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |   
>       |                                                                     |   
>   180 +-+                                                                   |   
>   175 +-+                                                                   |   
>       |  O                                                                  |   
>   170 +-+                                   O    O  O                       |   
>   165 +-+                        O       O    O                             |   
>       O    O     O O  O  O  O O     O O               O  O  O  O O          |   
>   160 +-+-----O-------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                             aim7.time.elapsed_time.max                          
>                                                                                 
>   205 +-+-------------------------------------------------------------------+   
>       |                                                                  :: |   
>   200 +-+                                                               : : |   
>   195 +-+                                                               :  :|   
>       |           .+..                                           +..   :   :|   
>   190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |   
>   185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |   
>       |                                                                     |   
>   180 +-+                                                                   |   
>   175 +-+                                                                   |   
>       |  O                                                                  |   
>   170 +-+                                   O    O  O                       |   
>   165 +-+                        O       O    O                             |   
>       O    O     O O  O  O  O O     O O               O  O  O  O O          |   
>   160 +-+-----O-------------------------------------------------------------+   
>                                                                                 
>                                                                                                                                                                 
>                         aim7.time.involuntary_context_switches                  
>                                                                                 
>   1.15e+06 +-+--------------------------------------------------------------+   
>            |                   +..                                        + |   
>    1.1e+06 +-++     .+.. .+.. +    .+..    .+.  .+     .+..    .+.       : +|   
>            |.  +  .+    +    +    +     .+.   +.  +  .+    +.+.   +..+   :  |   
>            |    +.                     +           +.                 + :   |   
>   1.05e+06 +-+                                                         +    |   
>            |                                                                |   
>      1e+06 +-+                                                              |   
>            |                                                                |   
>     950000 +-+                                                              |   
>            |                                          O                     |   
>            O  O O    O         O    O    O              O         O         |   
>     900000 +-+          O O  O         O    O O  O O       O O  O           |   
>            |       O              O                                         |   
>     850000 +-+--------------------------------------------------------------+   
>                                                                                 
>                                                                                 
> [*] bisect-good sample
> [O] bisect-bad  sample
> 
> ***************************************************************************************************
> lkp-ivb-ep01: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
> =========================================================================================
> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
>   gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rr/aim7
> 
> commit: 
>   d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
>   089842de57 ("f2fs: remove codes of unused wio_mutex")
> 
> d6c66cd19ef322fe 089842de5750f434aa016eb23f 
> ---------------- -------------------------- 
>        fail:runs  %reproduction    fail:runs
>            |             |             |    
>            :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>            :4           25%           1:4     kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#]
>            :4           25%           1:4     kmsg.IP-Config:Reopening_network_devices
>          %stddev     %change         %stddev
>              \          |                \  
>     102582            +8.8%     111626        aim7.jobs-per-min
>     176.57            -8.5%     161.64        aim7.time.elapsed_time
>     176.57            -8.5%     161.64        aim7.time.elapsed_time.max
>    1060618           -12.5%     927723        aim7.time.involuntary_context_switches
>       6408            -8.9%       5839        aim7.time.system_time
>     785554            +4.5%     820987        aim7.time.voluntary_context_switches
>    1077477            -9.5%     975130 ±  2%  softirqs.RCU
>     184.77 ±  6%     +41.2%     260.90 ± 11%  iostat.md0.w/s
>       6609 ±  2%      +9.6%       7246        iostat.md0.wkB/s
>       0.00 ± 94%      +0.0        0.02 ± 28%  mpstat.cpu.soft%
>       1.89 ±  4%      +0.3        2.15 ±  3%  mpstat.cpu.usr%
>       6546 ± 19%     -49.1%       3328 ± 63%  numa-numastat.node0.other_node
>       1470 ± 86%    +222.9%       4749 ± 45%  numa-numastat.node1.other_node
>     959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.active_objs
>     959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.num_objs
>      38.35            +3.2%      39.57 ±  2%  turbostat.RAMWatt
>       8800 ±  2%     -10.7%       7855 ±  3%  turbostat.SMI
>     103925 ± 27%     -59.5%      42134 ± 61%  numa-meminfo.node0.AnonHugePages
>      14267 ± 61%     -54.9%       6430 ± 76%  numa-meminfo.node0.Inactive(anon)
>      52220 ± 18%    +104.0%     106522 ± 40%  numa-meminfo.node1.AnonHugePages
>       6614 ±  2%      +9.6%       7248        vmstat.io.bo
>     316.00 ±  2%     -15.4%     267.25 ±  8%  vmstat.procs.r
>      12256 ±  2%      +6.9%      13098        vmstat.system.cs
>       2852 ±  3%     +12.5%       3208 ±  3%  numa-vmstat.node0.nr_active_file
>       3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_inactive_anon
>       2852 ±  3%     +12.4%       3207 ±  3%  numa-vmstat.node0.nr_zone_active_file
>       3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_zone_inactive_anon
>      95337            +2.3%      97499        proc-vmstat.nr_active_anon
>       5746 ±  2%      +4.3%       5990        proc-vmstat.nr_active_file
>      89732            +2.0%      91532        proc-vmstat.nr_anon_pages
>      95337            +2.3%      97499        proc-vmstat.nr_zone_active_anon
>       5746 ±  2%      +4.3%       5990        proc-vmstat.nr_zone_active_file
>      10407 ±  4%     -49.3%       5274 ± 52%  proc-vmstat.numa_hint_faults_local
>     615058            -6.0%     578344 ±  2%  proc-vmstat.pgfault
>  1.187e+12            -8.7%  1.084e+12        perf-stat.branch-instructions
>       0.65 ±  3%      +0.0        0.70 ±  2%  perf-stat.branch-miss-rate%
>    2219706            -2.5%    2164425        perf-stat.context-switches
>  2.071e+13           -10.0%  1.864e+13        perf-stat.cpu-cycles
>     641874            -2.7%     624703        perf-stat.cpu-migrations
>  1.408e+12            -7.3%  1.305e+12        perf-stat.dTLB-loads
>   39182891 ±  4%    +796.4%  3.512e+08 ±150%  perf-stat.iTLB-loads
>  5.184e+12            -8.0%   4.77e+12        perf-stat.instructions
>       5035 ±  2%     -14.1%       4325 ± 13%  perf-stat.instructions-per-iTLB-miss
>     604219            -6.2%     566725        perf-stat.minor-faults
>  4.962e+09            -2.7%  4.827e+09        perf-stat.node-stores
>     604097            -6.2%     566730        perf-stat.page-faults
>     110.81 ± 13%     +25.7%     139.25 ±  8%  sched_debug.cfs_rq:/.load_avg.stddev
>      12.76 ± 74%    +114.6%      27.39 ± 38%  sched_debug.cfs_rq:/.removed.load_avg.avg
>      54.23 ± 62%     +66.2%      90.10 ± 17%  sched_debug.cfs_rq:/.removed.load_avg.stddev
>     585.18 ± 74%    +115.8%       1262 ± 38%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
>       2489 ± 62%     +66.9%       4153 ± 17%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
>      11909 ± 10%     +44.7%      17229 ± 18%  sched_debug.cfs_rq:/.runnable_weight.avg
>       1401 ±  2%     +36.5%       1913 ±  5%  sched_debug.cpu.sched_goidle.avg
>       2350 ±  2%     +21.9%       2863 ±  5%  sched_debug.cpu.sched_goidle.max
>       1082 ±  5%     +39.2%       1506 ±  4%  sched_debug.cpu.sched_goidle.min
>       7327           +14.7%       8401 ±  2%  sched_debug.cpu.ttwu_count.avg
>       5719 ±  3%     +18.3%       6767 ±  2%  sched_debug.cpu.ttwu_count.min
>       1518 ±  3%     +15.6%       1755 ±  3%  sched_debug.cpu.ttwu_local.min
>      88.70            -1.0       87.65        perf-profile.calltrace.cycles-pp.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write
>      54.51            -1.0       53.48        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write
>      54.55            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>      56.32            -1.0       55.30        perf-profile.calltrace.cycles-pp.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
>      54.54            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter
>      88.93            -1.0       87.96        perf-profile.calltrace.cycles-pp.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write
>      89.94            -0.8       89.14        perf-profile.calltrace.cycles-pp.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write.do_syscall_64
>      90.01            -0.8       89.26        perf-profile.calltrace.cycles-pp.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      90.72            -0.7       90.00        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      90.59            -0.7       89.87        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      13.32            -0.3       13.01        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block
>      13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block
>      13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block.f2fs_write_begin
>      13.26            -0.3       12.94        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks
>       1.30 ±  2%      +0.1        1.40 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
>       2.20 ±  6%      +0.2        2.40 ±  3%  perf-profile.calltrace.cycles-pp.generic_file_read_iter.__vfs_read.vfs_read.ksys_read.do_syscall_64
>       2.28 ±  5%      +0.2        2.52 ±  5%  perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       2.85 ±  4%      +0.3        3.16 ±  5%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       2.97 ±  4%      +0.3        3.31 ±  5%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      88.74            -1.0       87.70        perf-profile.children.cycles-pp.generic_perform_write
>      56.33            -1.0       55.31        perf-profile.children.cycles-pp.f2fs_write_end
>      88.95            -1.0       87.98        perf-profile.children.cycles-pp.__generic_file_write_iter
>      89.95            -0.8       89.15        perf-profile.children.cycles-pp.f2fs_file_write_iter
>      90.03            -0.8       89.28        perf-profile.children.cycles-pp.__vfs_write
>      90.73            -0.7       90.02        perf-profile.children.cycles-pp.ksys_write
>      90.60            -0.7       89.89        perf-profile.children.cycles-pp.vfs_write
>       0.22 ±  5%      -0.1        0.17 ± 19%  perf-profile.children.cycles-pp.f2fs_invalidate_page
>       0.08 ± 10%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.page_mapping
>       0.09            +0.0        0.11 ±  7%  perf-profile.children.cycles-pp.__cancel_dirty_page
>       0.06 ±  6%      +0.0        0.09 ± 28%  perf-profile.children.cycles-pp.read_node_page
>       0.10 ±  4%      +0.0        0.14 ± 14%  perf-profile.children.cycles-pp.current_time
>       0.07 ± 12%      +0.0        0.11 ±  9%  perf-profile.children.cycles-pp.percpu_counter_add_batch
>       0.00            +0.1        0.05        perf-profile.children.cycles-pp.__x64_sys_write
>       0.38 ±  3%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.selinux_file_permission
>       0.55 ±  4%      +0.1        0.61 ±  4%  perf-profile.children.cycles-pp.security_file_permission
>       1.30            +0.1        1.40 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
>       2.21 ±  6%      +0.2        2.41 ±  3%  perf-profile.children.cycles-pp.generic_file_read_iter
>       2.29 ±  6%      +0.2        2.53 ±  5%  perf-profile.children.cycles-pp.__vfs_read
>       2.86 ±  4%      +0.3        3.18 ±  5%  perf-profile.children.cycles-pp.vfs_read
>       2.99 ±  4%      +0.3        3.32 ±  5%  perf-profile.children.cycles-pp.ksys_read
>       0.37            -0.1        0.24 ± 23%  perf-profile.self.cycles-pp.__get_node_page
>       0.21 ±  3%      -0.1        0.15 ± 16%  perf-profile.self.cycles-pp.f2fs_invalidate_page
>       0.07 ±  5%      +0.0        0.09 ± 11%  perf-profile.self.cycles-pp.page_mapping
>       0.06 ± 11%      +0.0        0.08 ±  8%  perf-profile.self.cycles-pp.vfs_read
>       0.07 ±  7%      +0.0        0.10 ± 21%  perf-profile.self.cycles-pp.__generic_file_write_iter
>       0.06 ± 14%      +0.0        0.10 ± 10%  perf-profile.self.cycles-pp.percpu_counter_add_batch
>       0.20 ± 11%      +0.0        0.25 ± 12%  perf-profile.self.cycles-pp.selinux_file_permission
>       0.05 ±  8%      +0.1        0.11 ± 52%  perf-profile.self.cycles-pp.__vfs_read
>       0.33 ±  9%      +0.1        0.41 ±  9%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
>       1.30            +0.1        1.40 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Rong Chen
> 
