Date:   Wed, 12 Dec 2018 11:23:23 +0800
From:   Chao Yu <yuchao0@...wei.com>
To:     Rong Chen <rong.a.chen@...el.com>,
        Yunlong Song <yunlong.song@...wei.com>
CC:     Jaegeuk Kim <jaegeuk@...nel.org>, <lkp@...org>,
        LKML <linux-kernel@...r.kernel.org>,
        <linux-f2fs-devel@...ts.sourceforge.net>
Subject: Re: [LKP] [f2fs] 089842de57: aim7.jobs-per-min 15.4% improvement

On 2018/12/12 11:01, Rong Chen wrote:
> 
> 
> On 12/11/2018 06:12 PM, Chao Yu wrote:
>> Hi all,
>>
>> The commit only cleans up code that is currently unused, so how could it
>> improve performance? Could you retest to make sure?
> 
> Hi Chao,
> 
> the improvement does exist in the 0day environment.

Hi Rong,

Logically, the deleted code is dead, and removing it shouldn't impact any
code path.

Instead, I expect the patch below to improve performance in some cases:

https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?h=dev-test&id=1e771e83ce26d0ba2ce6c7df7effb7822f032c4a

So I suspect there is a problem with the current test result; I'm not sure
whether the problem is in the test suite or the test environment.

Any thoughts?

Thanks,

> 
> ➜  job cat 
> /result/aim7/4BRD_12G-RAID1-f2fs-disk_rw-3000-performance/lkp-ivb-ep01/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/089842de5750f434aa016eb23f3d3a3a151083bd/*/aim7.json 
> | grep -A1 min\"
>    "aim7.jobs-per-min": [
>      111406.82
> --
>    "aim7.jobs-per-min": [
>      110851.09
> --
>    "aim7.jobs-per-min": [
>      111399.93
> --
>    "aim7.jobs-per-min": [
>      110327.92
> --
>    "aim7.jobs-per-min": [
>      110321.16
> 
> ➜  job cat 
> /result/aim7/4BRD_12G-RAID1-f2fs-disk_rw-3000-performance/lkp-ivb-ep01/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/d6c66cd19ef322fe0d51ba09ce1b7f386acab04a/*/aim7.json 
> | grep -A1 min\"
>    "aim7.jobs-per-min": [
>      97082.14
> --
>    "aim7.jobs-per-min": [
>      95959.06
> --
>    "aim7.jobs-per-min": [
>      95959.06
> --
>    "aim7.jobs-per-min": [
>      95851.75
> --
>    "aim7.jobs-per-min": [
>      96946.19
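For reference, averaging the five samples quoted above for each commit does reproduce a gap in the same ballpark as the reported 15.4% (a quick sanity-check sketch on the quoted numbers, not part of the original report):

```python
# aim7.jobs-per-min samples quoted above, per commit
with_patch = [111406.82, 110851.09, 111399.93, 110327.92, 110321.16]  # 089842de57
without_patch = [97082.14, 95959.06, 95959.06, 95851.75, 96946.19]    # d6c66cd19e

mean_with = sum(with_patch) / len(with_patch)
mean_without = sum(without_patch) / len(without_patch)
change_pct = (mean_with - mean_without) / mean_without * 100

print(f"with patch:    {mean_with:.2f}")
print(f"without patch: {mean_without:.2f}")
print(f"change:        {change_pct:+.1f}%")
```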
> 
> Best Regards,
> Rong Chen
> 
>>
>> Thanks,
>>
>> On 2018/12/11 17:59, kernel test robot wrote:
>>> Greeting,
>>>
>>> FYI, we noticed a 15.4% improvement of aim7.jobs-per-min due to commit:
>>>
>>>
>>> commit: 089842de5750f434aa016eb23f3d3a3a151083bd ("f2fs: remove codes of unused wio_mutex")
>>> https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs.git dev-test
>>>
>>> in testcase: aim7
>>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
>>> with following parameters:
>>>
>>> 	disk: 4BRD_12G
>>> 	md: RAID1
>>> 	fs: f2fs
>>> 	test: disk_rw
>>> 	load: 3000
>>> 	cpufreq_governor: performance
>>>
>>> test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
>>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
>>>
>>> In addition, the commit also has a significant impact on the following tests:
>>>
>>> +------------------+-----------------------------------------------------------------------+
>>> | testcase: change | aim7: aim7.jobs-per-min 8.8% improvement                              |
>>> | test machine     | 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory |
>>> | test parameters  | cpufreq_governor=performance                                          |
>>> |                  | disk=4BRD_12G                                                         |
>>> |                  | fs=f2fs                                                               |
>>> |                  | load=3000                                                             |
>>> |                  | md=RAID1                                                              |
>>> |                  | test=disk_rr                                                          |
>>> +------------------+-----------------------------------------------------------------------+
>>>
>>>
>>> Details are as below:
>>> -------------------------------------------------------------------------------------------------->
>>>
>>>
>>> To reproduce:
>>>
>>>          git clone https://github.com/intel/lkp-tests.git
>>>          cd lkp-tests
>>>          bin/lkp install job.yaml  # job file is attached in this email
>>>          bin/lkp run     job.yaml
>>>
>>> =========================================================================================
>>> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
>>>    gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rw/aim7
>>>
>>> commit:
>>>    d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
>>>    089842de57 ("f2fs: remove codes of unused wio_mutex")
>>>
>>> d6c66cd19ef322fe 089842de5750f434aa016eb23f
>>> ---------------- --------------------------
>>>           %stddev     %change         %stddev
>>>               \          |                \
>>>       96213           +15.4%     110996        aim7.jobs-per-min
>>>      191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time
>>>      191.50 ±  3%     -15.1%     162.52        aim7.time.elapsed_time.max
>>>     1090253 ±  2%     -17.5%     899165        aim7.time.involuntary_context_switches
>>>      176713            -7.5%     163478        aim7.time.minor_page_faults
>>>        6882           -14.6%       5875        aim7.time.system_time
>>>      127.97            +4.7%     134.00        aim7.time.user_time
>>>      760923            +7.1%     814632        aim7.time.voluntary_context_switches
>>>       78499 ±  2%     -11.2%      69691        interrupts.CAL:Function_call_interrupts
>>>     3183861 ±  4%     -16.7%    2651390 ±  4%  softirqs.TIMER
>>>      191.54 ± 13%     +45.4%     278.59 ± 12%  iostat.md0.w/s
>>>        6118 ±  3%     +16.5%       7126 ±  2%  iostat.md0.wkB/s
>>>      151257 ±  2%     -10.1%     135958 ±  2%  meminfo.AnonHugePages
>>>       46754 ±  3%     +14.0%      53307 ±  3%  meminfo.max_used_kB
>>>        0.03 ± 62%      -0.0        0.01 ± 78%  mpstat.cpu.soft%
>>>        1.73 ±  3%      +0.4        2.13 ±  3%  mpstat.cpu.usr%
>>>    16062961 ±  2%     -12.1%   14124403 ±  2%  turbostat.IRQ
>>>        0.76 ± 37%     -71.8%       0.22 ± 83%  turbostat.Pkg%pc6
>>>        9435 ±  7%     -18.1%       7730 ±  4%  turbostat.SMI
>>>        6113 ±  3%     +16.5%       7120 ±  2%  vmstat.io.bo
>>>       11293 ±  2%     +12.3%      12688 ±  2%  vmstat.system.cs
>>>       81879 ±  2%      +2.5%      83951        vmstat.system.in
>>>        2584            -4.4%       2469 ±  2%  proc-vmstat.nr_active_file
>>>        2584            -4.4%       2469 ±  2%  proc-vmstat.nr_zone_active_file
>>>       28564 ±  4%     -23.6%      21817 ± 12%  proc-vmstat.numa_hint_faults
>>>       10958 ±  5%     -43.9%       6147 ± 26%  proc-vmstat.numa_hint_faults_local
>>>      660531 ±  3%     -10.7%     590059 ±  2%  proc-vmstat.pgfault
>>>        1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.active_objs
>>>        1191 ±  7%     -16.5%     995.25 ± 12%  slabinfo.UNIX.num_objs
>>>       10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.active_objs
>>>       10552 ±  4%      -7.8%       9729        slabinfo.ext4_io_end.num_objs
>>>       18395           +12.3%      20656 ±  8%  slabinfo.kmalloc-32.active_objs
>>>       18502 ±  2%     +12.3%      20787 ±  8%  slabinfo.kmalloc-32.num_objs
>>>   1.291e+12           -12.3%  1.131e+12        perf-stat.branch-instructions
>>>        0.66            +0.1        0.76 ±  3%  perf-stat.branch-miss-rate%
>>>   1.118e+10 ±  4%      -7.5%  1.034e+10        perf-stat.cache-misses
>>>   2.772e+10 ±  8%      -6.6%  2.589e+10        perf-stat.cache-references
>>>     2214958            -3.6%    2136237        perf-stat.context-switches
>>>        3.95 ±  2%      -5.8%       3.72        perf-stat.cpi
>>>    2.24e+13           -16.4%  1.873e+13        perf-stat.cpu-cycles
>>>   1.542e+12           -10.4%  1.382e+12        perf-stat.dTLB-loads
>>>        0.18 ±  6%      +0.0        0.19 ±  4%  perf-stat.dTLB-store-miss-rate%
>>>   5.667e+12           -11.3%  5.029e+12        perf-stat.instructions
>>>        5534           -13.1%       4809 ±  6%  perf-stat.instructions-per-iTLB-miss
>>>        0.25 ±  2%      +6.1%       0.27        perf-stat.ipc
>>>      647970 ±  2%     -10.7%     578955 ±  2%  perf-stat.minor-faults
>>>   2.783e+09 ± 18%     -17.8%  2.288e+09 ±  4%  perf-stat.node-loads
>>>   5.706e+09 ±  2%      -5.2%  5.407e+09        perf-stat.node-store-misses
>>>   7.693e+09            -4.4%  7.352e+09        perf-stat.node-stores
>>>      647979 ±  2%     -10.7%     578955 ±  2%  perf-stat.page-faults
>>>       70960 ± 16%     -26.6%      52062        sched_debug.cfs_rq:/.exec_clock.avg
>>>       70628 ± 16%     -26.7%      51787        sched_debug.cfs_rq:/.exec_clock.min
>>>       22499 ±  3%     -10.5%      20133 ±  3%  sched_debug.cfs_rq:/.load.avg
>>>        7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cfs_rq:/.load.min
>>>      362.19 ± 12%     +58.3%     573.50 ± 25%  sched_debug.cfs_rq:/.load_avg.max
>>>     3092960 ± 16%     -28.5%    2211400        sched_debug.cfs_rq:/.min_vruntime.avg
>>>     3244162 ± 15%     -27.0%    2367437 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
>>>     2984299 ± 16%     -28.9%    2121271        sched_debug.cfs_rq:/.min_vruntime.min
>>>        0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cfs_rq:/.nr_running.min
>>>        0.12 ± 13%    +114.6%       0.26 ±  9%  sched_debug.cfs_rq:/.nr_running.stddev
>>>        8.44 ± 23%     -36.8%       5.33 ± 15%  sched_debug.cfs_rq:/.nr_spread_over.max
>>>        1.49 ± 21%     -29.6%       1.05 ±  7%  sched_debug.cfs_rq:/.nr_spread_over.stddev
>>>       16.53 ± 20%     -38.8%      10.12 ± 23%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>>>       15259 ±  7%     -33.3%      10176 ± 22%  sched_debug.cfs_rq:/.runnable_weight.avg
>>>      796.65 ± 93%     -74.8%     200.68 ± 17%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>>>      669258 ±  3%     -13.3%     580068        sched_debug.cpu.avg_idle.avg
>>>      116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock.avg
>>>      116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock.max
>>>      115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock.min
>>>      116020 ± 12%     -21.4%      91239        sched_debug.cpu.clock_task.avg
>>>      116076 ± 12%     -21.4%      91261        sched_debug.cpu.clock_task.max
>>>      115967 ± 12%     -21.3%      91215        sched_debug.cpu.clock_task.min
>>>       15.41 ±  4%     -32.0%      10.48 ± 24%  sched_debug.cpu.cpu_load[0].avg
>>>       15.71 ±  6%     -26.6%      11.53 ± 22%  sched_debug.cpu.cpu_load[1].avg
>>>       16.20 ±  8%     -22.9%      12.49 ± 21%  sched_debug.cpu.cpu_load[2].avg
>>>       16.92 ±  7%     -21.2%      13.33 ± 21%  sched_debug.cpu.cpu_load[3].avg
>>>        2650 ±  6%     -15.6%       2238 ±  3%  sched_debug.cpu.curr->pid.avg
>>>        1422 ±  8%     -68.5%     447.42 ± 57%  sched_debug.cpu.curr->pid.min
>>>        7838 ± 23%     -67.6%       2536 ± 81%  sched_debug.cpu.load.min
>>>       86066 ± 14%     -26.3%      63437        sched_debug.cpu.nr_load_updates.min
>>>        3.97 ± 88%     -70.9%       1.15 ± 10%  sched_debug.cpu.nr_running.avg
>>>        0.73 ±  4%     -65.7%       0.25 ± 57%  sched_debug.cpu.nr_running.min
>>>        1126 ± 16%     -27.6%     816.02 ±  9%  sched_debug.cpu.sched_count.stddev
>>>        1468 ± 16%     +31.1%       1925 ±  5%  sched_debug.cpu.sched_goidle.avg
>>>        1115 ± 16%     +37.8%       1538 ±  4%  sched_debug.cpu.sched_goidle.min
>>>        3979 ± 13%     -27.4%       2888 ±  5%  sched_debug.cpu.ttwu_local.max
>>>      348.96 ±  8%     -26.3%     257.16 ± 13%  sched_debug.cpu.ttwu_local.stddev
>>>      115966 ± 12%     -21.3%      91214        sched_debug.cpu_clk
>>>      113505 ± 12%     -21.8%      88773        sched_debug.ktime
>>>      116416 ± 12%     -21.3%      91663        sched_debug.sched_clk
>>>        0.26 ±100%      +0.3        0.57 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>        0.29 ±100%      +0.4        0.66 ±  5%  perf-profile.calltrace.cycles-pp.find_get_entry.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter
>>>        0.67 ± 65%      +0.4        1.11        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
>>>        0.69 ± 65%      +0.5        1.14        perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>>>        1.07 ± 57%      +0.5        1.61 ±  5%  perf-profile.calltrace.cycles-pp.pagecache_get_page.f2fs_write_begin.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>>>        0.79 ± 64%      +0.5        1.33        perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
>>>        0.73 ± 63%      +0.6        1.32 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
>>>        0.81 ± 63%      +0.6        1.43 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
>>>        0.06 ± 58%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.__pagevec_lru_add_fn
>>>        0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.children.cycles-pp.down_write_trylock
>>>        0.06 ± 58%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.__x64_sys_write
>>>        0.07 ± 58%      +0.0        0.11 ±  3%  perf-profile.children.cycles-pp.account_page_dirtied
>>>        0.04 ± 57%      +0.0        0.09 ±  5%  perf-profile.children.cycles-pp.account_page_cleaned
>>>        0.06 ± 58%      +0.0        0.10 ±  7%  perf-profile.children.cycles-pp.free_pcppages_bulk
>>>        0.10 ± 58%      +0.1        0.15 ±  6%  perf-profile.children.cycles-pp.page_mapping
>>>        0.09 ± 57%      +0.1        0.14 ±  7%  perf-profile.children.cycles-pp.__lru_cache_add
>>>        0.10 ± 57%      +0.1        0.15 ±  9%  perf-profile.children.cycles-pp.__might_sleep
>>>        0.12 ± 58%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.set_page_dirty
>>>        0.08 ± 64%      +0.1        0.15 ± 10%  perf-profile.children.cycles-pp.dquot_claim_space_nodirty
>>>        0.06 ± 61%      +0.1        0.13 ±  5%  perf-profile.children.cycles-pp.percpu_counter_add_batch
>>>        0.18 ± 57%      +0.1        0.27 ±  2%  perf-profile.children.cycles-pp.iov_iter_fault_in_readable
>>>        0.17 ± 57%      +0.1        0.26 ±  2%  perf-profile.children.cycles-pp.__set_page_dirty_nobuffers
>>>        0.09 ± 57%      +0.1        0.18 ± 27%  perf-profile.children.cycles-pp.free_unref_page_list
>>>        0.16 ± 58%      +0.1        0.30 ± 18%  perf-profile.children.cycles-pp.__pagevec_release
>>>        0.30 ± 57%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.add_to_page_cache_lru
>>>        0.17 ± 58%      +0.1        0.31 ± 16%  perf-profile.children.cycles-pp.release_pages
>>>        0.29 ± 58%      +0.2        0.45 ±  7%  perf-profile.children.cycles-pp.selinux_file_permission
>>>        0.38 ± 57%      +0.2        0.58 ±  6%  perf-profile.children.cycles-pp.security_file_permission
>>>        0.78 ± 57%      +0.3        1.12        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>>>        0.80 ± 57%      +0.3        1.15        perf-profile.children.cycles-pp.copyin
>>>        0.92 ± 57%      +0.4        1.34        perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
>>>        0.98 ± 54%      +0.5        1.43 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64
>>>        0.98 ± 53%      +0.5        1.50 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>>>        1.64 ± 57%      +0.8        2.45 ±  5%  perf-profile.children.cycles-pp.pagecache_get_page
>>>        0.04 ± 57%      +0.0        0.06        perf-profile.self.cycles-pp.__pagevec_lru_add_fn
>>>        0.04 ± 58%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp.release_pages
>>>        0.05 ± 58%      +0.0        0.08 ± 15%  perf-profile.self.cycles-pp._cond_resched
>>>        0.04 ± 58%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.ksys_write
>>>        0.05 ± 58%      +0.0        0.09 ± 13%  perf-profile.self.cycles-pp.down_write_trylock
>>>        0.09 ± 58%      +0.1        0.14 ±  9%  perf-profile.self.cycles-pp.page_mapping
>>>        0.01 ±173%      +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.__fdget_pos
>>>        0.11 ± 57%      +0.1        0.17 ±  7%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>>>        0.05 ± 59%      +0.1        0.12 ±  5%  perf-profile.self.cycles-pp.percpu_counter_add_batch
>>>        0.12 ± 58%      +0.1        0.19 ±  4%  perf-profile.self.cycles-pp.iov_iter_copy_from_user_atomic
>>>        0.17 ± 57%      +0.1        0.24 ±  4%  perf-profile.self.cycles-pp.generic_perform_write
>>>        0.17 ± 58%      +0.1        0.26 ±  2%  perf-profile.self.cycles-pp.iov_iter_fault_in_readable
>>>        0.19 ± 57%      +0.1        0.30 ±  2%  perf-profile.self.cycles-pp.f2fs_set_data_page_dirty
>>>        0.18 ± 58%      +0.1        0.30 ±  4%  perf-profile.self.cycles-pp.pagecache_get_page
>>>        0.27 ± 57%      +0.1        0.41 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
>>>        0.40 ± 57%      +0.2        0.62 ±  5%  perf-profile.self.cycles-pp.find_get_entry
>>>        0.77 ± 57%      +0.3        1.11        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>>>        0.96 ± 54%      +0.5        1.43 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64
>>>        0.98 ± 53%      +0.5        1.50 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>>>        0.72 ± 59%      +0.5        1.26 ± 10%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
>>>
>>>
>>>                                                                                  
>>>                                    aim7.jobs-per-min
>>>                                                                                  
>>>    114000 +-+----------------------------------------------------------------+
>>>    112000 +-+     O                                                          |
>>>           O    O       O    O    O    O                    O  O O            |
>>>    110000 +-+       O    O     O    O    O              O          O         |
>>>    108000 +-+                                                                |
>>>           |                                 O O  O O  O                      |
>>>    106000 +-+O                                                               |
>>>    104000 +-+                                                                |
>>>    102000 +-+                                                                |
>>>           |                                                                  |
>>>    100000 +-+                                                                |
>>>     98000 +-+                                                                |
>>>           |.. .+..+.+..    .+.. .+.. .+..+..+.+.. .+..+.+..+..+.+..  +..     |
>>>     96000 +-++          .+.    +    +            +                  +   +.+..|
>>>     94000 +-+----------------------------------------------------------------+
>>>                                                                                  
>>>                                                                                                                                                                  
>>>                                 aim7.time.system_time
>>>                                                                                  
>>>    7200 +-+------------------------------------------------------------------+
>>>         |                                                                    |
>>>    7000 +-+         .+..     +..                                 .+..        |
>>>         | .+.     .+    +.. +     .+.     .+.  .+.     .+.     .+      .+.+..|
>>>    6800 +-+  +..+.         +    +.   +..+.   +.   +..+.   +..+.      +.      |
>>>         |                                                                    |
>>>    6600 +-+                                                                  |
>>>         |                                                                    |
>>>    6400 +-+                                                                  |
>>>         |  O                                                                 |
>>>    6200 +-+                                                                  |
>>>         |                                  O O  O O  O                       |
>>>    6000 +-+                  O     O                    O                    |
>>>         O    O     O O  O  O         O  O                    O  O O          |
>>>    5800 +-+-----O---------------O-------------------------O------------------+
>>>                                                                                  
>>>                                                                                                                                                                  
>>>                                aim7.time.elapsed_time
>>>                                                                                  
>>>    205 +-+-------------------------------------------------------------------+
>>>        |                                                                  :: |
>>>    200 +-+                                                               : : |
>>>    195 +-+                                                               :  :|
>>>        |           .+..                                           +..   :   :|
>>>    190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |
>>>    185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |
>>>        |                                                                     |
>>>    180 +-+                                                                   |
>>>    175 +-+                                                                   |
>>>        |  O                                                                  |
>>>    170 +-+                                   O    O  O                       |
>>>    165 +-+                        O       O    O                             |
>>>        O    O     O O  O  O  O O     O O               O  O  O  O O          |
>>>    160 +-+-----O-------------------------------------------------------------+
>>>                                                                                  
>>>                                                                                                                                                                  
>>>                              aim7.time.elapsed_time.max
>>>                                                                                  
>>>    205 +-+-------------------------------------------------------------------+
>>>        |                                                                  :: |
>>>    200 +-+                                                               : : |
>>>    195 +-+                                                               :  :|
>>>        |           .+..                                           +..   :   :|
>>>    190 +-++.     .+    +..  .+.  .+..    .+.. .+..    .+..       +     .+    |
>>>    185 +-+  +..+.         +.   +.    +.+.    +    +..+    +..+..+    +.      |
>>>        |                                                                     |
>>>    180 +-+                                                                   |
>>>    175 +-+                                                                   |
>>>        |  O                                                                  |
>>>    170 +-+                                   O    O  O                       |
>>>    165 +-+                        O       O    O                             |
>>>        O    O     O O  O  O  O O     O O               O  O  O  O O          |
>>>    160 +-+-----O-------------------------------------------------------------+
>>>                                                                                  
>>>                                                                                                                                                                  
>>>                          aim7.time.involuntary_context_switches
>>>                                                                                  
>>>    1.15e+06 +-+--------------------------------------------------------------+
>>>             |                   +..                                        + |
>>>     1.1e+06 +-++     .+.. .+.. +    .+..    .+.  .+     .+..    .+.       : +|
>>>             |.  +  .+    +    +    +     .+.   +.  +  .+    +.+.   +..+   :  |
>>>             |    +.                     +           +.                 + :   |
>>>    1.05e+06 +-+                                                         +    |
>>>             |                                                                |
>>>       1e+06 +-+                                                              |
>>>             |                                                                |
>>>      950000 +-+                                                              |
>>>             |                                          O                     |
>>>             O  O O    O         O    O    O              O         O         |
>>>      900000 +-+          O O  O         O    O O  O O       O O  O           |
>>>             |       O              O                                         |
>>>      850000 +-+--------------------------------------------------------------+
>>>                                                                                  
>>>                                                                                  
>>> [*] bisect-good sample
>>> [O] bisect-bad  sample
>>>
>>> ***************************************************************************************************
>>> lkp-ivb-ep01: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
>>> =========================================================================================
>>> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
>>>    gcc-7/performance/4BRD_12G/f2fs/x86_64-rhel-7.2/3000/RAID1/debian-x86_64-2018-04-03.cgz/lkp-ivb-ep01/disk_rr/aim7
>>>
>>> commit:
>>>    d6c66cd19e ("f2fs: fix count of seg_freed to make sec_freed correct")
>>>    089842de57 ("f2fs: remove codes of unused wio_mutex")
>>>
>>> d6c66cd19ef322fe 089842de5750f434aa016eb23f
>>> ---------------- --------------------------
>>>         fail:runs  %reproduction    fail:runs
>>>             |             |             |
>>>             :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
>>>             :4           25%           1:4     kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#]
>>>             :4           25%           1:4     kmsg.IP-Config:Reopening_network_devices
>>>           %stddev     %change         %stddev
>>>               \          |                \
>>>      102582            +8.8%     111626        aim7.jobs-per-min
>>>      176.57            -8.5%     161.64        aim7.time.elapsed_time
>>>      176.57            -8.5%     161.64        aim7.time.elapsed_time.max
>>>     1060618           -12.5%     927723        aim7.time.involuntary_context_switches
>>>        6408            -8.9%       5839        aim7.time.system_time
>>>      785554            +4.5%     820987        aim7.time.voluntary_context_switches
>>>     1077477            -9.5%     975130 ±  2%  softirqs.RCU
>>>      184.77 ±  6%     +41.2%     260.90 ± 11%  iostat.md0.w/s
>>>        6609 ±  2%      +9.6%       7246        iostat.md0.wkB/s
>>>        0.00 ± 94%      +0.0        0.02 ± 28%  mpstat.cpu.soft%
>>>        1.89 ±  4%      +0.3        2.15 ±  3%  mpstat.cpu.usr%
>>>        6546 ± 19%     -49.1%       3328 ± 63%  numa-numastat.node0.other_node
>>>        1470 ± 86%    +222.9%       4749 ± 45%  numa-numastat.node1.other_node
>>>      959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.active_objs
>>>      959.75 ±  8%     +16.8%       1120 ±  7%  slabinfo.UNIX.num_objs
>>>       38.35            +3.2%      39.57 ±  2%  turbostat.RAMWatt
>>>        8800 ±  2%     -10.7%       7855 ±  3%  turbostat.SMI
>>>      103925 ± 27%     -59.5%      42134 ± 61%  numa-meminfo.node0.AnonHugePages
>>>       14267 ± 61%     -54.9%       6430 ± 76%  numa-meminfo.node0.Inactive(anon)
>>>       52220 ± 18%    +104.0%     106522 ± 40%  numa-meminfo.node1.AnonHugePages
>>>        6614 ±  2%      +9.6%       7248        vmstat.io.bo
>>>      316.00 ±  2%     -15.4%     267.25 ±  8%  vmstat.procs.r
>>>       12256 ±  2%      +6.9%      13098        vmstat.system.cs
>>>        2852 ±  3%     +12.5%       3208 ±  3%  numa-vmstat.node0.nr_active_file
>>>        3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_inactive_anon
>>>        2852 ±  3%     +12.4%       3207 ±  3%  numa-vmstat.node0.nr_zone_active_file
>>>        3566 ± 61%     -54.9%       1607 ± 76%  numa-vmstat.node0.nr_zone_inactive_anon
>>>       95337            +2.3%      97499        proc-vmstat.nr_active_anon
>>>        5746 ±  2%      +4.3%       5990        proc-vmstat.nr_active_file
>>>       89732            +2.0%      91532        proc-vmstat.nr_anon_pages
>>>       95337            +2.3%      97499        proc-vmstat.nr_zone_active_anon
>>>        5746 ±  2%      +4.3%       5990        proc-vmstat.nr_zone_active_file
>>>       10407 ±  4%     -49.3%       5274 ± 52%  proc-vmstat.numa_hint_faults_local
>>>      615058            -6.0%     578344 ±  2%  proc-vmstat.pgfault
>>>   1.187e+12            -8.7%  1.084e+12        perf-stat.branch-instructions
>>>        0.65 ±  3%      +0.0        0.70 ±  2%  perf-stat.branch-miss-rate%
>>>     2219706            -2.5%    2164425        perf-stat.context-switches
>>>   2.071e+13           -10.0%  1.864e+13        perf-stat.cpu-cycles
>>>      641874            -2.7%     624703        perf-stat.cpu-migrations
>>>   1.408e+12            -7.3%  1.305e+12        perf-stat.dTLB-loads
>>>    39182891 ±  4%    +796.4%  3.512e+08 ±150%  perf-stat.iTLB-loads
>>>   5.184e+12            -8.0%   4.77e+12        perf-stat.instructions
>>>        5035 ±  2%     -14.1%       4325 ± 13%  perf-stat.instructions-per-iTLB-miss
>>>      604219            -6.2%     566725        perf-stat.minor-faults
>>>   4.962e+09            -2.7%  4.827e+09        perf-stat.node-stores
>>>      604097            -6.2%     566730        perf-stat.page-faults
>>>      110.81 ± 13%     +25.7%     139.25 ±  8%  sched_debug.cfs_rq:/.load_avg.stddev
>>>       12.76 ± 74%    +114.6%      27.39 ± 38%  sched_debug.cfs_rq:/.removed.load_avg.avg
>>>       54.23 ± 62%     +66.2%      90.10 ± 17%  sched_debug.cfs_rq:/.removed.load_avg.stddev
>>>      585.18 ± 74%    +115.8%       1262 ± 38%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
>>>        2489 ± 62%     +66.9%       4153 ± 17%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
>>>       11909 ± 10%     +44.7%      17229 ± 18%  sched_debug.cfs_rq:/.runnable_weight.avg
>>>        1401 ±  2%     +36.5%       1913 ±  5%  sched_debug.cpu.sched_goidle.avg
>>>        2350 ±  2%     +21.9%       2863 ±  5%  sched_debug.cpu.sched_goidle.max
>>>        1082 ±  5%     +39.2%       1506 ±  4%  sched_debug.cpu.sched_goidle.min
>>>        7327           +14.7%       8401 ±  2%  sched_debug.cpu.ttwu_count.avg
>>>        5719 ±  3%     +18.3%       6767 ±  2%  sched_debug.cpu.ttwu_count.min
>>>        1518 ±  3%     +15.6%       1755 ±  3%  sched_debug.cpu.ttwu_local.min
>>>       88.70            -1.0       87.65        perf-profile.calltrace.cycles-pp.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write
>>>       54.51            -1.0       53.48        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write
>>>       54.55            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter
>>>       56.32            -1.0       55.30        perf-profile.calltrace.cycles-pp.f2fs_write_end.generic_perform_write.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write
>>>       54.54            -1.0       53.53        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_write_end.generic_perform_write.__generic_file_write_iter
>>>       88.93            -1.0       87.96        perf-profile.calltrace.cycles-pp.__generic_file_write_iter.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write
>>>       89.94            -0.8       89.14        perf-profile.calltrace.cycles-pp.f2fs_file_write_iter.__vfs_write.vfs_write.ksys_write.do_syscall_64
>>>       90.01            -0.8       89.26        perf-profile.calltrace.cycles-pp.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>       90.72            -0.7       90.00        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>       90.59            -0.7       89.87        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>       13.32            -0.3       13.01        perf-profile.calltrace.cycles-pp._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block
>>>       13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block
>>>       13.33            -0.3       13.01        perf-profile.calltrace.cycles-pp.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks.f2fs_reserve_block.f2fs_get_block.f2fs_write_begin
>>>       13.26            -0.3       12.94        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.f2fs_inode_dirtied.f2fs_mark_inode_dirty_sync.f2fs_reserve_new_blocks
>>>        1.30 ±  2%      +0.1        1.40 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
>>>        2.20 ±  6%      +0.2        2.40 ±  3%  perf-profile.calltrace.cycles-pp.generic_file_read_iter.__vfs_read.vfs_read.ksys_read.do_syscall_64
>>>        2.28 ±  5%      +0.2        2.52 ±  5%  perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>        2.85 ±  4%      +0.3        3.16 ±  5%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>        2.97 ±  4%      +0.3        3.31 ±  5%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>       88.74            -1.0       87.70        perf-profile.children.cycles-pp.generic_perform_write
>>>       56.33            -1.0       55.31        perf-profile.children.cycles-pp.f2fs_write_end
>>>       88.95            -1.0       87.98        perf-profile.children.cycles-pp.__generic_file_write_iter
>>>       89.95            -0.8       89.15        perf-profile.children.cycles-pp.f2fs_file_write_iter
>>>       90.03            -0.8       89.28        perf-profile.children.cycles-pp.__vfs_write
>>>       90.73            -0.7       90.02        perf-profile.children.cycles-pp.ksys_write
>>>       90.60            -0.7       89.89        perf-profile.children.cycles-pp.vfs_write
>>>        0.22 ±  5%      -0.1        0.17 ± 19%  perf-profile.children.cycles-pp.f2fs_invalidate_page
>>>        0.08 ± 10%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.page_mapping
>>>        0.09            +0.0        0.11 ±  7%  perf-profile.children.cycles-pp.__cancel_dirty_page
>>>        0.06 ±  6%      +0.0        0.09 ± 28%  perf-profile.children.cycles-pp.read_node_page
>>>        0.10 ±  4%      +0.0        0.14 ± 14%  perf-profile.children.cycles-pp.current_time
>>>        0.07 ± 12%      +0.0        0.11 ±  9%  perf-profile.children.cycles-pp.percpu_counter_add_batch
>>>        0.00            +0.1        0.05        perf-profile.children.cycles-pp.__x64_sys_write
>>>        0.38 ±  3%      +0.1        0.43 ±  5%  perf-profile.children.cycles-pp.selinux_file_permission
>>>        0.55 ±  4%      +0.1        0.61 ±  4%  perf-profile.children.cycles-pp.security_file_permission
>>>        1.30            +0.1        1.40 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
>>>        2.21 ±  6%      +0.2        2.41 ±  3%  perf-profile.children.cycles-pp.generic_file_read_iter
>>>        2.29 ±  6%      +0.2        2.53 ±  5%  perf-profile.children.cycles-pp.__vfs_read
>>>        2.86 ±  4%      +0.3        3.18 ±  5%  perf-profile.children.cycles-pp.vfs_read
>>>        2.99 ±  4%      +0.3        3.32 ±  5%  perf-profile.children.cycles-pp.ksys_read
>>>        0.37            -0.1        0.24 ± 23%  perf-profile.self.cycles-pp.__get_node_page
>>>        0.21 ±  3%      -0.1        0.15 ± 16%  perf-profile.self.cycles-pp.f2fs_invalidate_page
>>>        0.07 ±  5%      +0.0        0.09 ± 11%  perf-profile.self.cycles-pp.page_mapping
>>>        0.06 ± 11%      +0.0        0.08 ±  8%  perf-profile.self.cycles-pp.vfs_read
>>>        0.07 ±  7%      +0.0        0.10 ± 21%  perf-profile.self.cycles-pp.__generic_file_write_iter
>>>        0.06 ± 14%      +0.0        0.10 ± 10%  perf-profile.self.cycles-pp.percpu_counter_add_batch
>>>        0.20 ± 11%      +0.0        0.25 ± 12%  perf-profile.self.cycles-pp.selinux_file_permission
>>>        0.05 ±  8%      +0.1        0.11 ± 52%  perf-profile.self.cycles-pp.__vfs_read
>>>        0.33 ±  9%      +0.1        0.41 ±  9%  perf-profile.self.cycles-pp.f2fs_lookup_extent_cache
>>>        1.30            +0.1        1.40 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
>>>
>>> Disclaimer:
>>> Results have been estimated based on internal Intel analysis and are provided
>>> for informational purposes only. Any difference in system hardware or software
>>> design or configuration may affect actual performance.
>>>
>>>
>>> Thanks,
>>> Rong Chen
>>>
>> _______________________________________________
>> LKP mailing list
>> LKP@...ts.01.org
>> https://lists.01.org/mailman/listinfo/lkp
> 
> 
> .
> 
