lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150421043408.GH8084@yliu-dev.sh.intel.com>
Date:	Tue, 21 Apr 2015 12:34:08 +0800
From:	Yuanhan Liu <yuanhan.liu@...ux.intel.com>
To:	Jaegeuk Kim <jaegeuk@...nel.org>
Cc:	lkp@...org, Yuanhan Liu <yuanhan.liu@...ux.intel.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: performance changes on 78373b73:  -46.6% fsmark.files_per_sec, and
 few more

FYI, we found changes on `fsmark.files_per_sec' by 78373b7319abdf15050af5b1632c4c8b8b398f33:

    > commit 78373b7319abdf15050af5b1632c4c8b8b398f33
    > Author:     Jaegeuk Kim <jaegeuk@...nel.org>
    > AuthorDate: Fri Mar 13 21:44:36 2015 -0700
    > Commit:     Jaegeuk Kim <jaegeuk@...nel.org>
    > CommitDate: Fri Apr 10 15:08:45 2015 -0700
    > 
    >     f2fs: enhance multi-threads performance

3402e87cfb5e762f9c95071bf4a2ad65fd9392a2     78373b7319abdf15050af5b1632c4c8b8b398f33
----------------------------------------     ----------------------------------------
run time(m)     metric_value     ±stddev     run time(m)     metric_value     ±stddev     change   testbox/benchmark/testcase-params
--- ------  ----------------------------     --- ------  ----------------------------     -------- ------------------------------
3   0.3             |490.800|    ±5.7        3   0.5             |262.067|    ±0.4          -46.6% ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose
3   0.3             |468.367|    ±3.5        3   0.5             |264.467|    ±0.2          -43.5% ivb44/fsmark/1x-64t-9BRD_6G-RAID0-f2fs-4M-30G-fsyncBeforeClose
3   0.6             |211.867|    ±0.7        3   0.7             |191.067|    ±0.5           -9.8% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose

NOTE: here are some more info about those test parameters for you to
      know what the testcase does better:

      1x: where 'x' means iterations or loop, corresponding to the 'L' option of fsmark

      1t, 64t: where 't' means thread

      4M: means the single file size, corresponding to the '-s' option of fsmark
      40G, 30G, 120G: means the total test size

      4BRD_12G: BRD is the ramdisk, where '4' means 4 ramdisk, and where '12G' means
                the size of one ramdisk. So, it would be 48G in total. And we made a
                raid on those ramdisk


The change is a bit interesting as you already stated it clear that this
patch is for performance gain. The patch itself is clear, too: remove a
mutex lock. So the only reasonable cause, without too much dig, I can think
of would be the remove of this lock reduces sleep time, and brings more
process to be able run, but somehow increases the context switches and cpu
usage in the meantime at somewhere. I guess this is what the following
changes are trying to tell us:

     29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
       302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time


FYI, Here I listed all changes for the outstanding change:

3   0.3             |490.800|    ±5.7        3   0.5             |262.067|    ±0.4          -46.6% ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose

3402e87cfb5e762f  78373b7319abdf15050af5b163  
----------------  --------------------------  
         %stddev     %change         %stddev
             \          |                \  
     29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
     61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time
       302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
     10476 ±  0%     +95.4%      20467 ±  5%  fsmark.time.minor_page_faults
       490 ±  5%     -46.6%        262 ±  0%  fsmark.files_per_sec
     20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time
     20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time.max
    226379 ±  0%     +32.5%     299882 ±  0%  fsmark.app_overhead
         0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.numa_pages_migrated
       209 ± 26%   +3272.3%       7059 ±  3%  cpuidle.C1E-IVT.usage
       228 ± 42%    +686.7%       1799 ± 14%  numa-meminfo.node0.Writeback
     14633 ±  5%   +7573.2%    1122849 ±  1%  cpuidle.C1-IVT.usage
         0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.pgmigrate_success
     29708 ±  2%   +5720.0%    1729051 ±  1%  time.voluntary_context_switches
     55663 ±  0%    +776.9%     488081 ±  0%  cpuidle.C6-IVT.usage
        56 ± 42%    +718.8%        464 ± 11%  numa-vmstat.node0.nr_writeback
       535 ± 29%    +334.4%       2325 ± 10%  meminfo.Writeback
       129 ± 30%    +295.6%        511 ±  4%  proc-vmstat.nr_writeback
     59.25 ±  5%     -74.2%      15.26 ±  3%  turbostat.CPU%c6
      2.58 ±  8%     -74.5%       0.66 ± 11%  turbostat.Pkg%pc2
 1.551e+08 ± 14%    +233.4%  5.171e+08 ±  4%  cpuidle.C1-IVT.time
     32564 ± 24%    +208.1%     100330 ±  5%  softirqs.RCU
     61.05 ±  0%    +214.0%     191.70 ±  0%  time.system_time
        60 ± 32%    +165.7%        160 ± 16%  numa-vmstat.node1.nr_writeback
         2 ±  0%    +200.0%          6 ±  0%  vmstat.procs.r
      3057 ±  2%    +166.1%       8136 ± 22%  numa-vmstat.node0.nr_mapped
     12240 ±  2%    +165.9%      32547 ± 22%  numa-meminfo.node0.Mapped
      6324 ±  3%    +148.4%      15709 ±  0%  proc-vmstat.nr_mapped
     25318 ±  3%    +148.6%      62931 ±  0%  meminfo.Mapped
       302 ±  0%    +113.8%        647 ±  0%  time.percent_of_cpu_this_job_got
     34.39 ±  8%    +102.9%      69.79 ±  0%  turbostat.CPU%c1
      3247 ± 32%    +124.9%       7303 ± 14%  numa-vmstat.node0.nr_active_anon
     12994 ± 32%    +124.8%      29215 ± 14%  numa-meminfo.node0.Active(anon)
      3256 ±  4%    +132.5%       7569 ± 23%  numa-vmstat.node1.nr_mapped
      3252 ± 32%    +124.6%       7304 ± 14%  numa-vmstat.node0.nr_anon_pages
     13036 ±  4%    +132.3%      30279 ± 23%  numa-meminfo.node1.Mapped
     13033 ± 31%    +124.1%      29214 ± 14%  numa-meminfo.node0.AnonPages
     10476 ±  0%     +95.4%      20467 ±  5%  time.minor_page_faults
    121752 ±  7%     -35.8%      78122 ±  2%  numa-meminfo.node0.Dirty
     30381 ±  7%     -36.0%      19456 ±  2%  numa-vmstat.node0.nr_dirty
    113996 ±  4%     +64.4%     187416 ±  0%  proc-vmstat.pgfault
     59301 ±  1%     +60.9%      95414 ±  1%  softirqs.SCHED
      1.92 ±  1%     -31.9%       1.31 ±  4%  turbostat.Pkg%pc6
        16 ± 35%     +54.2%         24 ± 10%  cpuidle.POLL.usage
     20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time.max
     20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time
      7114 ±  2%     +46.3%      10405 ±  0%  proc-vmstat.nr_active_anon
     28499 ±  2%     +46.2%      41659 ±  0%  meminfo.Active(anon)
     28617 ±  2%     +45.7%      41692 ±  0%  meminfo.AnonPages
      7138 ±  2%     +45.7%      10403 ±  0%  proc-vmstat.nr_anon_pages
     95342 ±  2%     +44.9%     138177 ±  6%  numa-meminfo.node0.Active
     82347 ±  2%     +32.3%     108962 ±  9%  numa-meminfo.node0.Active(file)
     20586 ±  2%     +32.3%      27240 ±  9%  numa-vmstat.node0.nr_active_file
    233023 ±  3%     -30.7%     161486 ±  0%  meminfo.Dirty
      0.19 ±  2%     +36.2%       0.26 ±  3%  time.user_time
     57900 ±  3%     -30.5%      40219 ±  0%  proc-vmstat.nr_dirty
    108006 ±  8%     -24.6%      81460 ±  2%  numa-meminfo.node1.Dirty
     26896 ±  8%     -24.4%      20331 ±  2%  numa-vmstat.node1.nr_dirty
    219977 ±  0%     +32.9%     292456 ±  1%  softirqs.TIMER
    197382 ±  1%     +32.6%     261733 ±  0%  meminfo.Active
     42210 ±  1%     +30.2%      54947 ±  0%  proc-vmstat.nr_active_file
    168883 ±  1%     +30.3%     220073 ±  0%  meminfo.Active(file)
       353 ±  4%     +21.7%        430 ±  3%  numa-vmstat.node0.nr_page_table_pages
      1425 ±  4%     +20.2%       1714 ±  4%  numa-meminfo.node0.PageTables
     86494 ±  2%     +28.0%     110745 ± 10%  numa-meminfo.node1.Active(file)
     21621 ±  2%     +28.0%      27683 ± 10%  numa-vmstat.node1.nr_active_file
     46.00 ±  4%     +20.8%      55.56 ±  3%  uptime.boot
    101939 ±  4%     +20.8%     123117 ±  8%  numa-meminfo.node1.Active
      1412 ±  9%     +10.8%       1565 ±  8%  numa-vmstat.node0.nr_alloc_batch
     90644 ±  6%     +22.4%     110984 ±  0%  proc-vmstat.pgfree
      2084 ±  4%     +15.7%       2412 ±  3%  uptime.idle
     18666 ±  1%      +9.2%      20390 ±  4%  slabinfo.anon_vma.num_objs
     18663 ±  1%      +9.3%      20390 ±  4%  slabinfo.anon_vma.active_objs
    435.51 ±  0%     -89.9%      43.95 ±  0%  iostat.md0.avgrq-sz
      6415 ±  0%    +875.4%      62579 ±  1%  iostat.md0.w/s
      5595 ±  1%   +1883.4%     110979 ±  1%  vmstat.system.cs
    280678 ±  0%    +315.2%    1165516 ±  0%  iostat.md0.avgqu-sz
      6.31 ±  0%    +123.9%      14.14 ±  0%  turbostat.%Busy
       195 ±  0%    +116.4%        423 ±  0%  turbostat.Avg_MHz
     26035 ±  0%     +76.8%      46037 ±  0%  vmstat.system.in
   1370424 ±  0%     -28.1%     985009 ±  0%  vmstat.io.bo
   1459306 ±  0%     -28.1%    1048622 ±  0%  iostat.md0.wkB/s
     72.28 ±  0%     +29.6%      93.70 ±  0%  turbostat.CorWatt
       102 ±  0%     +21.4%        124 ±  0%  turbostat.PkgWatt
      6.66 ±  2%      +8.1%       7.19 ±  0%  turbostat.RAMWatt

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ