Date:	Tue, 21 Apr 2015 11:00:06 -0700
From:	Jaegeuk Kim <jaegeuk@...nel.org>
To:	Yuanhan Liu <yuanhan.liu@...ux.intel.com>
Cc:	lkp@...org, LKML <linux-kernel@...r.kernel.org>
Subject:	Re: performance changes on 78373b73: -46.6% fsmark.files_per_sec,
 and a few more

Thank you for the report.

This is interesting to me as well.
My guess is that f2fs_submit_merged_bio(sbi, DATA, WRITE) in
f2fs_write_data_pages was called too many times from multiple threads
at the same time, which results in bio splits.
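
To make this concrete, here is a minimal sketch of the call pattern I
have in mind (simplified and hypothetical, not the actual f2fs source;
only the names mirror f2fs):

    static int f2fs_write_data_pages(struct address_space *mapping,
                                     struct writeback_control *wbc)
    {
            struct f2fs_sb_info *sbi = F2FS_M_SB(mapping);
            int ret;

            /* No longer serialized by a writepages mutex. */
            ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);

            /*
             * Called concurrently from many threads, this can flush
             * the shared merged bio while other threads are still
             * adding pages to it, so one large sequential bio gets
             * split into many small ones.
             */
            f2fs_submit_merged_bio(sbi, DATA, WRITE);
            return ret;
    }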

I'm not sure yet, so for now I'll revert this patch. Once I finish
analyzing this issue, I'll repost the patch.

Thank you,

On Tue, Apr 21, 2015 at 12:34:08PM +0800, Yuanhan Liu wrote:
> FYI, we found changes in `fsmark.files_per_sec' caused by 78373b7319abdf15050af5b1632c4c8b8b398f33:
> 
>     > commit 78373b7319abdf15050af5b1632c4c8b8b398f33
>     > Author:     Jaegeuk Kim <jaegeuk@...nel.org>
>     > AuthorDate: Fri Mar 13 21:44:36 2015 -0700
>     > Commit:     Jaegeuk Kim <jaegeuk@...nel.org>
>     > CommitDate: Fri Apr 10 15:08:45 2015 -0700
>     > 
>     >     f2fs: enhance multi-threads performance
> 
> 3402e87cfb5e762f9c95071bf4a2ad65fd9392a2     78373b7319abdf15050af5b1632c4c8b8b398f33
> ----------------------------------------     ----------------------------------------
> run time(m)     metric_value     ±stddev     run time(m)     metric_value     ±stddev     change   testbox/benchmark/testcase-params
> --- ------  ----------------------------     --- ------  ----------------------------     -------- ------------------------------
> 3   0.3             |490.800|    ±5.7        3   0.5             |262.067|    ±0.4          -46.6% ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose
> 3   0.3             |468.367|    ±3.5        3   0.5             |264.467|    ±0.2          -43.5% ivb44/fsmark/1x-64t-9BRD_6G-RAID0-f2fs-4M-30G-fsyncBeforeClose
> 3   0.6             |211.867|    ±0.7        3   0.7             |191.067|    ±0.5           -9.8% ivb44/fsmark/1x-64t-4BRD_12G-RAID5-f2fs-4M-30G-fsyncBeforeClose
> 
> NOTE: here is some more info about those test parameters, to help you
>       understand what the testcase does (an example invocation follows
>       this list):
> 
>       1x: 'x' means iterations (loops), corresponding to the '-L' option of fsmark
> 
>       1t, 64t: 't' means the number of threads
> 
>       4M: the size of a single file, corresponding to the '-s' option of fsmark
>       40G, 30G, 120G: the total test size
> 
>       4BRD_12G: BRD is the ramdisk driver, where '4' means 4 ramdisks and '12G'
>                 means the size of each ramdisk, so 48G in total. We then built
>                 a RAID array on those ramdisks.
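> 
> For reference, a plausible fs_mark invocation matching the
> 1x-64t-4M-fsyncBeforeClose parameters above could look like this
> (illustrative only; the real command line is generated by the lkp
> framework, and the mount point here is hypothetical):
> 
>     fs_mark -d /mnt/f2fs -L 1 -t 64 -s 4194304 -S 1
> 
> where '-S 1' selects the fsync-before-close sync method.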
> 
> 
> The change is a bit interesting, since you already stated clearly that
> this patch is meant as a performance gain. The patch itself is simple,
> too: it removes a mutex lock. So the only plausible cause I can think
> of, without digging too deeply, is that removing this lock reduces
> sleep time and lets more processes run, but somehow increases context
> switches and CPU usage somewhere along the way. I guess this is what
> the following changes are trying to tell us (a rough sketch of the
> locking change in question follows these numbers):
> 
>      29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
>        302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
>      61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time
> 
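> As a rough, hypothetical sketch of the locking change I mean (the
> names follow f2fs conventions, but this is not the actual diff):
> 
>     /* Before 78373b73: data writeback serialized across threads. */
>     mutex_lock(&sbi->writepages);
>     ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
>     f2fs_submit_merged_bio(sbi, DATA, WRITE);
>     mutex_unlock(&sbi->writepages);
> 
>     /*
>      * After: no mutex; threads proceed in parallel, which the numbers
>      * above suggest somehow increases context switches and CPU usage.
>      */
>     ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
>     f2fs_submit_merged_bio(sbi, DATA, WRITE);
> 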
> 
> FYI, here I list all the detailed changes for the most outstanding case:
> 
> 3   0.3             |490.800|    ±5.7        3   0.5             |262.067|    ±0.4          -46.6% ivb44/fsmark/1x-64t-4BRD_12G-RAID0-f2fs-4M-30G-fsyncBeforeClose
> 
> 3402e87cfb5e762f  78373b7319abdf15050af5b163  
> ----------------  --------------------------  
>          %stddev     %change         %stddev
>              \          |                \  
>      29708 ±  2%   +5720.0%    1729051 ±  1%  fsmark.time.voluntary_context_switches
>      61.05 ±  0%    +214.0%     191.70 ±  0%  fsmark.time.system_time
>        302 ±  0%    +113.8%        647 ±  0%  fsmark.time.percent_of_cpu_this_job_got
>      10476 ±  0%     +95.4%      20467 ±  5%  fsmark.time.minor_page_faults
>        490 ±  5%     -46.6%        262 ±  0%  fsmark.files_per_sec
>      20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time
>      20.21 ±  0%     +46.7%      29.65 ±  0%  fsmark.time.elapsed_time.max
>     226379 ±  0%     +32.5%     299882 ±  0%  fsmark.app_overhead
>          0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.numa_pages_migrated
>        209 ± 26%   +3272.3%       7059 ±  3%  cpuidle.C1E-IVT.usage
>        228 ± 42%    +686.7%       1799 ± 14%  numa-meminfo.node0.Writeback
>      14633 ±  5%   +7573.2%    1122849 ±  1%  cpuidle.C1-IVT.usage
>          0 ±  0%      +Inf%       1045 ±  2%  proc-vmstat.pgmigrate_success
>      29708 ±  2%   +5720.0%    1729051 ±  1%  time.voluntary_context_switches
>      55663 ±  0%    +776.9%     488081 ±  0%  cpuidle.C6-IVT.usage
>         56 ± 42%    +718.8%        464 ± 11%  numa-vmstat.node0.nr_writeback
>        535 ± 29%    +334.4%       2325 ± 10%  meminfo.Writeback
>        129 ± 30%    +295.6%        511 ±  4%  proc-vmstat.nr_writeback
>      59.25 ±  5%     -74.2%      15.26 ±  3%  turbostat.CPU%c6
>       2.58 ±  8%     -74.5%       0.66 ± 11%  turbostat.Pkg%pc2
>  1.551e+08 ± 14%    +233.4%  5.171e+08 ±  4%  cpuidle.C1-IVT.time
>      32564 ± 24%    +208.1%     100330 ±  5%  softirqs.RCU
>      61.05 ±  0%    +214.0%     191.70 ±  0%  time.system_time
>         60 ± 32%    +165.7%        160 ± 16%  numa-vmstat.node1.nr_writeback
>          2 ±  0%    +200.0%          6 ±  0%  vmstat.procs.r
>       3057 ±  2%    +166.1%       8136 ± 22%  numa-vmstat.node0.nr_mapped
>      12240 ±  2%    +165.9%      32547 ± 22%  numa-meminfo.node0.Mapped
>       6324 ±  3%    +148.4%      15709 ±  0%  proc-vmstat.nr_mapped
>      25318 ±  3%    +148.6%      62931 ±  0%  meminfo.Mapped
>        302 ±  0%    +113.8%        647 ±  0%  time.percent_of_cpu_this_job_got
>      34.39 ±  8%    +102.9%      69.79 ±  0%  turbostat.CPU%c1
>       3247 ± 32%    +124.9%       7303 ± 14%  numa-vmstat.node0.nr_active_anon
>      12994 ± 32%    +124.8%      29215 ± 14%  numa-meminfo.node0.Active(anon)
>       3256 ±  4%    +132.5%       7569 ± 23%  numa-vmstat.node1.nr_mapped
>       3252 ± 32%    +124.6%       7304 ± 14%  numa-vmstat.node0.nr_anon_pages
>      13036 ±  4%    +132.3%      30279 ± 23%  numa-meminfo.node1.Mapped
>      13033 ± 31%    +124.1%      29214 ± 14%  numa-meminfo.node0.AnonPages
>      10476 ±  0%     +95.4%      20467 ±  5%  time.minor_page_faults
>     121752 ±  7%     -35.8%      78122 ±  2%  numa-meminfo.node0.Dirty
>      30381 ±  7%     -36.0%      19456 ±  2%  numa-vmstat.node0.nr_dirty
>     113996 ±  4%     +64.4%     187416 ±  0%  proc-vmstat.pgfault
>      59301 ±  1%     +60.9%      95414 ±  1%  softirqs.SCHED
>       1.92 ±  1%     -31.9%       1.31 ±  4%  turbostat.Pkg%pc6
>         16 ± 35%     +54.2%         24 ± 10%  cpuidle.POLL.usage
>      20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time.max
>      20.21 ±  0%     +46.7%      29.65 ±  0%  time.elapsed_time
>       7114 ±  2%     +46.3%      10405 ±  0%  proc-vmstat.nr_active_anon
>      28499 ±  2%     +46.2%      41659 ±  0%  meminfo.Active(anon)
>      28617 ±  2%     +45.7%      41692 ±  0%  meminfo.AnonPages
>       7138 ±  2%     +45.7%      10403 ±  0%  proc-vmstat.nr_anon_pages
>      95342 ±  2%     +44.9%     138177 ±  6%  numa-meminfo.node0.Active
>      82347 ±  2%     +32.3%     108962 ±  9%  numa-meminfo.node0.Active(file)
>      20586 ±  2%     +32.3%      27240 ±  9%  numa-vmstat.node0.nr_active_file
>     233023 ±  3%     -30.7%     161486 ±  0%  meminfo.Dirty
>       0.19 ±  2%     +36.2%       0.26 ±  3%  time.user_time
>      57900 ±  3%     -30.5%      40219 ±  0%  proc-vmstat.nr_dirty
>     108006 ±  8%     -24.6%      81460 ±  2%  numa-meminfo.node1.Dirty
>      26896 ±  8%     -24.4%      20331 ±  2%  numa-vmstat.node1.nr_dirty
>     219977 ±  0%     +32.9%     292456 ±  1%  softirqs.TIMER
>     197382 ±  1%     +32.6%     261733 ±  0%  meminfo.Active
>      42210 ±  1%     +30.2%      54947 ±  0%  proc-vmstat.nr_active_file
>     168883 ±  1%     +30.3%     220073 ±  0%  meminfo.Active(file)
>        353 ±  4%     +21.7%        430 ±  3%  numa-vmstat.node0.nr_page_table_pages
>       1425 ±  4%     +20.2%       1714 ±  4%  numa-meminfo.node0.PageTables
>      86494 ±  2%     +28.0%     110745 ± 10%  numa-meminfo.node1.Active(file)
>      21621 ±  2%     +28.0%      27683 ± 10%  numa-vmstat.node1.nr_active_file
>      46.00 ±  4%     +20.8%      55.56 ±  3%  uptime.boot
>     101939 ±  4%     +20.8%     123117 ±  8%  numa-meminfo.node1.Active
>       1412 ±  9%     +10.8%       1565 ±  8%  numa-vmstat.node0.nr_alloc_batch
>      90644 ±  6%     +22.4%     110984 ±  0%  proc-vmstat.pgfree
>       2084 ±  4%     +15.7%       2412 ±  3%  uptime.idle
>      18666 ±  1%      +9.2%      20390 ±  4%  slabinfo.anon_vma.num_objs
>      18663 ±  1%      +9.3%      20390 ±  4%  slabinfo.anon_vma.active_objs
>     435.51 ±  0%     -89.9%      43.95 ±  0%  iostat.md0.avgrq-sz
>       6415 ±  0%    +875.4%      62579 ±  1%  iostat.md0.w/s
>       5595 ±  1%   +1883.4%     110979 ±  1%  vmstat.system.cs
>     280678 ±  0%    +315.2%    1165516 ±  0%  iostat.md0.avgqu-sz
>       6.31 ±  0%    +123.9%      14.14 ±  0%  turbostat.%Busy
>        195 ±  0%    +116.4%        423 ±  0%  turbostat.Avg_MHz
>      26035 ±  0%     +76.8%      46037 ±  0%  vmstat.system.in
>    1370424 ±  0%     -28.1%     985009 ±  0%  vmstat.io.bo
>    1459306 ±  0%     -28.1%    1048622 ±  0%  iostat.md0.wkB/s
>      72.28 ±  0%     +29.6%      93.70 ±  0%  turbostat.CorWatt
>        102 ±  0%     +21.4%        124 ±  0%  turbostat.PkgWatt
>       6.66 ±  2%      +8.1%       7.19 ±  0%  turbostat.RAMWatt