lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 17 Mar 2017 10:15:50 +0800
From:   Jaegeuk Kim <jaegeuk@...nel.org>
To:     Ye Xiaolong <xiaolong.ye@...el.com>
Cc:     Chao Yu <yuchao0@...wei.com>,
        Thorsten Leemhuis <regressions@...mhuis.info>, lkp@...org,
        linux-kernel@...r.kernel.org
Subject: Re: [lkp-robot] [f2fs] 4ac912427c: -33.7% aim7.jobs-per-min
 regression

Hi,

On 03/17, Ye Xiaolong wrote:
> On 03/15, Jaegeuk Kim wrote:
> >On 03/14, Ye Xiaolong wrote:
> >> On 03/14, Chao Yu wrote:
> >> >On 2017/3/14 3:22, Jaegeuk Kim wrote:
> >> >> On 03/13, Thorsten Leemhuis wrote:
> >> >>> @Chao Yu/@...geuk Kim: I'm considering to add this to the regressions
> >> >>> report for 4.11; or is there a reason why it shouldn't be considered a
> >> >>> regression? Ciao, Thorsten
> >> >>
> >> >> Hi,
> >> >>
> >> >> I'm planning to submit f2fs updates for 4.11-rcX including a patch which
> >> >> resolves this issue as well, as I expect.
> >> >>
> >> >> https://lkml.org/lkml/2017/3/7/813
> >> >
> >> >Sorry for late repay.
> >> >
> >> >I expect below patches in Jaegeuk's tree could help to recover the performance
> >> >as well
> >> >
> >> >	f2fs: skip scanning free nid bitmap of full NAT blocks
> >> >	f2fs: combine nat_bits and free_nid_bitmap cache
> >>
> >> These 2 patches do help recover the performance back. Details as below.
> >>
> >> Tested-by: Xiaolong Ye <xiaolong.ye@...el.com>
> >>
> >> commit:
> >>   4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
> >>   ced2c7ea8e99b46755a270872cd5ba61c27cffad <= parent commit of 4ac912427c
> >>   c0e39d642e41be12937f4532721fc2538182e264 ("f2fs: combine nat_bits and free_nid_bitmap cache")
> >
> >Thank you for testing them.
> >
> >BTW, I found one missing clear_bit_le conversion from c0e39d642e41b.
> >I updated the original patch and uploaded it into f2fs.git.
> >
> >  d00030cf9cd0bb9 ("f2fs: use __set{__clear}_bit_le")
> >  1382c0f3f9d3f93 ("f2fs: combine nat_bits and free_nid_bitmap cache")
> >
> >Could you please test the above patches from f2fs.git one more time?
> >I need to confirm there-in root-cause for next upstream.
> >
> 
> Here is the comparison results, each commit has been tested for 5 times.
> 
> commit:
>   4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
>   d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
>   1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")

Got it. I'll prepare upstream with them.
Thank you so much.

> 
> 4ac912427c4214d8 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15
> ---------------- -------------------------- --------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>      77863 ±  0%      +2.1%      79485 ±  1%     +50.8%     117404 ±  0%  aim7.jobs-per-min
>     231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time
>     231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time.max
>     896604 ±  0%      -0.8%     889221 ±  3%     -20.2%     715260 ±  1%  aim7.time.involuntary_context_switches
>       2394 ±  1%      +4.6%       2503 ±  1%      +3.7%       2481 ±  2%  aim7.time.maximum_resident_set_size
>       6240 ±  0%      -1.5%       6145 ±  1%     -14.1%       5360 ±  1%  aim7.time.system_time
>    1111357 ±  3%      +1.9%    1132509 ±  2%      -6.2%    1041932 ±  2%  aim7.time.voluntary_context_switches
>    5600256 ±  9%      -4.7%    5339136 ±  8%      -9.4%    5074534 ±  0%  meminfo.DirectMap2M
>      78738 ±  8%     +10.0%      86602 ±  8%     +25.0%      98448 ±  2%  meminfo.Dirty
>      20625 ±  0%     +18.3%      24403 ± 11%     +11.4%      22985 ±  6%  meminfo.Mapped
>     393582 ± 85%     -84.9%      59547 ±  2%     -86.1%      54561 ±  9%  meminfo.Shmem
>       1328 ± 18%    +786.4%      11771 ± 18%    +440.6%       7179 ± 26%  softirqs.NET_RX
>     669152 ±  3%      -0.7%     664708 ±  2%     -11.9%     589420 ±  2%  softirqs.RCU
>     170724 ±  0%      +0.1%     170929 ±  7%     -19.8%     136921 ±  7%  softirqs.SCHED
>    2688290 ±  0%      -1.6%    2646510 ±  1%     -13.8%    2317797 ±  1%  softirqs.TIMER
>       4948 ±  3%      +5.9%       5240 ±  1%     +55.6%       7700 ±  0%  vmstat.io.bo
>      39.00 ±  2%     -69.2%      12.00 ± 15%     -67.2%      12.80 ± 15%  vmstat.procs.b
>     171.50 ±  2%     +17.6%     201.75 ± 12%     +41.0%     241.80 ±  9%  vmstat.procs.r
>      13425 ±  1%     +24.4%      16706 ±  9%     +41.3%      18974 ±  9%  vmstat.system.cs
>      45100 ±  1%      +1.4%      45733 ±  0%      +6.8%      48162 ±  0%  vmstat.system.in
>     231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  time.elapsed_time
>     231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  time.elapsed_time.max
>     896604 ±  0%      -0.8%     889221 ±  3%     -20.2%     715260 ±  1%  time.involuntary_context_switches
>       2701 ±  0%      +0.5%       2713 ±  0%     +29.3%       3492 ±  0%  time.percent_of_cpu_this_job_got
>       6240 ±  0%      -1.5%       6145 ±  1%     -14.1%       5360 ±  1%  time.system_time
>      16.66 ±  0%      -4.2%      15.96 ±  7%     -26.0%      12.34 ±  5%  time.user_time
> 
> Thanks,
> Xiaolong
> 
> >Thanks,
> >
> >>
> >> 4ac912427c4214d8 ced2c7ea8e99b46755a270872c c0e39d642e41be12937f453272
> >> ---------------- -------------------------- --------------------------
> >>          %stddev     %change         %stddev     %change         %stddev
> >>              \          |                \          |                \
> >>      77863 ±  0%     +50.8%     117419 ±  1%     +50.9%     117500 ±  0%  aim7.jobs-per-min
> >>     231.63 ±  0%     -33.6%     153.78 ±  1%     -33.7%     153.67 ±  0%  aim7.time.elapsed_time
> >>     231.63 ±  0%     -33.6%     153.78 ±  1%     -33.7%     153.67 ±  0%  aim7.time.elapsed_time.max
> >>     896604 ±  0%     -10.1%     805644 ±  3%     -20.8%     710207 ±  1%  aim7.time.involuntary_context_switches
> >>       6240 ±  0%     -13.3%       5408 ±  1%     -14.2%       5353 ±  1%  aim7.time.system_time
> >>    1111357 ±  3%      -1.3%    1097209 ±  2%      -7.8%    1024716 ±  0%  aim7.time.voluntary_context_switches
> >>    5600256 ±  9%      -9.5%    5066069 ±  0%      -9.3%    5078220 ± 13%  meminfo.DirectMap2M
> >>      78738 ±  8%     +72.1%     135538 ±  8%     +23.2%      96995 ±  6%  meminfo.Dirty
> >>     315.50 ± 12%    +210.8%     980.67 ± 16%    +123.9%     706.40 ± 22%  meminfo.Writeback
> >>       1328 ± 18%    +357.1%       6069 ± 57%    +231.1%       4397 ± 75%  softirqs.NET_RX
> >>     669152 ±  3%      -7.4%     619333 ±  4%     -14.4%     572896 ±  1%  softirqs.RCU
> >>     170724 ±  0%     -25.0%     128030 ±  2%     -23.0%     131531 ±  4%  softirqs.SCHED
> >>    2688290 ±  0%     -13.3%    2331994 ±  1%     -14.7%    2292443 ±  1%  softirqs.TIMER
> >>       4948 ±  3%     +55.6%       7701 ±  1%     +55.8%       7710 ±  0%  vmstat.io.bo
> >>      39.00 ±  2%     +65.8%      64.67 ±  2%     -52.8%      18.40 ± 30%  vmstat.procs.b
> >>     171.50 ±  2%     +94.4%     333.33 ±  7%     +29.7%     222.40 ±  7%  vmstat.procs.r
> >>      13425 ±  1%     +30.6%      17530 ±  1%     +43.9%      19321 ± 10%  vmstat.system.cs
> >>      45100 ±  1%      +5.6%      47642 ±  1%      +6.3%      47957 ±  0%  vmstat.system.in
> >>      19068 ±  0%     +75.8%      33522 ±  4%     +24.5%      23745 ±  5%  proc-vmstat.nr_dirty
> >>      80.00 ±  3%    +195.0%     236.00 ± 14%    +126.0%     180.80 ± 23%  proc-vmstat.nr_writeback
> >>      19222 ±  0%     +76.4%      33907 ±  4%     +24.9%      24004 ±  5%  proc-vmstat.nr_zone_write_pending
> >>
> >> Thanks,
> >> Xiaolong
> >> >
> >> >Xiaolong, Thorsten
> >> >
> >> >Could you help to test these patches?
> >> >
> >> >Thanks,
> >> >
> >> >>
> >> >> Thanks,
> >> >>
> >> >>>
> >> >>> On 08.03.2017 02:21, kernel test robot wrote:
> >> >>>>
> >> >>>> Greeting,
> >> >>>>
> >> >>>> We noticed a -33.7 regression of aim7.jobs-per-min due to commit:
> >> >>>>
> >> >>>> commit: 4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
> >> >>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >> >>>>
> >> >>>> in testcase: aim7
> >> >>>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
> >> >>>> with following parameters:
> >> >>>>
> >> >>>> 	disk: 1BRD_48G
> >> >>>> 	fs: f2fs
> >> >>>> 	test: disk_wrt
> >> >>>> 	load: 3000
> >> >>>> 	cpufreq_governor: performance
> >> >>>>
> >> >>>> test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
> >> >>>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> Details are as below:
> >> >>>> -------------------------------------------------------------------------------------------------->
> >> >>>>
> >> >>>>
> >> >>>> To reproduce:
> >> >>>>
> >> >>>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> >> >>>>         cd lkp-tests
> >> >>>>         bin/lkp install job.yaml  # job file is attached in this email
> >> >>>>         bin/lkp run     job.yaml
> >> >>>>
> >> >>>> testcase/path_params/tbox_group/run: aim7/1BRD_48G-f2fs-disk_wrt-3000-performance/lkp-ivb-ep01
> >> >>>>
> >> >>>> ced2c7ea8e99b467  4ac912427c4214d8031d9ad6fb
> >> >>>> ----------------  --------------------------
> >> >>>>          %stddev      change         %stddev
> >> >>>>              \          |                \
> >> >>>>     117419 ±  1%     -33.7%      77863 ±  0%  aim7.jobs-per-min
> >> >>>>     153.78 ±  1%     +50.6%     231.63 ±  0%  aim7.time.elapsed_time
> >> >>>>     153.78 ±  1%     +50.6%     231.63 ±  0%  aim7.time.elapsed_time.max
> >> >>>>     805644 ±  3%     +11.3%     896604 ±  0%  aim7.time.involuntary_context_switches
> >> >>>>       5408 ±  1%     +15.4%       6240 ±  0%  aim7.time.system_time
> >> >>>>    5066069 ±  0%     +10.5%    5600256 ±  9%  meminfo.DirectMap2M
> >> >>>>     135538 ±  8%     -41.9%      78738 ±  8%  meminfo.Dirty
> >> >>>>     980.67 ± 16%     -67.8%     315.50 ± 12%  meminfo.Writeback
> >> >>>>      71322 ± 10%     -44.0%      39953 ±  1%  numa-meminfo.node0.Dirty
> >> >>>>      11158 ± 18%     -27.1%       8132 ±  0%  numa-meminfo.node0.Mapped
> >> >>>>      56776 ±  6%     -32.5%      38309 ±  0%  numa-meminfo.node1.Dirty
> >> >>>>       9684 ± 22%     +30.9%      12676 ±  0%  numa-meminfo.node1.Mapped
> >> >>>>       6069 ± 57%     -78.1%       1328 ± 18%  softirqs.NET_RX
> >> >>>>     619333 ±  4%      +8.0%     669152 ±  3%  softirqs.RCU
> >> >>>>     128030 ±  2%     +33.3%     170724 ±  0%  softirqs.SCHED
> >> >>>>    2331994 ±  1%     +15.3%    2688290 ±  0%  softirqs.TIMER
> >> >>>>       7701 ±  1%     -35.7%       4948 ±  3%  vmstat.io.bo
> >> >>>>      64.67 ±  2%     -39.7%      39.00 ±  2%  vmstat.procs.b
> >> >>>>     333.33 ±  7%     -48.5%     171.50 ±  2%  vmstat.procs.r
> >> >>>>      17530 ±  1%     -23.4%      13425 ±  1%  vmstat.system.cs
> >> >>>>      47642 ±  1%      -5.3%      45100 ±  1%  vmstat.system.in
> >> >>>>      33522 ±  4%     -43.1%      19068 ±  0%  proc-vmstat.nr_dirty
> >> >>>>     236.00 ± 14%     -66.1%      80.00 ±  3%  proc-vmstat.nr_writeback
> >> >>>>      33907 ±  4%     -43.3%      19222 ±  0%  proc-vmstat.nr_zone_write_pending
> >> >>>>      28194 ± 10%     +10.4%      31131 ±  6%  proc-vmstat.pgactivate
> >> >>>>     746402 ±  2%     +24.6%     929960 ±  3%  proc-vmstat.pgfault
> >> >>>>     153.78 ±  1%     +50.6%     231.63 ±  0%  time.elapsed_time
> >> >>>>     153.78 ±  1%     +50.6%     231.63 ±  0%  time.elapsed_time.max
> >> >>>>     805644 ±  3%     +11.3%     896604 ±  0%  time.involuntary_context_switches
> >> >>>>       3524 ±  0%     -23.4%       2701 ±  0%  time.percent_of_cpu_this_job_got
> >> >>>>       5408 ±  1%     +15.4%       6240 ±  0%  time.system_time
> >> >>>>      12.19 ±  1%     +36.7%      16.66 ±  0%  time.user_time
> >> >>>>   48260939 ±  3%     +12.1%   54110616 ±  2%  cpuidle.C1-IVT.time
> >> >>>>   33149237 ±  5%     +52.6%   50597349 ±  1%  cpuidle.C1E-IVT.time
> >> >>>>      89642 ±  4%     +52.8%     136976 ±  0%  cpuidle.C1E-IVT.usage
> >> >>>>   13534795 ±  6%    +276.3%   50934566 ± 55%  cpuidle.C3-IVT.time
> >> >>>>      42893 ±  6%    +138.8%     102439 ± 30%  cpuidle.C3-IVT.usage
> >> >>>>  6.431e+08 ±  2%    +390.1%  3.152e+09 ± 10%  cpuidle.C6-IVT.time
> >> >>>>     802009 ±  2%    +375.3%    3811880 ± 10%  cpuidle.C6-IVT.usage
> >> >>>>    1535987 ±  4%    +156.3%    3936830 ±  4%  cpuidle.POLL.time
> >> >>>>      88.14 ±  0%     -24.9%      66.17 ±  3%  turbostat.%Busy
> >> >>>>       2659 ±  0%     -44.7%       1471 ±  3%  turbostat.Avg_MHz
> >> >>>>       3016 ±  0%     -26.3%       2224 ±  0%  turbostat.Bzy_MHz
> >> >>>>       5.20 ±  5%    +127.0%      11.80 ±  2%  turbostat.CPU%c1
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>                                 perf-stat.page-faults
> >> >>>>
> >> >>>>    1e+06 ++-----------------------------------------------------------------+
> >> >>>>   900000 O+O O   O O  O       O            O O                              |
> >> >>>>          |     O        O O O   O O O O O          O                        |
> >> >>>>   800000 ++              .*.*.                .*. .*.         .*..          |
> >> >>>>   700000 ++*.*.*.*.*..*.*     *.*.*.*.*.*..*.*   *   *.*.*.*.*    *.*.*.*   *
> >> >>>>          | :                                                            :   |
> >> >>>>   600000 ++:                                                            :   |
> >> >>>>   500000 ++                                                              : :|
> >> >>>>   400000 ++                                                              : :|
> >> >>>>          |:                                                              : :|
> >> >>>>   300000 ++                                                              : :|
> >> >>>>   200000 ++                                                              : :|
> >> >>>>          |                                                                : |
> >> >>>>   100000 ++                                                               : |
> >> >>>>        0 *+------------------------------------O-O------------------------*-+
> >> >>>>
> >> >>>>
> >> >>>>                                perf-stat.minor-faults
> >> >>>>
> >> >>>>    1e+06 ++-----------------------------------------------------------------+
> >> >>>>   900000 O+O O   O O  O       O            O O                              |
> >> >>>>          |     O        O O O   O O O O O          O                        |
> >> >>>>   800000 ++              .*.*.                .*. .*.         .*..          |
> >> >>>>   700000 ++*.*.*.*.*..*.*     *.*.*.*.*.*..*.*   *   *.*.*.*.*    *.*.*.*   *
> >> >>>>          | :                                                            :   |
> >> >>>>   600000 ++:                                                            :   |
> >> >>>>   500000 ++                                                              : :|
> >> >>>>   400000 ++                                                              : :|
> >> >>>>          |:                                                              : :|
> >> >>>>   300000 ++                                                              : :|
> >> >>>>   200000 ++                                                              : :|
> >> >>>>          |                                                                : |
> >> >>>>   100000 ++                                                               : |
> >> >>>>        0 *+------------------------------------O-O------------------------*-+
> >> >>>>
> >> >>>>
> >> >>>>                                   aim7.jobs-per-min
> >> >>>>
> >> >>>>   140000 ++-----------------------------------------------------------------+
> >> >>>>          | *.     .*..     .*.*.*.*.*.*.*..*.*.*.*.*.*.                     |
> >> >>>>   120000 ++: *.*.*    *.*.*                            *.*.*.*.*..*.*.*.    *
> >> >>>>          | :                                                            *   |
> >> >>>>   100000 ++:                                                            :   |
> >> >>>>          |:                                                             :   |
> >> >>>>    80000 O+O O O O O  O O O O O O O O O O  O O     O                     : :|
> >> >>>>          |:                                                              : :|
> >> >>>>    60000 ++                                                              : :|
> >> >>>>          |:                                                              : :|
> >> >>>>    40000 ++                                                              : :|
> >> >>>>          |                                                               : :|
> >> >>>>    20000 ++                                                               : |
> >> >>>>          |                                                                : |
> >> >>>>        0 *+------------------------------------O-O------------------------*-+
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> 	[*] bisect-good sample
> >> >>>> 	[O] bisect-bad  sample
> >> >>>>
> >> >>>>
> >> >>>> Disclaimer:
> >> >>>> Results have been estimated based on internal Intel analysis and are provided
> >> >>>> for informational purposes only. Any difference in system hardware or software
> >> >>>> design or configuration may affect actual performance.
> >> >>>>
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Xiaolong
> >> >>>>
> >> >>
> >> >> .
> >> >>
> >> >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ