[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9a81763c-dc02-fbe3-4da7-bfb006e6dc16@huawei.com>
Date: Tue, 14 Mar 2017 19:19:04 +0800
From: Chao Yu <yuchao0@...wei.com>
To: Ye Xiaolong <xiaolong.ye@...el.com>
CC: Jaegeuk Kim <jaegeuk@...nel.org>,
Thorsten Leemhuis <regressions@...mhuis.info>, <lkp@...org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [lkp-robot] [f2fs] 4ac912427c: -33.7% aim7.jobs-per-min
regression
On 2017/3/14 16:53, Ye Xiaolong wrote:
> On 03/14, Chao Yu wrote:
>> On 2017/3/14 3:22, Jaegeuk Kim wrote:
>>> On 03/13, Thorsten Leemhuis wrote:
>>>> @Chao Yu/@...geuk Kim: I'm considering to add this to the regressions
>>>> report for 4.11; or is there a reason why it shouldn't be considered a
>>>> regression? Ciao, Thorsten
>>>
>>> Hi,
>>>
>>> I'm planning to submit f2fs updates for 4.11-rcX including a patch which
>>> resolves this issue as well, as I expect.
>>>
>>> https://lkml.org/lkml/2017/3/7/813
>>
>> Sorry for late repay.
>>
>> I expect below patches in Jaegeuk's tree could help to recover the performance
>> as well
>>
>> f2fs: skip scanning free nid bitmap of full NAT blocks
>> f2fs: combine nat_bits and free_nid_bitmap cache
>
> These 2 patches do help recover the performance back. Details as below.
>
> Tested-by: Xiaolong Ye <xiaolong.ye@...el.com>
Thank you very much for the test. :)
Thanks,
>
> commit:
> 4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
> ced2c7ea8e99b46755a270872cd5ba61c27cffad <= parent commit of 4ac912427c
> c0e39d642e41be12937f4532721fc2538182e264 ("f2fs: combine nat_bits and free_nid_bitmap cache")
>
> 4ac912427c4214d8 ced2c7ea8e99b46755a270872c c0e39d642e41be12937f453272
> ---------------- -------------------------- --------------------------
> %stddev %change %stddev %change %stddev
> \ | \ | \
> 77863 ± 0% +50.8% 117419 ± 1% +50.9% 117500 ± 0% aim7.jobs-per-min
> 231.63 ± 0% -33.6% 153.78 ± 1% -33.7% 153.67 ± 0% aim7.time.elapsed_time
> 231.63 ± 0% -33.6% 153.78 ± 1% -33.7% 153.67 ± 0% aim7.time.elapsed_time.max
> 896604 ± 0% -10.1% 805644 ± 3% -20.8% 710207 ± 1% aim7.time.involuntary_context_switches
> 6240 ± 0% -13.3% 5408 ± 1% -14.2% 5353 ± 1% aim7.time.system_time
> 1111357 ± 3% -1.3% 1097209 ± 2% -7.8% 1024716 ± 0% aim7.time.voluntary_context_switches
> 5600256 ± 9% -9.5% 5066069 ± 0% -9.3% 5078220 ± 13% meminfo.DirectMap2M
> 78738 ± 8% +72.1% 135538 ± 8% +23.2% 96995 ± 6% meminfo.Dirty
> 315.50 ± 12% +210.8% 980.67 ± 16% +123.9% 706.40 ± 22% meminfo.Writeback
> 1328 ± 18% +357.1% 6069 ± 57% +231.1% 4397 ± 75% softirqs.NET_RX
> 669152 ± 3% -7.4% 619333 ± 4% -14.4% 572896 ± 1% softirqs.RCU
> 170724 ± 0% -25.0% 128030 ± 2% -23.0% 131531 ± 4% softirqs.SCHED
> 2688290 ± 0% -13.3% 2331994 ± 1% -14.7% 2292443 ± 1% softirqs.TIMER
> 4948 ± 3% +55.6% 7701 ± 1% +55.8% 7710 ± 0% vmstat.io.bo
> 39.00 ± 2% +65.8% 64.67 ± 2% -52.8% 18.40 ± 30% vmstat.procs.b
> 171.50 ± 2% +94.4% 333.33 ± 7% +29.7% 222.40 ± 7% vmstat.procs.r
> 13425 ± 1% +30.6% 17530 ± 1% +43.9% 19321 ± 10% vmstat.system.cs
> 45100 ± 1% +5.6% 47642 ± 1% +6.3% 47957 ± 0% vmstat.system.in
> 19068 ± 0% +75.8% 33522 ± 4% +24.5% 23745 ± 5% proc-vmstat.nr_dirty
> 80.00 ± 3% +195.0% 236.00 ± 14% +126.0% 180.80 ± 23% proc-vmstat.nr_writeback
> 19222 ± 0% +76.4% 33907 ± 4% +24.9% 24004 ± 5% proc-vmstat.nr_zone_write_pending
>
> Thanks,
> Xiaolong
>>
>> Xiaolong, Thorsten
>>
>> Could you help to test these patches?
>>
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> On 08.03.2017 02:21, kernel test robot wrote:
>>>>>
>>>>> Greeting,
>>>>>
>>>>> We noticed a -33.7 regression of aim7.jobs-per-min due to commit:
>>>>>
>>>>> commit: 4ac912427c4214d8031d9ad6fbc3bc75e71512df ("f2fs: introduce free nid bitmap")
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>>>
>>>>> in testcase: aim7
>>>>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
>>>>> with following parameters:
>>>>>
>>>>> disk: 1BRD_48G
>>>>> fs: f2fs
>>>>> test: disk_wrt
>>>>> load: 3000
>>>>> cpufreq_governor: performance
>>>>>
>>>>> test-description: AIM7 is a traditional UNIX system level benchmark suite which is used to test and measure the performance of multiuser system.
>>>>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
>>>>>
>>>>>
>>>>>
>>>>> Details are as below:
>>>>> -------------------------------------------------------------------------------------------------->
>>>>>
>>>>>
>>>>> To reproduce:
>>>>>
>>>>> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>>>> cd lkp-tests
>>>>> bin/lkp install job.yaml # job file is attached in this email
>>>>> bin/lkp run job.yaml
>>>>>
>>>>> testcase/path_params/tbox_group/run: aim7/1BRD_48G-f2fs-disk_wrt-3000-performance/lkp-ivb-ep01
>>>>>
>>>>> ced2c7ea8e99b467 4ac912427c4214d8031d9ad6fb
>>>>> ---------------- --------------------------
>>>>> %stddev change %stddev
>>>>> \ | \
>>>>> 117419 ± 1% -33.7% 77863 ± 0% aim7.jobs-per-min
>>>>> 153.78 ± 1% +50.6% 231.63 ± 0% aim7.time.elapsed_time
>>>>> 153.78 ± 1% +50.6% 231.63 ± 0% aim7.time.elapsed_time.max
>>>>> 805644 ± 3% +11.3% 896604 ± 0% aim7.time.involuntary_context_switches
>>>>> 5408 ± 1% +15.4% 6240 ± 0% aim7.time.system_time
>>>>> 5066069 ± 0% +10.5% 5600256 ± 9% meminfo.DirectMap2M
>>>>> 135538 ± 8% -41.9% 78738 ± 8% meminfo.Dirty
>>>>> 980.67 ± 16% -67.8% 315.50 ± 12% meminfo.Writeback
>>>>> 71322 ± 10% -44.0% 39953 ± 1% numa-meminfo.node0.Dirty
>>>>> 11158 ± 18% -27.1% 8132 ± 0% numa-meminfo.node0.Mapped
>>>>> 56776 ± 6% -32.5% 38309 ± 0% numa-meminfo.node1.Dirty
>>>>> 9684 ± 22% +30.9% 12676 ± 0% numa-meminfo.node1.Mapped
>>>>> 6069 ± 57% -78.1% 1328 ± 18% softirqs.NET_RX
>>>>> 619333 ± 4% +8.0% 669152 ± 3% softirqs.RCU
>>>>> 128030 ± 2% +33.3% 170724 ± 0% softirqs.SCHED
>>>>> 2331994 ± 1% +15.3% 2688290 ± 0% softirqs.TIMER
>>>>> 7701 ± 1% -35.7% 4948 ± 3% vmstat.io.bo
>>>>> 64.67 ± 2% -39.7% 39.00 ± 2% vmstat.procs.b
>>>>> 333.33 ± 7% -48.5% 171.50 ± 2% vmstat.procs.r
>>>>> 17530 ± 1% -23.4% 13425 ± 1% vmstat.system.cs
>>>>> 47642 ± 1% -5.3% 45100 ± 1% vmstat.system.in
>>>>> 33522 ± 4% -43.1% 19068 ± 0% proc-vmstat.nr_dirty
>>>>> 236.00 ± 14% -66.1% 80.00 ± 3% proc-vmstat.nr_writeback
>>>>> 33907 ± 4% -43.3% 19222 ± 0% proc-vmstat.nr_zone_write_pending
>>>>> 28194 ± 10% +10.4% 31131 ± 6% proc-vmstat.pgactivate
>>>>> 746402 ± 2% +24.6% 929960 ± 3% proc-vmstat.pgfault
>>>>> 153.78 ± 1% +50.6% 231.63 ± 0% time.elapsed_time
>>>>> 153.78 ± 1% +50.6% 231.63 ± 0% time.elapsed_time.max
>>>>> 805644 ± 3% +11.3% 896604 ± 0% time.involuntary_context_switches
>>>>> 3524 ± 0% -23.4% 2701 ± 0% time.percent_of_cpu_this_job_got
>>>>> 5408 ± 1% +15.4% 6240 ± 0% time.system_time
>>>>> 12.19 ± 1% +36.7% 16.66 ± 0% time.user_time
>>>>> 48260939 ± 3% +12.1% 54110616 ± 2% cpuidle.C1-IVT.time
>>>>> 33149237 ± 5% +52.6% 50597349 ± 1% cpuidle.C1E-IVT.time
>>>>> 89642 ± 4% +52.8% 136976 ± 0% cpuidle.C1E-IVT.usage
>>>>> 13534795 ± 6% +276.3% 50934566 ± 55% cpuidle.C3-IVT.time
>>>>> 42893 ± 6% +138.8% 102439 ± 30% cpuidle.C3-IVT.usage
>>>>> 6.431e+08 ± 2% +390.1% 3.152e+09 ± 10% cpuidle.C6-IVT.time
>>>>> 802009 ± 2% +375.3% 3811880 ± 10% cpuidle.C6-IVT.usage
>>>>> 1535987 ± 4% +156.3% 3936830 ± 4% cpuidle.POLL.time
>>>>> 88.14 ± 0% -24.9% 66.17 ± 3% turbostat.%Busy
>>>>> 2659 ± 0% -44.7% 1471 ± 3% turbostat.Avg_MHz
>>>>> 3016 ± 0% -26.3% 2224 ± 0% turbostat.Bzy_MHz
>>>>> 5.20 ± 5% +127.0% 11.80 ± 2% turbostat.CPU%c1
>>>>>
>>>>>
>>>>>
>>>>> perf-stat.page-faults
>>>>>
>>>>> 1e+06 ++-----------------------------------------------------------------+
>>>>> 900000 O+O O O O O O O O |
>>>>> | O O O O O O O O O O |
>>>>> 800000 ++ .*.*. .*. .*. .*.. |
>>>>> 700000 ++*.*.*.*.*..*.* *.*.*.*.*.*..*.* * *.*.*.*.* *.*.*.* *
>>>>> | : : |
>>>>> 600000 ++: : |
>>>>> 500000 ++ : :|
>>>>> 400000 ++ : :|
>>>>> |: : :|
>>>>> 300000 ++ : :|
>>>>> 200000 ++ : :|
>>>>> | : |
>>>>> 100000 ++ : |
>>>>> 0 *+------------------------------------O-O------------------------*-+
>>>>>
>>>>>
>>>>> perf-stat.minor-faults
>>>>>
>>>>> 1e+06 ++-----------------------------------------------------------------+
>>>>> 900000 O+O O O O O O O O |
>>>>> | O O O O O O O O O O |
>>>>> 800000 ++ .*.*. .*. .*. .*.. |
>>>>> 700000 ++*.*.*.*.*..*.* *.*.*.*.*.*..*.* * *.*.*.*.* *.*.*.* *
>>>>> | : : |
>>>>> 600000 ++: : |
>>>>> 500000 ++ : :|
>>>>> 400000 ++ : :|
>>>>> |: : :|
>>>>> 300000 ++ : :|
>>>>> 200000 ++ : :|
>>>>> | : |
>>>>> 100000 ++ : |
>>>>> 0 *+------------------------------------O-O------------------------*-+
>>>>>
>>>>>
>>>>> aim7.jobs-per-min
>>>>>
>>>>> 140000 ++-----------------------------------------------------------------+
>>>>> | *. .*.. .*.*.*.*.*.*.*..*.*.*.*.*.*. |
>>>>> 120000 ++: *.*.* *.*.* *.*.*.*.*..*.*.*. *
>>>>> | : * |
>>>>> 100000 ++: : |
>>>>> |: : |
>>>>> 80000 O+O O O O O O O O O O O O O O O O O O : :|
>>>>> |: : :|
>>>>> 60000 ++ : :|
>>>>> |: : :|
>>>>> 40000 ++ : :|
>>>>> | : :|
>>>>> 20000 ++ : |
>>>>> | : |
>>>>> 0 *+------------------------------------O-O------------------------*-+
>>>>>
>>>>>
>>>>>
>>>>> [*] bisect-good sample
>>>>> [O] bisect-bad sample
>>>>>
>>>>>
>>>>> Disclaimer:
>>>>> Results have been estimated based on internal Intel analysis and are provided
>>>>> for informational purposes only. Any difference in system hardware or software
>>>>> design or configuration may affect actual performance.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Xiaolong
>>>>>
>>>
>>> .
>>>
>>
>
> .
>
Powered by blists - more mailing lists