lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 19 Aug 2020 09:52:56 +0800
From:   Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To:     Ritesh Harjani <riteshh@...ux.ibm.com>,
        kernel test robot <rong.a.chen@...el.com>
Cc:     Theodore Ts'o <tytso@....edu>, kbuild test robot <lkp@...el.com>,
        Jan Kara <jack@...e.cz>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [LKP] Re: [ext4] d3b6f23f71: stress-ng.fiemap.ops_per_sec -60.5%
 regression



On 7/22/2020 2:17 PM, Xing Zhengjun wrote:
> 
> 
> On 7/15/2020 7:04 PM, Ritesh Harjani wrote:
>> Hello Xing,
>>
>> On 4/7/20 1:30 PM, kernel test robot wrote:
>>> Greeting,
>>>
>>> FYI, we noticed a -60.5% regression of stress-ng.fiemap.ops_per_sec 
>>> due to commit:
>>>
>>>
>>> commit: d3b6f23f71670007817a5d59f3fbafab2b794e8c ("ext4: move 
>>> ext4_fiemap to use iomap framework")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>
>>> in testcase: stress-ng
>>> on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz 
>>> with 192G memory
>>> with following parameters:
>>>
>>>     nr_threads: 10%
>>>     disk: 1HDD
>>>     testtime: 1s
>>>     class: os
>>>     cpufreq_governor: performance
>>>     ucode: 0x500002c
>>>     fs: ext4
>>
>> I started looking into this issue. But with my unit testing, I didn't
>> find any perf issue with fiemap ioctl call. I haven't yet explored about
>> how stress-ng take fiemap performance numbers, it could be doing
>> something differently. But in my testing I just made sure to create a
>> file with large number of extents and used xfs_io -c "fiemap -v" cmd
>> to check how much time it takes to read all the entries in 1st
>> and subsequent iterations.
>>
>>
>> Setup comprised of qemu machine on x86_64 with latest linux branch.
>>
>> 1. created a file of 10G using fallocate. (this allocated unwritten
>> extents for this file).
>>
>> 2. Then I punched hole on every alternate block of file. This step took
>> a long time. And after sufficiently long time, I had to cancel it.
>> for i in $(seq 1 2 xxxxx); do echo $i; fallocate -p -o $(($i*4096)) -l 
>> 4096; done
>>
>> 3. Then issued fiemap call via xfs_io and took the time measurement.
>> time xfs_io -c "fiemap -v" bigfile > /dev/null
>>
>>
>> Perf numbers on latest default kernel build for above cmd.
>>
>> 1st iteration
>> ==============
>> real    0m31.684s
>> user    0m1.593s
>> sys     0m24.174s
>>
>> 2nd and subsequent iteration
>> ============================
>> real    0m3.379s
>> user    0m1.300s
>> sys     0m2.080s
>>
>>
>> 4. Then I reverted all the iomap_fiemap patches and re-tested this.
>> With this the older ext4_fiemap implementation will be tested:-
>>
>>
>> 1st iteration
>> ==============
>> real    0m31.591s
>> user    0m1.400s
>> sys     0m24.243s
>>
>>
>> 2nd and subsequent iteration (had to cancel it since it was taking 
>> more time then 15m)
>> ============================
>> ^C^C
>> real    15m49.884s
>> user    0m0.032s
>> sys     15m49.722s
>>
>> I guess the reason why 2nd iteration with older implementation is taking
>> too much time is since with previous implementation we never cached
>> extent entries in extent_status tree. And also in 1st iteration the page
>> cache may get filled with lot of buffer_head entries. So maybe page
>> reclaims are taking more time.
>>
>> While with the latest implementation using iomap_fiemap(), the call to 
>> query extent blocks is done using ext4_map_blocks(). ext4_map_blocks()
>> by default will also cache the extent entries into extent_status tree.
>> Hence during 2nd iteration, we will directly read the entries from 
>> extent_status tree and will not do any disk I/O.
>>
>> -ritesh

I re-test it on the v5.9-rc1, the regression still existed. Have you 
tried stress-ng test cases?

> 
> Could you try stress-ng( https://kernel.ubuntu.com/~cking/stress-ng/) 
> test cases?  The tarballs can be get from 
> https://kernel.ubuntu.com/~cking/tarballs/stress-ng/.
> The command for this case you can try "stress-ng --timeout 1 --times 
> --verify --metrics-brief --sequential 9 --class os --minimize --exclude 
> spawn,exec,swap"
> I re-test it on the v5.8-rc6, the regression still existed.
> 
> ========================================================================================= 
> 
> tbox_group/testcase/rootfs/kconfig/compiler/debug-setup/nr_threads/disk/testtime/fs/class/cpufreq_governor/ucode: 
> 
> 
> lkp-csl-2sp5/stress-ng/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/test/10%/1HDD/1s/ext4/os/performance/0x5002f01 
> 
> 
> commit:
>    b2c5764262edded1b1cfff5a6ca82c3d61bb4a4a
>    d3b6f23f71670007817a5d59f3fbafab2b794e8c
>    v5.8-rc6
> 
> b2c5764262edded1 d3b6f23f71670007817a5d59f3f                    v5.8-rc6
> ---------------- --------------------------- ---------------------------
>           %stddev     %change         %stddev     %change         %stddev
>               \          |                \          |                \
>       20419 ±  3%      -4.9%      19423 ±  4%     +27.1%      25959   
> stress-ng.af-alg.ops
>       19655 ±  3%      -5.7%      18537 ±  4%     +27.8%      25111   
> stress-ng.af-alg.ops_per_sec
>       64.67 ±  5%     -17.0%      53.67 ± 38%     +22.2%      79.00 ± 
> 9%  stress-ng.chdir.ops
>       55.34 ±  3%     -13.3%      47.99 ± 38%     +26.4%      69.96 ± 
> 10%  stress-ng.chdir.ops_per_sec
>       64652 ±  7%     -14.1%      55545 ± 11%     -13.6%      55842 ± 
> 6%  stress-ng.chown.ops
>       64683 ±  7%     -14.1%      55565 ± 11%     -13.6%      55858 ± 
> 6%  stress-ng.chown.ops_per_sec
>        2805 ±  2%      +0.6%       2820 ±  2%    +130.0%       6452   
> stress-ng.clone.ops
>        2802 ±  2%      +0.6%       2818 ±  2%    +129.9%       6443   
> stress-ng.clone.ops_per_sec
>       34.67            +1.9%      35.33 ±  3%      -9.6%      31.33 ± 
> 3%  stress-ng.copy-file.ops
>       22297 ± 23%     +26.7%      28258 ±  2%     +38.1%      30783 ± 
> 14%  stress-ng.dir.ops_per_sec
>        4743 ±  7%     -12.6%       4147 ±  4%     -19.9%       3801 ± 
> 14%  stress-ng.dirdeep.ops_per_sec
>       24499 ±  8%      -9.3%      22223 ±  3%     -19.8%      19647 ± 
> 2%  stress-ng.enosys.ops
>       24494 ±  8%      -9.3%      22207 ±  3%     -19.8%      19637 ± 
> 2%  stress-ng.enosys.ops_per_sec
>       76618           -63.8%      27756 ±  5%     -55.5%      34092 ± 
> 2%  stress-ng.fiemap.ops
>       76042           -63.8%      27492 ±  5%     -55.4%      33899 ± 
> 2%  stress-ng.fiemap.ops_per_sec
>     3816791            -6.2%    3581880 ±  2%     -10.7%    3409584 ± 
> 3%  stress-ng.file-ioctl.ops
>     3817421            -6.2%    3582248 ±  2%     -10.7%    3409544 ± 
> 3%  stress-ng.file-ioctl.ops_per_sec
>       27332 ± 17%     +30.2%      35574           +37.2%      37506 ± 
> 6%  stress-ng.filename.ops
>       26222 ± 16%     +28.6%      33722           +35.6%      35567 ± 
> 6%  stress-ng.filename.ops_per_sec
>       23431 ±  2%      +0.5%      23559           +22.5%      28711 ± 
> 4%  stress-ng.fork.ops
>       23433 ±  2%      +0.5%      23558           +22.5%      28709 ± 
> 4%  stress-ng.fork.ops_per_sec
>       48074 ± 15%      -0.6%      47786 ±  9%     -42.5%      27623   
> stress-ng.get.ops
>       48088 ± 15%      -0.6%      47793 ±  9%     -42.6%      27624   
> stress-ng.get.ops_per_sec
>      111379            -1.4%     109815 ±  2%      -7.8%     102662 ± 
> 4%  stress-ng.getdent.ops
>      111398            -1.4%     109818 ±  2%      -7.9%     102651 ± 
> 4%  stress-ng.getdent.ops_per_sec
>      226.67 ±  2%      -1.0%     224.33           -15.1%     192.33 ± 
> 6%  stress-ng.inotify.ops
>      226.39 ±  2%      -1.3%     223.53           -15.3%     191.82 ± 
> 6%  stress-ng.inotify.ops_per_sec
>       18.00            +9.3%      19.67 ±  2%      +3.7%      18.67 ± 
> 2%  stress-ng.iomix.ops
>        4846 ± 10%      +2.9%       4984 ±  7%     -36.8%       3062 ± 
> 7%  stress-ng.kill.ops
>        4846 ± 10%      +2.8%       4982 ±  7%     -36.9%       3058 ± 
> 7%  stress-ng.kill.ops_per_sec
>      851324 ±  3%      -9.9%     766988 ±  2%      +2.3%     870703   
> stress-ng.lease.ops
>      850238 ±  3%      -9.9%     766255 ±  2%      +2.3%     870047   
> stress-ng.lease.ops_per_sec
>      134492 ± 14%      -7.0%     125085 ± 16%     -19.9%     107736 ± 
> 3%  stress-ng.mlock.ops_per_sec
>        1.38            +2.9%       1.42 ±  5%     -20.2%       1.10 ± 
> 12%  stress-ng.mmapfork.ops_per_sec
>     1117721 ±  5%      -9.2%    1015321 ± 20%     -36.2%     713598   
> stress-ng.mq.ops
>     1117186 ±  5%      -9.2%    1014593 ± 20%     -36.2%     712387   
> stress-ng.mq.ops_per_sec
>     4584592 ±  2%      -2.7%    4460470 ±  7%     -12.0%    4036627 ± 
> 8%  stress-ng.msg.ops_per_sec
>        8179 ± 10%      +7.3%       8772 ±  9%     +24.0%      10144   
> stress-ng.opcode.ops
>        8177 ± 10%      +7.2%       8770 ±  9%     +24.0%      10141   
> stress-ng.opcode.ops_per_sec
>        5218 ±  3%      +3.6%       5408 ±  2%     +27.3%       6642   
> stress-ng.reboot.ops
>        5217 ±  3%      +3.6%       5407 ±  2%     +27.3%       6640   
> stress-ng.reboot.ops_per_sec
>        1724 ± 21%     +27.9%       2205 ± 18%     +54.3%       2660 ± 
> 19%  stress-ng.resources.ops
>     1217728           -31.4%     835808 ± 14%     -18.7%     989423 ± 
> 10%  stress-ng.revio.ops
>     1217899           -31.4%     835846 ± 14%     -18.8%     989313 ± 
> 10%  stress-ng.revio.ops_per_sec
>      752664 ±  9%      -7.6%     695116 ± 10%      -0.6%     748117 ± 
> 6%  stress-ng.seal.ops
>      752957 ±  9%      -7.7%     695278 ± 10%      -0.6%     748068 ± 
> 6%  stress-ng.seal.ops_per_sec
>      628.00 ± 83%     -20.1%     502.00 ±  8%    +176.2%       1734   
> stress-ng.seccomp.ops
>      628.22 ± 83%     -20.1%     502.08 ±  8%    +176.1%       1734   
> stress-ng.seccomp.ops_per_sec
>     1361741 ±  7%      +4.0%    1416628 ±  8%     +17.2%    1596549 ± 
> 9%  stress-ng.seek.ops
>     1361799 ±  7%      +4.0%    1416578 ±  8%     +17.2%    1596376 ± 
> 9%  stress-ng.seek.ops_per_sec
>     1537496            +1.2%    1556468            -9.9%    1385679   
> stress-ng.sigrt.ops
>     1532297            +1.3%    1551993            -9.9%    1381175   
> stress-ng.sigrt.ops_per_sec
>     1324589 ± 12%     -26.6%     972626 ±  9%     -14.8%    1128923 ± 
> 4%  stress-ng.sigsuspend.ops
>     1323957 ± 12%     -26.6%     972084 ±  9%     -14.8%    1128174 ± 
> 4%  stress-ng.sigsuspend.ops_per_sec
>      806.67 ± 13%     +11.2%     896.67 ± 10%     +31.3%       1059 ± 
> 4%  stress-ng.sockabuse.ops
>      806.73 ± 13%     +11.1%     896.44 ± 10%     +31.3%       1059 ± 
> 4%  stress-ng.sockabuse.ops_per_sec
>     6224635 ±  2%      +0.8%    6274521 ±  2%     -12.8%    5427883 ± 
> 2%  stress-ng.sockpair.ops
>     5383397            -1.9%    5279526 ±  2%     -14.9%    4579084 ± 
> 4%  stress-ng.sockpair.ops_per_sec
>       22409 ±  6%      -2.8%      21784 ±  5%     -41.2%      13172 ± 
> 9%  stress-ng.sysbadaddr.ops
>       22414 ±  6%      -2.8%      21784 ±  5%     -41.3%      13167 ± 
> 9%  stress-ng.sysbadaddr.ops_per_sec
>      525.33 ± 13%      +2.6%     539.00 ± 15%     -42.1%     304.33 ± 
> 25%  stress-ng.sysfs.ops
>      524.98 ± 13%      +2.6%     538.58 ± 15%     -42.1%     303.79 ± 
> 25%  stress-ng.sysfs.ops_per_sec
>        3077 ± 12%     +19.9%       3689 ±  4%     +23.0%       3786 ± 
> 4%  stress-ng.sysinfo.ops
>        3009 ± 11%     +22.6%       3688 ±  4%     +25.7%       3783 ± 
> 4%  stress-ng.sysinfo.ops_per_sec
>     3001932            +1.2%    3038017           +15.7%    3474274   
> stress-ng.udp.ops
>     3002618            +1.2%    3037939           +15.7%    3473606   
> stress-ng.udp.ops_per_sec
>      184634 ± 12%      -6.9%     171823           +24.5%     229788 ± 
> 4%  stress-ng.utime.ops
>      184678 ± 12%      -7.0%     171836           +24.4%     229784 ± 
> 4%  stress-ng.utime.ops_per_sec
>        1835 ±  5%      +2.2%       1875 ±  4%      +8.2%       1986 ± 
> 3%  stress-ng.vforkmany.ops
>        1829 ±  5%      +1.9%       1863 ±  4%      +8.1%       1978 ± 
> 3%  stress-ng.vforkmany.ops_per_sec
>       14829            -1.0%      14682 ±  2%     -20.1%      11848 ± 
> 3%  stress-ng.vm-segv.ops
>       14830            -1.0%      14682 ±  2%     -29.7%      10433 ± 
> 13%  stress-ng.vm-segv.ops_per_sec
>     3527835 ±  3%      +1.9%    3596613 ±  5%    +495.6%   21010377   
> stress-ng.vm-splice.ops
>     3528845 ±  3%      +1.9%    3597219 ±  5%    +495.4%   21011045   
> stress-ng.vm-splice.ops_per_sec
> 
> 
> 
> 
> 

-- 
Zhengjun Xing

Powered by blists - more mailing lists