Message-ID: <b29f6c18-f9c7-7e43-5b13-b5724fbf8d1a@linux.intel.com>
Date: Wed, 19 Aug 2020 09:52:56 +0800
From: Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To: Ritesh Harjani <riteshh@...ux.ibm.com>,
kernel test robot <rong.a.chen@...el.com>
Cc: Theodore Ts'o <tytso@....edu>, kbuild test robot <lkp@...el.com>,
Jan Kara <jack@...e.cz>,
"Darrick J. Wong" <darrick.wong@...cle.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [LKP] Re: [ext4] d3b6f23f71: stress-ng.fiemap.ops_per_sec -60.5% regression
On 7/22/2020 2:17 PM, Xing Zhengjun wrote:
>
>
> On 7/15/2020 7:04 PM, Ritesh Harjani wrote:
>> Hello Xing,
>>
>> On 4/7/20 1:30 PM, kernel test robot wrote:
>>> Greetings,
>>>
>>> FYI, we noticed a -60.5% regression of stress-ng.fiemap.ops_per_sec
>>> due to commit:
>>>
>>>
>>> commit: d3b6f23f71670007817a5d59f3fbafab2b794e8c ("ext4: move
>>> ext4_fiemap to use iomap framework")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>
>>> in testcase: stress-ng
>>> on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
>>> with 192G memory
>>> with the following parameters:
>>>
>>> nr_threads: 10%
>>> disk: 1HDD
>>> testtime: 1s
>>> class: os
>>> cpufreq_governor: performance
>>> ucode: 0x500002c
>>> fs: ext4
>>
>> I started looking into this issue, but with my unit testing I didn't
>> find any perf issue with the fiemap ioctl call. I haven't yet explored
>> how stress-ng takes its fiemap performance numbers; it could be doing
>> something differently. In my testing I just made sure to create a file
>> with a large number of extents and used the xfs_io -c "fiemap -v"
>> command to check how much time it takes to read all the entries in the
>> 1st and subsequent iterations.
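>>
>> (For reference, a minimal sketch of that timing loop; "bigfile" is the
>> test file created below, and the loop itself is not from the original
>> measurements.)
>>
>>     # first pass is cold-cache; later passes show the cached behaviour
>>     for i in 1 2 3; do
>>         echo "iteration $i:"
>>         time xfs_io -c "fiemap -v" bigfile > /dev/null
>>     done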
>>
>>
>> The setup was a qemu machine on x86_64 running the latest linux branch.
>>
>> 1. Created a 10G file using fallocate (this allocates unwritten
>> extents for the file).
>>
>> 2. Then I punched a hole in every alternate block of the file. This
>> step took a long time, and after a sufficiently long time I had to
>> cancel it.
>> for i in $(seq 1 2 xxxxx); do echo $i; fallocate -p -o $(($i*4096)) -l
>> 4096 bigfile; done
>>
>> 3. Then issued the fiemap call via xfs_io and took the time measurement.
>> time xfs_io -c "fiemap -v" bigfile > /dev/null
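>>
>> (Putting steps 1-3 together, a minimal repro sketch; "bigfile" and the
>> xxxxx loop bound are placeholders as above.)
>>
>>     fallocate -l 10G bigfile                 # step 1: 10G of unwritten extents
>>     for i in $(seq 1 2 xxxxx); do            # step 2: punch every other 4k block
>>         fallocate -p -o $(($i * 4096)) -l 4096 bigfile
>>     done
>>     time xfs_io -c "fiemap -v" bigfile > /dev/null   # step 3: time the extent walk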
>>
>>
>> Perf numbers on the latest default kernel build for the above command:
>>
>> 1st iteration
>> ==============
>> real 0m31.684s
>> user 0m1.593s
>> sys 0m24.174s
>>
>> 2nd and subsequent iteration
>> ============================
>> real 0m3.379s
>> user 0m1.300s
>> sys 0m2.080s
>>
>>
>> 4. Then I reverted all the iomap_fiemap patches and re-tested this,
>> so that the older ext4_fiemap implementation gets exercised:
>>
>>
>> 1st iteration
>> ==============
>> real 0m31.591s
>> user 0m1.400s
>> sys 0m24.243s
>>
>>
>> 2nd and subsequent iteration (had to cancel it since it was taking
>> more time than 15m)
>> ============================
>> ^C^C
>> real 15m49.884s
>> user 0m0.032s
>> sys 15m49.722s
>>
>> I guess the reason the 2nd iteration with the older implementation
>> takes so much time is that the previous implementation never cached
>> extent entries in the extent_status tree. Also, in the 1st iteration
>> the page cache may get filled with a lot of buffer_head entries, so
>> page reclaim may be taking more time.
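>>
>> (A hedged way to check that theory, not something from the runs above:
>> watch the buffer_head slab and page cache grow while the 1st pass runs.)
>>
>>     # rerun these while the first fiemap pass is in flight (needs root)
>>     grep buffer_head /proc/slabinfo
>>     grep -E '^(Buffers|Cached)' /proc/meminfo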
>>
>> With the latest implementation using iomap_fiemap(), the call to query
>> extent blocks is done using ext4_map_blocks(). ext4_map_blocks() by
>> default will also cache the extent entries in the extent_status tree.
>> Hence during the 2nd iteration we read the entries directly from the
>> extent_status tree and do not do any disk I/O.
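>>
>> (Another hedged check, not part of the runs above: drop only the page
>> cache between iterations. The repeat run should stay fast when the
>> extent_status cache is doing the work, but slow down again on the old
>> path that relies on cached buffer_heads.)
>>
>>     sync
>>     echo 1 > /proc/sys/vm/drop_caches   # page cache only; slab/shrinker caches survive
>>     time xfs_io -c "fiemap -v" bigfile > /dev/null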
>>
>> -ritesh
I re-tested it on v5.9-rc1 and the regression is still there. Have you
tried the stress-ng test cases?
>
> Could you try the stress-ng (https://kernel.ubuntu.com/~cking/stress-ng/)
> test cases? The tarballs can be fetched from
> https://kernel.ubuntu.com/~cking/tarballs/stress-ng/.
> For this case you can try the command "stress-ng --timeout 1 --times
> --verify --metrics-brief --sequential 9 --class os --minimize --exclude
> spawn,exec,swap".
> I re-tested it on v5.8-rc6 and the regression still existed.
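>
> (A hedged way to narrow this down, not from the original runs: the
> fiemap stressor can also be run on its own, so the ops_per_sec drop
> shows up in isolation; option names per the stress-ng manpage.)
>
>     stress-ng --fiemap 8 --timeout 10 --metrics-brief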
>
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/debug-setup/nr_threads/disk/testtime/fs/class/cpufreq_governor/ucode:
>   lkp-csl-2sp5/stress-ng/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/test/10%/1HDD/1s/ext4/os/performance/0x5002f01
>
> commit:
>   b2c5764262edded1b1cfff5a6ca82c3d61bb4a4a
>   d3b6f23f71670007817a5d59f3fbafab2b794e8c
>   v5.8-rc6
>
> b2c5764262edded1 d3b6f23f71670007817a5d59f3f                    v5.8-rc6
> ---------------- --------------------------- ---------------------------
>   %stddev  %change  %stddev  %change  %stddev
>      \        |        \        |        \
>   20419 ± 3%  -4.9%  19423 ± 4%  +27.1%  25959  stress-ng.af-alg.ops
>   19655 ± 3%  -5.7%  18537 ± 4%  +27.8%  25111  stress-ng.af-alg.ops_per_sec
>   64.67 ± 5%  -17.0%  53.67 ± 38%  +22.2%  79.00 ± 9%  stress-ng.chdir.ops
>   55.34 ± 3%  -13.3%  47.99 ± 38%  +26.4%  69.96 ± 10%  stress-ng.chdir.ops_per_sec
>   64652 ± 7%  -14.1%  55545 ± 11%  -13.6%  55842 ± 6%  stress-ng.chown.ops
>   64683 ± 7%  -14.1%  55565 ± 11%  -13.6%  55858 ± 6%  stress-ng.chown.ops_per_sec
>   2805 ± 2%  +0.6%  2820 ± 2%  +130.0%  6452  stress-ng.clone.ops
>   2802 ± 2%  +0.6%  2818 ± 2%  +129.9%  6443  stress-ng.clone.ops_per_sec
>   34.67  +1.9%  35.33 ± 3%  -9.6%  31.33 ± 3%  stress-ng.copy-file.ops
>   22297 ± 23%  +26.7%  28258 ± 2%  +38.1%  30783 ± 14%  stress-ng.dir.ops_per_sec
>   4743 ± 7%  -12.6%  4147 ± 4%  -19.9%  3801 ± 14%  stress-ng.dirdeep.ops_per_sec
>   24499 ± 8%  -9.3%  22223 ± 3%  -19.8%  19647 ± 2%  stress-ng.enosys.ops
>   24494 ± 8%  -9.3%  22207 ± 3%  -19.8%  19637 ± 2%  stress-ng.enosys.ops_per_sec
>   76618  -63.8%  27756 ± 5%  -55.5%  34092 ± 2%  stress-ng.fiemap.ops
>   76042  -63.8%  27492 ± 5%  -55.4%  33899 ± 2%  stress-ng.fiemap.ops_per_sec
>   3816791  -6.2%  3581880 ± 2%  -10.7%  3409584 ± 3%  stress-ng.file-ioctl.ops
>   3817421  -6.2%  3582248 ± 2%  -10.7%  3409544 ± 3%  stress-ng.file-ioctl.ops_per_sec
>   27332 ± 17%  +30.2%  35574  +37.2%  37506 ± 6%  stress-ng.filename.ops
>   26222 ± 16%  +28.6%  33722  +35.6%  35567 ± 6%  stress-ng.filename.ops_per_sec
>   23431 ± 2%  +0.5%  23559  +22.5%  28711 ± 4%  stress-ng.fork.ops
>   23433 ± 2%  +0.5%  23558  +22.5%  28709 ± 4%  stress-ng.fork.ops_per_sec
>   48074 ± 15%  -0.6%  47786 ± 9%  -42.5%  27623  stress-ng.get.ops
>   48088 ± 15%  -0.6%  47793 ± 9%  -42.6%  27624  stress-ng.get.ops_per_sec
>   111379  -1.4%  109815 ± 2%  -7.8%  102662 ± 4%  stress-ng.getdent.ops
>   111398  -1.4%  109818 ± 2%  -7.9%  102651 ± 4%  stress-ng.getdent.ops_per_sec
>   226.67 ± 2%  -1.0%  224.33  -15.1%  192.33 ± 6%  stress-ng.inotify.ops
>   226.39 ± 2%  -1.3%  223.53  -15.3%  191.82 ± 6%  stress-ng.inotify.ops_per_sec
>   18.00  +9.3%  19.67 ± 2%  +3.7%  18.67 ± 2%  stress-ng.iomix.ops
>   4846 ± 10%  +2.9%  4984 ± 7%  -36.8%  3062 ± 7%  stress-ng.kill.ops
>   4846 ± 10%  +2.8%  4982 ± 7%  -36.9%  3058 ± 7%  stress-ng.kill.ops_per_sec
>   851324 ± 3%  -9.9%  766988 ± 2%  +2.3%  870703  stress-ng.lease.ops
>   850238 ± 3%  -9.9%  766255 ± 2%  +2.3%  870047  stress-ng.lease.ops_per_sec
>   134492 ± 14%  -7.0%  125085 ± 16%  -19.9%  107736 ± 3%  stress-ng.mlock.ops_per_sec
>   1.38  +2.9%  1.42 ± 5%  -20.2%  1.10 ± 12%  stress-ng.mmapfork.ops_per_sec
>   1117721 ± 5%  -9.2%  1015321 ± 20%  -36.2%  713598  stress-ng.mq.ops
>   1117186 ± 5%  -9.2%  1014593 ± 20%  -36.2%  712387  stress-ng.mq.ops_per_sec
>   4584592 ± 2%  -2.7%  4460470 ± 7%  -12.0%  4036627 ± 8%  stress-ng.msg.ops_per_sec
>   8179 ± 10%  +7.3%  8772 ± 9%  +24.0%  10144  stress-ng.opcode.ops
>   8177 ± 10%  +7.2%  8770 ± 9%  +24.0%  10141  stress-ng.opcode.ops_per_sec
>   5218 ± 3%  +3.6%  5408 ± 2%  +27.3%  6642  stress-ng.reboot.ops
>   5217 ± 3%  +3.6%  5407 ± 2%  +27.3%  6640  stress-ng.reboot.ops_per_sec
>   1724 ± 21%  +27.9%  2205 ± 18%  +54.3%  2660 ± 19%  stress-ng.resources.ops
>   1217728  -31.4%  835808 ± 14%  -18.7%  989423 ± 10%  stress-ng.revio.ops
>   1217899  -31.4%  835846 ± 14%  -18.8%  989313 ± 10%  stress-ng.revio.ops_per_sec
>   752664 ± 9%  -7.6%  695116 ± 10%  -0.6%  748117 ± 6%  stress-ng.seal.ops
>   752957 ± 9%  -7.7%  695278 ± 10%  -0.6%  748068 ± 6%  stress-ng.seal.ops_per_sec
>   628.00 ± 83%  -20.1%  502.00 ± 8%  +176.2%  1734  stress-ng.seccomp.ops
>   628.22 ± 83%  -20.1%  502.08 ± 8%  +176.1%  1734  stress-ng.seccomp.ops_per_sec
>   1361741 ± 7%  +4.0%  1416628 ± 8%  +17.2%  1596549 ± 9%  stress-ng.seek.ops
>   1361799 ± 7%  +4.0%  1416578 ± 8%  +17.2%  1596376 ± 9%  stress-ng.seek.ops_per_sec
>   1537496  +1.2%  1556468  -9.9%  1385679  stress-ng.sigrt.ops
>   1532297  +1.3%  1551993  -9.9%  1381175  stress-ng.sigrt.ops_per_sec
>   1324589 ± 12%  -26.6%  972626 ± 9%  -14.8%  1128923 ± 4%  stress-ng.sigsuspend.ops
>   1323957 ± 12%  -26.6%  972084 ± 9%  -14.8%  1128174 ± 4%  stress-ng.sigsuspend.ops_per_sec
>   806.67 ± 13%  +11.2%  896.67 ± 10%  +31.3%  1059 ± 4%  stress-ng.sockabuse.ops
>   806.73 ± 13%  +11.1%  896.44 ± 10%  +31.3%  1059 ± 4%  stress-ng.sockabuse.ops_per_sec
>   6224635 ± 2%  +0.8%  6274521 ± 2%  -12.8%  5427883 ± 2%  stress-ng.sockpair.ops
>   5383397  -1.9%  5279526 ± 2%  -14.9%  4579084 ± 4%  stress-ng.sockpair.ops_per_sec
>   22409 ± 6%  -2.8%  21784 ± 5%  -41.2%  13172 ± 9%  stress-ng.sysbadaddr.ops
>   22414 ± 6%  -2.8%  21784 ± 5%  -41.3%  13167 ± 9%  stress-ng.sysbadaddr.ops_per_sec
>   525.33 ± 13%  +2.6%  539.00 ± 15%  -42.1%  304.33 ± 25%  stress-ng.sysfs.ops
>   524.98 ± 13%  +2.6%  538.58 ± 15%  -42.1%  303.79 ± 25%  stress-ng.sysfs.ops_per_sec
>   3077 ± 12%  +19.9%  3689 ± 4%  +23.0%  3786 ± 4%  stress-ng.sysinfo.ops
>   3009 ± 11%  +22.6%  3688 ± 4%  +25.7%  3783 ± 4%  stress-ng.sysinfo.ops_per_sec
>   3001932  +1.2%  3038017  +15.7%  3474274  stress-ng.udp.ops
>   3002618  +1.2%  3037939  +15.7%  3473606  stress-ng.udp.ops_per_sec
>   184634 ± 12%  -6.9%  171823  +24.5%  229788 ± 4%  stress-ng.utime.ops
>   184678 ± 12%  -7.0%  171836  +24.4%  229784 ± 4%  stress-ng.utime.ops_per_sec
>   1835 ± 5%  +2.2%  1875 ± 4%  +8.2%  1986 ± 3%  stress-ng.vforkmany.ops
>   1829 ± 5%  +1.9%  1863 ± 4%  +8.1%  1978 ± 3%  stress-ng.vforkmany.ops_per_sec
>   14829  -1.0%  14682 ± 2%  -20.1%  11848 ± 3%  stress-ng.vm-segv.ops
>   14830  -1.0%  14682 ± 2%  -29.7%  10433 ± 13%  stress-ng.vm-segv.ops_per_sec
>   3527835 ± 3%  +1.9%  3596613 ± 5%  +495.6%  21010377  stress-ng.vm-splice.ops
>   3528845 ± 3%  +1.9%  3597219 ± 5%  +495.4%  21011045  stress-ng.vm-splice.ops_per_sec
>
--
Zhengjun Xing