Message-ID: <20210521092730.GE18952@quack2.suse.cz>
Date:   Fri, 21 May 2021 11:27:30 +0200
From:   Jan Kara <jack@...e.cz>
To:     Xing Zhengjun <zhengjun.xing@...ux.intel.com>
Cc:     Jan Kara <jack@...e.cz>, kernel test robot <oliver.sang@...el.com>,
        Theodore Ts'o <tytso@....edu>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com
Subject: Re: [LKP] [ext4] 05c2c00f37: aim7.jobs-per-min -11.8% regression

On Fri 21-05-21 09:16:42, Xing Zhengjun wrote:
> Hi Jan,
> 
> On 5/20/2021 5:51 PM, Jan Kara wrote:
> > Hello!
> > 
> > On Thu 20-05-21 15:13:20, Xing Zhengjun wrote:
> > > 
> > >       Do you have time to look at this? The regression still exists in the
> > > latest Linux mainline, v5.13-rc2.
> > 
> > Thanks for verification and for the ping! I had a look into this and I
> > think the regression is caused by the changes in orphan handling. The load
> > runs multiple tasks all creating and deleting files. This generally
> > contends on the orphan locking with fast storage (which is your case
> > because you use ramdisk as a backing store AFAICT). And the changes I did
> > move superblock checksum computation under the orphan lock so the lock hold
> > times are now higher.
> > 
> > Sadly it is not easy to move the checksum update out from under the orphan
> > lock and maintain checksum consistency, since the checksum has to be
> > recomputed consistently with the changes of superblock state. But I have one
> > idea how we could maybe improve the situation. Can you check whether the
> > attached patch recovers the regression? Because that's about how good it
> > could get when we are more careful when writing out the superblock.
> > 
> > 								Honza
> > 
> 
> I applied the patch on top of v5.13-rc2 and tested it. It does not recover
> the regression; the regression became more serious (-45.7%).

OK, thanks for testing. So the orphan code is indeed the likely cause of
this regression, but I probably did not guess correctly what the
contention point is. I guess I need to reproduce this and dig further into
why the contention happens...

								Honza
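
As a rough illustration of the constraint discussed in this thread (the
checksum has to change together with the state it covers, so the update sits
inside the critical section and lengthens it), here is a toy Python model. It
is only a sketch and not ext4 code: ToySuperblock, its fields, and the use of
zlib.crc32 as the checksum are all assumptions made for the example.

```python
import threading
import zlib


class ToySuperblock:
    """Toy stand-in for a superblock: some state plus a checksum over it."""

    def __init__(self):
        self.lock = threading.Lock()   # plays the role of the orphan lock
        self.orphans = []              # hypothetical orphan list
        self.checksum = self._compute()

    def _compute(self):
        # CRC32 over a byte encoding of the current state (illustrative only).
        data = ",".join(str(i) for i in self.orphans).encode()
        return zlib.crc32(data)

    def add_orphan(self, inode):
        with self.lock:
            self.orphans.append(inode)
            # Recomputing here keeps state and checksum in step, but it also
            # makes every critical section longer.
            self.checksum = self._compute()

    def del_orphan(self, inode):
        with self.lock:
            self.orphans.remove(inode)
            self.checksum = self._compute()

    def is_consistent(self):
        with self.lock:
            return self.checksum == self._compute()


def worker(sb, base, count):
    # Each task "creates and deletes files": add an entry, then remove it.
    for i in range(count):
        sb.add_orphan(base + i)
        sb.del_orphan(base + i)


def run(tasks=8, count=200):
    sb = ToySuperblock()
    threads = [
        threading.Thread(target=worker, args=(sb, t * count, count))
        for t in range(tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sb
```

Calling run() and then is_consistent() on the result checks that the
checksum still matches the state after all tasks finish. The model says
nothing about where the real contention point is.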

> 
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/disk/md/fs/test/load/cpufreq_governor/ucode:
> 
> lkp-csl-2sp9/aim7/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/4BRD_12G/RAID1/ext4/creat-clo/1000/performance/0x5003006
> 
> commit:
>   4392fbc4bab57db3760f0fb61258cb7089b37665
>   05c2c00f3769abb9e323fcaca70d2de0b48af7ba
>   v5.13-rc2
>   2a1eb1a2fc08daaaf76a5aa8ffa355b5a5013d86    (the test patch)
> 
> 4392fbc4bab57db3 05c2c00f3769abb9e323fcaca70                   v5.13-rc2 2a1eb1a2fc08daaaf76a5aa8ffa
> ---------------- --------------------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \          |                \
>      13342           -11.8%      11771 ±  2%     -14.2%      11450           -45.7%       7240 ±  3%  aim7.jobs-per-min
> 
> 
> 
> -- 
> Zhengjun Xing
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
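
The %change columns in the quoted table can be reproduced from the
aim7.jobs-per-min values, taking the parent commit 4392fbc4bab57db3 (13342)
as the baseline. The short Python sketch below does that arithmetic; the
function name and the dictionary labels are made up for the example, and only
the numbers come from the report.

```python
def percent_change(base, value):
    """Relative change of value against base, in percent."""
    return (value - base) / base * 100.0


BASELINE = 13342  # aim7.jobs-per-min at 4392fbc4bab57db3

results = {
    "05c2c00f3769": 11771,
    "v5.13-rc2": 11450,
    "2a1eb1a2fc08 (test patch)": 7240,
}

for name, jobs_per_min in results.items():
    print(f"{name}: {percent_change(BASELINE, jobs_per_min):+.1f}%")
```

Rounded to one decimal place, this gives -11.8%, -14.2% and -45.7%,
matching the table.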
