Message-ID: <20200811101827.GA7870@xiangao.remote.csb>
Date:   Tue, 11 Aug 2020 18:18:27 +0800
From:   Gao Xiang <hsiangkao@...hat.com>
To:     Daeho Jeong <daeho43@...il.com>
Cc:     Chao Yu <yuchao0@...wei.com>, Daeho Jeong <daehojeong@...gle.com>,
        kernel-team@...roid.com, linux-kernel@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH] f2fs: change virtual mapping way for
 compression pages

On Tue, Aug 11, 2020 at 06:33:26PM +0900, Daeho Jeong wrote:
> Plus, when we use vmap(), vmap() normally executes in a short time
> like vm_map_ram().
> But, sometimes, it has a very long delay.
> 
> On Tue, Aug 11, 2020 at 6:28 PM, Daeho Jeong <daeho43@...il.com> wrote:
> >
> > Actually, as you can see, I use the whole zero data blocks in the test file.
> > It can maximize the effect of changing virtual mapping.
> > When I use normal files which can be compressed about 70% from the
> > original file,
> > the vm_map_ram() version is about 2x faster than the vmap() version.

What f2fs does is very similar to btrfs compression. Even if these
blocks are all zeroed, the maximum compression ratio is bounded in
principle (a cluster's worth of blocks compresses into one block at
best, e.g. a 16k cluster into one compressed block).
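
As a rough illustration of that bound, here is a minimal Python sketch.
It assumes 4 KiB blocks and that a cluster shrinks to a single block at
best; the function name is hypothetical and not part of f2fs.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB filesystem blocks


def max_compression_ratio(cluster_bytes, block_bytes=BLOCK_SIZE):
    """Best case: the whole cluster is stored in a single block."""
    if cluster_bytes <= 0 or cluster_bytes % block_bytes:
        raise ValueError("cluster size must be a positive multiple of the block size")
    return cluster_bytes // block_bytes


# The two cluster sizes mentioned in this thread.
for cluster_kib in (16, 128):
    ratio = max_compression_ratio(cluster_kib * 1024)
    print(f"{cluster_kib}k cluster: at most {ratio}:1")
```

Under these assumptions an all-zero file cannot do better than 4:1
with 16k clusters, so the configured cluster size matters when reading
the benchmark numbers.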

So it'd be better to describe your configured cluster size (16k or
128k) and your hardware information in the commit message as well.

Actually, I also tried this patch on my x86 laptop just now with FIO
(I didn't use zeroed blocks though), and I didn't notice much
difference with turbo boost off and at maxfreq.

I'm not arguing against this commit; this is just a note about the
commit message.

> > > >> 1048576000 bytes (0.9 G) copied, 9.146217 s, 109 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 9.997542 s, 100 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 10.109727 s, 99 M/s

IMHO, the above numbers look much like decompression on arm64 little cores.
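
For reference, the quoted dd figures can be re-derived with a short
Python sketch. It assumes dd's "M/s" means MiB/s, which is consistent
with the byte counts and times quoted in this thread; the helper names
are hypothetical.

```python
MIB = 1024 * 1024
TOTAL_BYTES = 1048576000  # byte count from the quoted dd output

# Elapsed times in seconds, copied from the quoted commit message.
before_patch = [9.146217, 9.997542, 10.109727]
after_patch = [2.253441, 2.739764, 2.185649]


def mib_per_s(nbytes, seconds):
    """Throughput in MiB/s for one dd run."""
    return nbytes / MIB / seconds


def mean(values):
    return sum(values) / len(values)


# Ratio of mean elapsed times: how much faster the patched runs are.
speedup = mean(before_patch) / mean(after_patch)

for label, times in (("before", before_patch), ("after", after_patch)):
    rates = ", ".join(f"{mib_per_s(TOTAL_BYTES, t):.0f}" for t in times)
    print(f"{label} patch: {rates} MiB/s")
print(f"mean speedup: {speedup:.2f}x")
```

On these numbers the patched runs are roughly 4x faster on average,
which is the gap being questioned here.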

Thanks,
Gao Xiang


> >
> > On Tue, Aug 11, 2020 at 4:55 PM, Chao Yu <yuchao0@...wei.com> wrote:
> > >
> > > On 2020/8/11 15:15, Gao Xiang wrote:
> > > > On Tue, Aug 11, 2020 at 12:37:53PM +0900, Daeho Jeong wrote:
> > > >> From: Daeho Jeong <daehojeong@...gle.com>
> > > >>
> > > >> By profiling f2fs compression works, I've found vmap() callings are
> > > >> bottlenecks of f2fs decompression path. Changing these with
> > > >> vm_map_ram(), we can enhance f2fs decompression speed pretty much.
> > > >>
> > > >> [Verification]
> > > >> dd if=/dev/zero of=dummy bs=1m count=1000
> > > >> echo 3 > /proc/sys/vm/drop_caches
> > > >> dd if=dummy of=/dev/zero bs=512k
> > > >>
> > > >> - w/o compression -
> > > >> 1048576000 bytes (0.9 G) copied, 1.999384 s, 500 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 2.035988 s, 491 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 2.039457 s, 490 M/s
> > > >>
> > > >> - before patch -
> > > >> 1048576000 bytes (0.9 G) copied, 9.146217 s, 109 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 9.997542 s, 100 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 10.109727 s, 99 M/s
> > > >>
> > > >> - after patch -
> > > >> 1048576000 bytes (0.9 G) copied, 2.253441 s, 444 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 2.739764 s, 365 M/s
> > > >> 1048576000 bytes (0.9 G) copied, 2.185649 s, 458 M/s
> > > >
> > > > Indeed, the vmap() approach has some impact on the whole
> > > > workflow. But I don't think the gap is that significant;
> > > > maybe it relates to unlocked cpufreq (and the big.LITTLE
> > > > core difference if it's on some arm64 board).
> > >
> > > Agreed,
> > >
> > > I guess there should be other reason causing the large performance
> > > gap, scheduling, frequency, or something else.
> > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Linux-f2fs-devel mailing list
> > > > Linux-f2fs-devel@...ts.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > > > .
> > > >
> 
