linux-kernel - Re: [RFC] Analyzing zpool allocators / Removing zbud and z3fold

Message-ID: <Zdbh9Ap8kZDszEWv@google.com>
Date: Thu, 22 Feb 2024 05:56:04 +0000
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Chengming Zhou <chengming.zhou@...ux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Vitaly Wool <vitaly.wool@...sulko.com>, 
	Miaohe Lin <linmiaohe@...wei.com>, Johannes Weiner <hannes@...xchg.org>, Nhat Pham <nphamcs@...il.com>, 
	Linux-MM <linux-mm@...ck.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Christoph Hellwig <hch@...radead.org>, 
	Sergey Senozhatsky <senozhatsky@...omium.org>, Minchan Kim <minchan@...nel.org>, 
	Chris Down <chris@...isdown.name>, Seth Jennings <sjenning@...hat.com>, 
	Dan Streetman <ddstreet@...e.org>, Chris Li <chrisl@...nel.org>
Subject: Re: [RFC] Analyzing zpool allocators / Removing zbud and z3fold

On Thu, Feb 22, 2024 at 11:54:44AM +0800, Chengming Zhou wrote:
> On 2024/2/9 11:27, Yosry Ahmed wrote:
> > Hey folks,
> > 
> > This is a follow up on my previously sent RFC patch to deprecate
> > z3fold [1]. This is an RFC without code, I thought I could get some
> > discussion going before writing (or rather deleting) more code. I went
> > back to do some analysis on the 3 zpool allocators: zbud, zsmalloc,
> > and z3fold.
> 
> This is a great analysis! Sorry for being late to see it.
> 
> I want to vote for this direction, zram has been using zsmalloc directly,
> zswap can also do this, which is simpler and we can just maintain and optimize
> only one allocator. The only evident downside is dependence on MMU, right?

AFAICT, yes. I saw a lot of positive responses when I sent an RFC to
mark z3fold as deprecated, but there were some opposing opinions as
well, which is why I did this simple analysis. I was hoping we can make
forward progress with that, but was disappointed it didn't get as much
attention as the deprecation RFC :)

> 
> And I'm trying to optimize the scalability of zsmalloc now, which is
> bad enough that zswap has to use 32 pools to work around it. (zram only
> uses one pool, so it should have the same scalability problem on big
> servers, and may need many zram block devices to work around it too.)

That's slightly orthogonal. Zsmalloc is not really showing worse
performance than other allocators, so this should be a separate effort.

> 
> But too many pools would cause more memory waste and more fragmentation,
> so the resulting compression ratio is not good enough.
> 
> As for the MMU dependence, can we actually avoid it? Maybe I missed something,
> but we could get the object's memory vecs from zsmalloc and pass them to
> decompression, which should support length(memory vecs) > 1?

IIUC the dependency on MMU is due to the use of kmap() APIs and the
fact that we may be using highmem pages. I think we may be able to work
around that dependency but I didn't look closely.  Hopefully Minchan or
Sergey could shed more light on this.

> 
> > 
> > [1]https://lore.kernel.org/linux-mm/20240112193103.3798287-1-yosryahmed@google.com/
> > 
> > In this analysis, for each of the allocators I ran a kernel build test
> > on tmpfs in a limited cgroup 5 times and captured:
> > (a) The build times.
> > (b) zswap_load() and zswap_store() latencies using bpftrace.
> > (c) The maximum size of the zswap pool from /proc/meminfo::Zswapped.
> 
> Here should use /proc/meminfo::Zswap, right?
> Zswap is the sum of pool pages size, Zswapped is the swapped/compressed pages.

Oh yes, it is /proc/meminfo::Zswap actually. I miswrote it in my email.
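
For anyone who wants to watch the two fields side by side, here is a rough
sketch of reading both out of /proc/meminfo-style text. It is only an
illustration, not the script used for the analysis: the helper names
(parse_meminfo, zswap_usage) and the sample numbers are made up. It assumes
Zswap is the compressed pool size and Zswapped is the original size of the
data stored in it, so Zswapped / Zswap approximates the compression ratio.

```python
def parse_meminfo(text):
    """Return {field: value} for lines shaped like 'Name:   123 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def zswap_usage(text):
    """Return (pool_kb, stored_kb, ratio) from meminfo-style text.

    pool_kb   -- Zswap: memory consumed by the compressed pool
    stored_kb -- Zswapped: original size of the pages stored in zswap
    ratio     -- stored_kb / pool_kb, or None when the pool is empty
    """
    info = parse_meminfo(text)
    pool_kb = info.get("Zswap", 0)
    stored_kb = info.get("Zswapped", 0)
    ratio = stored_kb / pool_kb if pool_kb else None
    return pool_kb, stored_kb, ratio


# Made-up sample values, for illustration only.
SAMPLE = """MemTotal:       16000000 kB
Zswap:              1024 kB
Zswapped:           4096 kB
"""

pool_kb, stored_kb, ratio = zswap_usage(SAMPLE)
print(pool_kb, stored_kb, ratio)
```

On a live system the text would come from reading /proc/meminfo, and
capturing a maximum pool size would mean polling this during the run and
keeping the largest Zswap value seen.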

Thanks!