Message-ID: <CAJD7tkbRF6od-2x_L8-A1QL3=2Ww13sCj4S3i4bNndqF+3+_Vg@mail.gmail.com>
Date: Thu, 8 Feb 2024 19:27:04 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Vitaly Wool <vitaly.wool@...sulko.com>, Miaohe Lin <linmiaohe@...wei.com>, 
	Johannes Weiner <hannes@...xchg.org>, Nhat Pham <nphamcs@...il.com>, Linux-MM <linux-mm@...ck.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Christoph Hellwig <hch@...radead.org>, 
	Sergey Senozhatsky <senozhatsky@...omium.org>, Minchan Kim <minchan@...nel.org>, 
	Chris Down <chris@...isdown.name>, Seth Jennings <sjenning@...hat.com>, 
	Dan Streetman <ddstreet@...e.org>, Chris Li <chrisl@...nel.org>
Subject: [RFC] Analyzing zpool allocators / Removing zbud and z3fold

Hey folks,

This is a follow-up to my previously sent RFC patch to deprecate
z3fold [1]. This is an RFC without code; I wanted to get some
discussion going before writing (or rather deleting) more code. I went
back and did some analysis of the 3 zpool allocators: zbud, zsmalloc,
and z3fold.

[1] https://lore.kernel.org/linux-mm/20240112193103.3798287-1-yosryahmed@google.com/

In this analysis, for each allocator I ran a kernel build test on
tmpfs in a limited cgroup 5 times and captured:
(a) The build times.
(b) zswap_load() and zswap_store() latencies using bpftrace.
(c) The maximum size of the zswap pool from /proc/meminfo::Zswapped.
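For (c), a minimal sketch of how a peak value could be sampled from /proc/meminfo during a run. This is not the script used for the numbers in this mail; the helper names, the polling interval, and the sample count are assumptions made for illustration.

```python
import re
import time


def read_meminfo_kb(text, field):
    """Return the value of `field` (in kB) from /proc/meminfo-style text, or None."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB$"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def track_peak_bytes(field="Zswapped", interval_s=1.0, samples=10,
                     path="/proc/meminfo"):
    """Poll `path` and return the largest value seen for `field`, in bytes.

    Hypothetical helper: interval and sample count are arbitrary defaults.
    """
    peak = 0
    for _ in range(samples):
        with open(path) as f:
            value_kb = read_meminfo_kb(f.read(), field)
        if value_kb is not None:
            peak = max(peak, value_kb * 1024)
        time.sleep(interval_s)
    return peak
```

Polling can miss short-lived peaks between samples, so a shorter interval trades overhead for accuracy.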

Here are the results I have. I am using zsmalloc as the base for all
comparisons.

-------------------------------- <Results> --------------------------------

(a) Build times (in seconds)

*** zsmalloc ***
──────────────────────────────────────────────────────────────
 LABEL   │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
─────────┼──────────┼──────────┼──────────┼──────────┼────────
 real    │ 108.890  │ 116.160  │ 111.304  │ 110.310  │ 2.719
 sys     │ 6838.860 │ 7137.830 │ 6936.414 │ 6862.160 │ 114.860
 user    │ 2838.270 │ 2859.050 │ 2850.116 │ 2852.590 │ 7.388
──────────────────────────────────────────────────────────────

*** zbud ***
──────────────────────────────────────────────────────────────
 LABEL   │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
─────────┼──────────┼──────────┼──────────┼──────────┼────────
 real    │ 105.540  │ 114.430  │ 108.738  │ 108.140  │ 3.027
 sys     │ 6553.680 │ 6794.330 │ 6688.184 │ 6661.840 │ 86.471
 user    │ 2836.390 │ 2847.850 │ 2842.952 │ 2843.450 │ 3.721
──────────────────────────────────────────────────────────────

*** z3fold ***
──────────────────────────────────────────────────────────────
 LABEL   │ MIN      │ MAX      │ MEAN     │ MEDIAN   │ STDDEV
─────────┼──────────┼──────────┼──────────┼──────────┼────────
 real    │ 113.020  │ 118.110  │ 114.642  │ 114.010  │ 1.803
 sys     │ 7168.860 │ 7284.900 │ 7243.930 │ 7265.290 │ 42.254
 user    │ 2865.630 │ 2869.840 │ 2868.208 │ 2868.710 │ 1.625
──────────────────────────────────────────────────────────────

Comparing the mean real times against zsmalloc, zbud is 2.3% faster,
and z3fold is 3% slower.
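The comparison of means can be reproduced with a few lines. This is a sketch using the mean "real" values copied from the tables above (assumed to be in seconds); the function names are illustrative only.

```python
def percent_change(value, baseline):
    """Signed percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# Mean "real" build times copied from the tables above (assumed seconds).
MEAN_REAL = {"zsmalloc": 111.304, "zbud": 108.738, "z3fold": 114.642}


def compare_to_baseline(means, baseline_name="zsmalloc"):
    """Return {allocator: percent change vs. the baseline allocator}."""
    base = means[baseline_name]
    return {
        name: percent_change(value, base)
        for name, value in means.items()
        if name != baseline_name
    }


if __name__ == "__main__":
    for name, pct in compare_to_baseline(MEAN_REAL).items():
        # Negative means faster than zsmalloc, positive means slower.
        print(f"{name}: {pct:+.1f}% vs zsmalloc")
```

With only 5 runs per allocator and standard deviations of 2 to 3 seconds, differences of this size are close to the run-to-run noise.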

(b) zswap_load() and zswap_store() latencies

*** zsmalloc ***

@load_ns:
[128, 256)           377 |                                                    |
[256, 512)           772 |                                                    |
[512, 1K)            923 |                                                    |
[1K, 2K)           22141 |                                                    |
[2K, 4K)           88297 |                                                    |
[4K, 8K)         1685833 |@@@@@                                               |
[8K, 16K)       17087712 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K)      10875077 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   |
[32K, 64K)        777656 |@@                                                  |
[64K, 128K)       127239 |                                                    |
[128K, 256K)       50301 |                                                    |
[256K, 512K)        1669 |                                                    |
[512K, 1M)            37 |                                                    |
[1M, 2M)               3 |                                                    |

@store_ns:
[512, 1K)            279 |                                                    |
[1K, 2K)           15969 |                                                    |
[2K, 4K)          193446 |                                                    |
[4K, 8K)          823283 |                                                    |
[8K, 16K)       14209844 |@@@@@@@@@@@                                         |
[16K, 32K)      62040863 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32K, 64K)       9737713 |@@@@@@@@                                            |
[64K, 128K)      1278302 |@                                                   |
[128K, 256K)      487285 |                                                    |
[256K, 512K)        4406 |                                                    |
[512K, 1M)           117 |                                                    |
[1M, 2M)              24 |                                                    |

*** zbud ***

@load_ns:
[128, 256)           452 |                                                    |
[256, 512)           834 |                                                    |
[512, 1K)            998 |                                                    |
[1K, 2K)           22708 |                                                    |
[2K, 4K)          171247 |                                                    |
[4K, 8K)         2853227 |@@@@@@@@                                            |
[8K, 16K)       17727445 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K)       9523050 |@@@@@@@@@@@@@@@@@@@@@@@@@@@                         |
[32K, 64K)        752423 |@@                                                  |
[64K, 128K)       135560 |                                                    |
[128K, 256K)       52360 |                                                    |
[256K, 512K)        4071 |                                                    |
[512K, 1M)            57 |                                                    |

@store_ns:
[512, 1K)            518 |                                                    |
[1K, 2K)           13337 |                                                    |
[2K, 4K)          193043 |                                                    |
[4K, 8K)          846118 |                                                    |
[8K, 16K)       15240682 |@@@@@@@@@@@@@                                       |
[16K, 32K)      60945786 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32K, 64K)      10230719 |@@@@@@@@                                            |
[64K, 128K)      1612647 |@                                                   |
[128K, 256K)      498344 |                                                    |
[256K, 512K)        8550 |                                                    |
[512K, 1M)           199 |                                                    |
[1M, 2M)               1 |                                                    |

*** z3fold ***

@load_ns:
[128, 256)           344 |                                                    |
[256, 512)           999 |                                                    |
[512, 1K)            859 |                                                    |
[1K, 2K)           21069 |                                                    |
[2K, 4K)           53704 |                                                    |
[4K, 8K)         1351571 |@@@@                                                |
[8K, 16K)       14142680 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K)      11788684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         |
[32K, 64K)       1133377 |@@@@                                                |
[64K, 128K)       121670 |                                                    |
[128K, 256K)       68663 |                                                    |
[256K, 512K)         120 |                                                    |
[512K, 1M)            21 |                                                    |

@store_ns:
[512, 1K)            257 |                                                    |
[1K, 2K)           10162 |                                                    |
[2K, 4K)          149599 |                                                    |
[4K, 8K)          648121 |                                                    |
[8K, 16K)        9115497 |@@@@@@@@                                            |
[16K, 32K)      56467456 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32K, 64K)      16235236 |@@@@@@@@@@@@@@                                      |
[64K, 128K)      1397437 |@                                                   |
[128K, 256K)      705916 |                                                    |
[256K, 512K)        3087 |                                                    |
[512K, 1M)            62 |                                                    |
[1M, 2M)               1 |                                                    |

I did not perform any sophisticated analysis on these histograms, but
eyeballing them makes it clear that all allocators have somewhat
similar latencies. zbud is slightly better than zsmalloc, and z3fold
is slightly worse than zsmalloc. This corresponds naturally to the
build times in (a).
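One simple way to put a number on the eyeballing is to compute the fraction of samples that fall in the 16K ns bucket or higher. This is only a sketch: the 16K threshold is an arbitrary choice, only the @load_ns data is included, and K is taken as 1024 on the assumption that the bpftrace buckets are powers of two.

```python
K = 1024  # assumption: bpftrace hist() buckets are powers of two

# (bucket lower bound in ns, count) pairs copied from the @load_ns histograms.
LOAD_NS = {
    "zsmalloc": [
        (128, 377), (256, 772), (512, 923), (1 * K, 22141), (2 * K, 88297),
        (4 * K, 1685833), (8 * K, 17087712), (16 * K, 10875077),
        (32 * K, 777656), (64 * K, 127239), (128 * K, 50301),
        (256 * K, 1669), (512 * K, 37), (1024 * K, 3),
    ],
    "zbud": [
        (128, 452), (256, 834), (512, 998), (1 * K, 22708), (2 * K, 171247),
        (4 * K, 2853227), (8 * K, 17727445), (16 * K, 9523050),
        (32 * K, 752423), (64 * K, 135560), (128 * K, 52360),
        (256 * K, 4071), (512 * K, 57),
    ],
    "z3fold": [
        (128, 344), (256, 999), (512, 859), (1 * K, 21069), (2 * K, 53704),
        (4 * K, 1351571), (8 * K, 14142680), (16 * K, 11788684),
        (32 * K, 1133377), (64 * K, 121670), (128 * K, 68663),
        (256 * K, 120), (512 * K, 21),
    ],
}


def fraction_at_or_above(hist, threshold_ns):
    """Fraction of samples in buckets whose lower bound is >= threshold_ns."""
    total = sum(count for _, count in hist)
    slow = sum(count for lower, count in hist if lower >= threshold_ns)
    return slow / total


if __name__ == "__main__":
    for name, hist in LOAD_NS.items():
        frac = fraction_at_or_above(hist, 16 * K)
        print(f"{name}: {frac:.1%} of loads took >= 16K ns")
```

For the load data this gives roughly 38.5% for zsmalloc, 33.5% for zbud, and 45.7% for z3fold, which matches the ordering described above.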

(c) Maximum size of the zswap pool

*** zsmalloc ***
1,137,659,904 bytes = ~1.14G

*** zbud ***
1,535,741,952 bytes = ~1.5G

*** z3fold ***
1,151,303,680 bytes = ~1.15G

Relative to zsmalloc, zbud consumes ~35% more memory, and z3fold
consumes ~1.2% more. This makes sense because zbud stores at most two
compressed pages in each order-0 page, regardless of the compression
ratio, so it is bound to consume more memory.
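A sketch of the overhead arithmetic, working from the exact byte counts above (coarsely rounded gigabyte figures give noticeably different percentages). It also expresses the zbud packing limit as a lower bound on pool size. The function names and the 4096-byte page size are assumptions for illustration.

```python
import math

# Peak pool sizes in bytes, copied from the results above.
POOL_BYTES = {
    "zsmalloc": 1_137_659_904,
    "zbud": 1_535_741_952,
    "z3fold": 1_151_303_680,
}


def overhead_percent(pool_bytes, baseline_bytes):
    """Extra memory used relative to the baseline, in percent."""
    return (pool_bytes / baseline_bytes - 1.0) * 100.0


def zbud_min_pool_bytes(n_compressed_pages, page_size=4096):
    """Lower bound on zbud pool size: at most two compressed pages per page.

    page_size=4096 is an assumption; the bound holds for any compression ratio.
    """
    return math.ceil(n_compressed_pages / 2) * page_size


if __name__ == "__main__":
    base = POOL_BYTES["zsmalloc"]
    for name in ("zbud", "z3fold"):
        pct = overhead_percent(POOL_BYTES[name], base)
        print(f"{name}: {pct:+.1f}% vs zsmalloc")
```

The lower bound shows why zbud cannot benefit from compression ratios better than 2:1, while zsmalloc and z3fold can pack more objects per page.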

-------------------------------- </Results> --------------------------------

According to those results, zsmalloc is superior to z3fold in both
memory efficiency and latency. zbud has a small latency advantage, but
it comes at a huge cost in memory consumption.
Moreover, most known users of zswap are currently using
zsmalloc. Perhaps some folks are using zbud because it was the default
allocator up until recently. The only known disadvantage of zsmalloc
is the dependency on MMU.

Based on that, I think it doesn't make sense to keep all 3 allocators
going forward. I believe we should start by removing either zbud or
z3fold, leaving only one allocator that supports !MMU. Once zsmalloc
supports !MMU (if possible), we can keep zsmalloc as the only
allocator.

Thoughts and feedback are highly appreciated. I tried to CC all the
interested folks, but others should feel free to chime in.
