lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240328193853.GG7597@cmpxchg.org>
Date: Thu, 28 Mar 2024 15:38:53 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Yosry Ahmed <yosryahmed@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Nhat Pham <nphamcs@...il.com>,
	Chengming Zhou <chengming.zhou@...ux.dev>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 7/9] mm: zswap: store zero-filled pages without a
 zswap_entry

On Mon, Mar 25, 2024 at 11:50:15PM +0000, Yosry Ahmed wrote:
> After the rbtree to xarray conversion, and dropping zswap_entry.refcount
> and zswap_entry.value, the only members of zswap_entry utilized by
> zero-filled pages are zswap_entry.length (always 0) and
> zswap_entry.objcg. Store the objcg pointer directly in the xarray as a
> tagged pointer and avoid allocating a zswap_entry completely for
> zero-filled pages.
> 
> This simplifies the code as we no longer need to special case
> zero-length cases. We are also able to further separate the zero-filled
> pages handling logic and completely isolate them within store/load
> helpers.  Handling tagged xarray pointers is handled in these two
> helpers, as well as the newly introduced helper for freeing tree
> elements, zswap_tree_free_element().
> 
> There is also a small performance improvement observed over 50 runs of
> kernel build test (kernbench) comparing the mean build time on a skylake
> machine when building the kernel in a cgroup v1 container with a 3G
> limit. This is on top of the improvement from dropping support for
> non-zero same-filled pages:
> 
> 		base            patched         % diff
> real            69.915          69.757		-0.229%
> user            2956.147        2955.244	-0.031%
> sys             2594.718        2575.747	-0.731%
> 
> This probably comes from avoiding the zswap_entry allocation and
> cleanup/freeing for zero-filled pages. Note that the percentage of
> zero-filled pages during this test was only around 1.5% on average.
> Practical workloads could have a larger proportion of such pages (e.g.
> Johannes observed around 10% [1]), so the performance improvement should
> be larger.
> 
> This change also saves a small amount of memory due to less allocated
> zswap_entry's. In the kernel build test above, we save around 2M of
> slab usage when we swap out 3G to zswap.
> 
> [1]https://lore.kernel.org/linux-mm/20240320210716.GH294822@cmpxchg.org/
> 
> Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
> ---
>  mm/zswap.c | 137 ++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 78 insertions(+), 59 deletions(-)

Tbh, I think this makes the code worse overall. Instead of increasing
type safety, it actually reduces it, and places that previously dealt
with a struct zswap_entry now deal with a void *.

If we go ahead with this series, I think it makes more sense to either

a) do the explicit subtyping of struct zswap_entry I had proposed, or

b) at least keep the specialness handling of the xarray entry as local
   as possible, so that instead of a proliferating API that operates
   on void *, you have explicit filtering only where the tree is
   accessed, and then work on struct zswap_entry as much as possible.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ