Message-ID: <CAJD7tkZXS-UJVAFfvxJ0nNgTzWBiqepPYA4hEozi01_qktkitg@mail.gmail.com>
Date: Fri, 15 Dec 2023 13:21:57 -0800
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Nhat Pham <nphamcs@...il.com>
Cc: akpm@...ux-foundation.org, tj@...nel.org, lizefan.x@...edance.com, 
	hannes@...xchg.org, cerasuolodomenico@...il.com, sjenning@...hat.com, 
	ddstreet@...e.org, vitaly.wool@...sulko.com, mhocko@...nel.org, 
	roman.gushchin@...ux.dev, shakeelb@...gle.com, muchun.song@...ux.dev, 
	hughd@...gle.com, corbet@....net, konrad.wilk@...cle.com, 
	senozhatsky@...omium.org, rppt@...nel.org, linux-mm@...ck.org, 
	kernel-team@...a.com, linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, 
	david@...t.cz, chrisl@...nel.org
Subject: Re: [PATCH v6] zswap: memcontrol: implement zswap writeback disabling

On Thu, Dec 7, 2023 at 11:24 AM Nhat Pham <nphamcs@...il.com> wrote:
>
> During our experiment with zswap, we sometimes observe swap IOs due to
> occasional zswap store failures and writebacks-to-swap. These swapping
> IOs prevent many users who cannot tolerate swapping from adopting zswap
> to save memory and improve performance where possible.
>
> This patch adds the option to disable this behavior entirely: do not
> writeback to backing swapping device when a zswap store attempt fail,
> and do not write pages in the zswap pool back to the backing swap
> device (both when the pool is full, and when the new zswap shrinker is
> called).
>
> This new behavior can be opted-in/out on a per-cgroup basis via a new
> cgroup file. By default, writebacks to swap device is enabled, which is
> the previous behavior. Initially, writeback is enabled for the root
> cgroup, and a newly created cgroup will inherit the current setting of
> its parent.
>
> Note that this is subtly different from setting memory.swap.max to 0, as
> it still allows for pages to be stored in the zswap pool (which itself
> consumes swap space in its current form).
>
> This patch should be applied on top of the zswap shrinker series:
>
> https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@gmail.com/
>
> as it also disables the zswap shrinker, a major source of zswap
> writebacks.
>
> Suggested-by: Johannes Weiner <hannes@...xchg.org>
> Signed-off-by: Nhat Pham <nphamcs@...il.com>
> Reviewed-by: Yosry Ahmed <yosryahmed@...gle.com>

Taking a step back from all the memory.swap.tiers vs.
memory.zswap.writeback discussions, I think there may be a more
fundamental problem here. If the zswap store failure is recurrent,
pages can keep going back to the LRUs and then be sent back to zswap
eventually, only to be rejected again. For example, this can happen if
zswap is above the acceptance threshold, but it could be even worse if
the allocator is rejecting the page because it does not compress well
enough. In the latter case, the page can keep going back and forth
between zswap and the LRUs indefinitely.

You probably did not run into this as you're using zsmalloc, but it
can happen with zbud AFAICT. Even with zsmalloc, a less problematic
version can happen if zswap is above its acceptance threshold.

This can cause thrashing and ineffective reclaim. We have an internal
implementation where we mark incompressible pages and put them on the
unevictable LRU when we don't have a backing swapfile (i.e. ghost
swapfiles), and something similar may work if writeback is disabled.
We need to scan such incompressible pages periodically though to
remove them from the unevictable LRU if they have been dirtied.
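
To make the difference concrete, here is a toy user-space model of the
behavior described above. It is only a sketch, not kernel code: the
Page class, the "compressible" flag, and the functions reclaim_pass(),
rescan_unevictable() and simulate() are all hypothetical names made up
for illustration, and a zswap store failure is modeled simply as "the
page is not compressible". It counts how many store attempts fail over
several reclaim passes, with and without parking rejected pages on an
unevictable list.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int
    compressible: bool      # assumption: store fails iff this is False
    dirtied: bool = False   # set if the page is written while parked


def reclaim_pass(lru, unevictable, park_rejected):
    """Run one reclaim pass over the LRU; return the number of failed stores."""
    rejected = 0
    survivors = deque()
    while lru:
        page = lru.popleft()
        if page.compressible:
            continue  # store succeeded, the page is reclaimed
        rejected += 1
        if park_rejected:
            page.dirtied = False
            unevictable.append(page)  # stop retrying until it changes
        else:
            survivors.append(page)    # bounces straight back to the LRU
    lru.extend(survivors)
    return rejected


def rescan_unevictable(lru, unevictable):
    """Move parked pages that were dirtied back to the LRU for a retry."""
    still_parked = []
    for page in unevictable:
        if page.dirtied:
            lru.append(page)
        else:
            still_parked.append(page)
    unevictable[:] = still_parked


def simulate(passes, park_rejected):
    # 16 pages, every fourth one incompressible (4 in total).
    lru = deque(Page(pfn, compressible=(pfn % 4 != 0)) for pfn in range(16))
    unevictable = []
    failed_stores = 0
    for _ in range(passes):
        failed_stores += reclaim_pass(lru, unevictable, park_rejected)
        rescan_unevictable(lru, unevictable)
    return failed_stores
```

In this model, simulate(5, park_rejected=False) returns 20, because the
same 4 incompressible pages are rejected on every pass, while
simulate(5, park_rejected=True) returns 4, because each page is rejected
once and then skipped until something dirties it.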
