linux-kernel - Re: [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while pagein/out IO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKEwX=NJjDL3aW3hXioxh=yASSsHbDBWubV9cE2RiH+tSXpscw@mail.gmail.com>
Date: Mon, 8 Jul 2024 17:57:39 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Takero Funaki <flintglass@...il.com>
Cc: Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosryahmed@...gle.com>, 
	Chengming Zhou <chengming.zhou@...ux.dev>, Jonathan Corbet <corbet@....net>, 
	Andrew Morton <akpm@...ux-foundation.org>, 
	Domenico Cerasuolo <cerasuolodomenico@...il.com>, linux-mm@...ck.org, linux-doc@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while
 pagein/out IO

On Fri, Jul 5, 2024 at 7:25 PM Takero Funaki <flintglass@...il.com> wrote:
>
> To prevent the zswap global shrinker from writing back pages
> simultaneously with IO performed for memory reclaim and faults, delay
> the writeback when zswap_store() rejects pages or zswap_load() cannot
> find entry in pool.
>
> When the zswap shrinker is running and zswap rejects an incoming page,
> simulatenous zswap writeback and the rejected page lead to IO contention
> on swap device. In this case, the writeback of the rejected page must be
> higher priority as it is necessary for actual memory reclaim progress.
> The zswap global shrinker can run in the background and should not
> interfere with memory reclaim.

Hmm what about this scenario: when we disable zswap writeback on a
cgroup, if zswap_store() fails, we are delaying the global shrinker
for no gain essentially. There is no subsequent IO. I don't think we
are currently handling this, right?

>
> The same logic applies to zswap_load(). When zswap cannot find requested
> page from pool and read IO is performed, shrinker should be interrupted.
>

Yet another (less concerning IMHO) scenario is when a cgroup disables
zswap by setting zswap.max = 0 (for instance, if the sysadmin knows
that this cgroup's data are really cold, and/or that the workload is
latency-tolerant, and do not want it to take up valuable memory
resources of other cgroups). Every time this cgroup reclaims memory,
it would disable the global shrinker (including the new proactive
behavior) for other cgroup, correct? And, when they do need to swap
in, it would further delay the global shrinker. Would this break of
isolation be a problem?

There are other concerns I raised in the cover letter's response as
well - please take a look :)