linux-kernel - Re: [PATCHv2 1/7] zram: introduce compressed data writeback


Message-ID: <luzn25fgin43cnbmvmxwps7isqeq2pt5kfn26jqzly6hbnedlp@ojpw52ldzmuw>
Date: Thu, 8 Jan 2026 12:39:35 +0900
From: Sergey Senozhatsky <senozhatsky@...omium.org>
To: zhangdongdong <zhangdongdong925@...a.com>, 
	Jens Axboe <axboe@...nel.dk>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, 
	Andrew Morton <akpm@...ux-foundation.org>, Richard Chang <richardycc@...gle.com>, 
	Minchan Kim <minchan@...nel.org>, Brian Geffon <bgeffon@...gle.com>, 
	David Stevens <stevensd@...gle.com>, linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	linux-block@...r.kernel.org, Minchan Kim <minchan@...gle.com>
Subject: Re: [PATCHv2 1/7] zram: introduce compressed data writeback

Hi,

On (26/01/08 10:57), zhangdongdong wrote:
> > Do you use any strategies for writeback?  Compressed writeback
> > is supposed to be used for apps for which latency is not critical
> > or sensitive, because of on-demand decompression costs.
> > 
> 
> Hi Sergey,
> 
> Sorry for the delayed reply — I had some urgent matters come up and only
> got back to this now ;)

No worries, you replied within a perfectly reasonable time frame.

> Yes, we do use writeback strategies on our side. The current implementation
> focuses on batched writeback of compressed data from
> zram, managed on a per-app / per-memcg basis. We track and control how
> much data from each app is written back to the backing storage, with the
> same assumption you mentioned: compressed writeback is primarily
> intended for workloads where latency is not critical.
> 
> Accurate prefetching on swap-in is still an open problem for us. As you
> pointed out, both the I/O itself and on-demand decompression introduce
> additional latency on the readback path, and minimizing their impact
> remains challenging.
> 
> Regarding the workqueue choice: initially we used system_dfl_wq for the
> read/decompression path. Later, based on observed scheduling latency
> under memory pressure, we switched to a dedicated workqueue created with
> WQ_HIGHPRI | WQ_UNBOUND. This change helped reduce scheduling
> interference, but it also reinforced our concern that deferring
> decompression to a worker still adds an extra scheduling hop on the
> swap-in path.

How bad (and how frequent) is your memory pressure situation?  I just
wonder if your case is an outlier, so to speak.


Just thinking aloud:

I really don't see a path back to atomic zram read/write.  Those
were very painful and problematic, and I am not considering
re-introducing them, especially if the reason is an optional
feature (which comp-wb is).  If latency becomes unbearable and we
want to improve it, we need to find a way to do so without going
back to atomic read/write.  But at the same time, under memory
pressure everything becomes janky at some point, so I don't know
if comp-wb latency is the biggest problem in that case.

Dunno, *maybe* we can explore the possibility of grabbing both the
entry-lock and a per-CPU compression stream before we queue the async
bio, so that in the bio completion we already *sort of* have everything
we need.  However, that comes with a bunch of issues:

- the number of per-CPU compression streams is limited, naturally,
  to the number of CPUs.  So if we have a bunch of comp-wb reads we
  can block all other activities: normal zram reads/writes, which
  compete for the same per-CPU compression streams.

- this still puts atomicity requirements on the compressors.  I haven't
  looked into, for instance, zstd *de*-compression code, but I know for
  sure that zstd compression code allocates memory internally when
  configured to use pre-trained CD-dictionaries, effectively making zstd
  use GFP_ATOMIC allocations internally, if called from atomic context.
  Whether we have anything like that in decompression, I don't know.
  But in general we cannot be sure that all compressors work in atomic
  context in the same way as they do in non-atomic context.
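
To illustrate the first concern, here is a minimal userspace sketch.  It
is not zram code: the pool, NR_STREAMS and all function names are
hypothetical, and real per-CPU streams are tied to a CPU rather than
handed out from a shared array.  It only models how in-flight comp-wb
reads that pin a stream until completion can leave none for anyone else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_STREAMS 4	/* hypothetical: stands in for the number of CPUs */

struct stream_pool {
	bool busy[NR_STREAMS];
};

/* Try to take a free stream; returns its index, or -1 if all are held. */
static int stream_try_get(struct stream_pool *p)
{
	for (size_t i = 0; i < NR_STREAMS; i++) {
		if (!p->busy[i]) {
			p->busy[i] = true;
			return (int)i;
		}
	}
	return -1;
}

/* Release a stream that was previously taken. */
static void stream_put(struct stream_pool *p, int idx)
{
	assert(idx >= 0 && idx < NR_STREAMS && p->busy[idx]);
	p->busy[idx] = false;
}

/*
 * Model "grab the stream before queueing the async read": every queued
 * comp-wb read pins one stream until its completion runs.  Returns how
 * many of nr_reads could be queued; held[] must have NR_STREAMS slots.
 */
static int queue_wb_reads(struct stream_pool *p, int nr_reads, int *held)
{
	int queued = 0;

	for (int i = 0; i < nr_reads; i++) {
		int idx = stream_try_get(p);

		if (idx < 0)
			break;	/* pool exhausted, remaining reads must wait */
		held[queued++] = idx;
	}
	return queued;
}
```

With NR_STREAMS set to 4, queueing six reads pins all four streams, and
a subsequent stream_try_get() (think: a normal zram read or write) fails
until one of the reads completes and calls stream_put().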

I don't know if solving it on the zram side alone is possible.  Maybe
we can get some help from the block layer: some sort of two-stage bio
submission.  First stage: submit chained bio-s, second stage: iterate
over all submitted and completed bio-s and decompress the data.  Again,
just thinking out loud.
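
A rough userspace sketch of the two-stage idea, under heavy assumptions:
there is no block layer here, struct wb_read and every function name are
hypothetical, and the "compressor" is a toy run-length decoder.  The
only point is the split: stage one does nothing but move data and mark
completion, stage two walks the completed requests and decompresses
where sleeping would be allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16	/* toy page size, for illustration only */

struct wb_read {
	const unsigned char *src;	/* compressed data on the "device" */
	size_t src_len;
	unsigned char raw[PAGE_SZ * 2];	/* filled by the simulated I/O */
	size_t raw_len;
	unsigned char page[PAGE_SZ];	/* decompressed result */
	bool io_done;
};

/* Toy decoder: input is (count, byte) pairs.  Returns bytes written or -1. */
static int toy_decompress(const unsigned char *in, size_t in_len,
			  unsigned char *out, size_t out_cap)
{
	size_t pos = 0;

	if (in_len % 2)
		return -1;
	for (size_t i = 0; i < in_len; i += 2) {
		size_t count = in[i];

		if (count > out_cap - pos)
			return -1;
		memset(out + pos, in[i + 1], count);
		pos += count;
	}
	return (int)pos;
}

/* Stage 1: "submit" every read; completion only copies and sets a flag. */
static int stage1_submit(struct wb_read *reqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (reqs[i].src_len > sizeof(reqs[i].raw))
			return -1;
		memcpy(reqs[i].raw, reqs[i].src, reqs[i].src_len);
		reqs[i].raw_len = reqs[i].src_len;
		reqs[i].io_done = true;
	}
	return 0;
}

/* Stage 2: decompress completed reads.  Returns pages produced or -1. */
static int stage2_decompress(struct wb_read *reqs, size_t n)
{
	int done = 0;

	for (size_t i = 0; i < n; i++) {
		if (!reqs[i].io_done)
			continue;
		assert(reqs[i].raw_len <= sizeof(reqs[i].raw));
		if (toy_decompress(reqs[i].raw, reqs[i].raw_len,
				   reqs[i].page, PAGE_SZ) != PAGE_SZ)
			return -1;
		done++;
	}
	return done;
}
```

In this model the completion path never touches a compression stream,
so nothing has to be pinned while the I/O is in flight.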