Message-ID: <a527b179-263f-40ad-9d7c-bfa86731bfde@sina.com>
Date: Thu, 8 Jan 2026 10:57:19 +0800
From: zhangdongdong <zhangdongdong925@...a.com>
To: Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Richard Chang <richardycc@...gle.com>, Minchan Kim <minchan@...nel.org>,
Brian Geffon <bgeffon@...gle.com>, David Stevens <stevensd@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-block@...r.kernel.org, Minchan Kim <minchan@...gle.com>
Subject: Re: [PATCHv2 1/7] zram: introduce compressed data writeback
On 1/7/26 18:14, Sergey Senozhatsky wrote:
> On (26/01/07 15:28), zhangdongdong wrote:
>> Hi, Sergey
>>
>> Yes, we have tried high priority workqueues. In fact, our current
>> implementation already uses a dedicated workqueue created with
>> WQ_HIGHPRI and marked as UNBOUND, which handles the read/decompression
>> path for swap-in.
>>
>> Below is a simplified snippet of the queue we are currently using:
>>
>> zgroup_read_wq = alloc_workqueue("zgroup_read",
>>                                  WQ_HIGHPRI | WQ_UNBOUND, 0);
>>
>> static int zgroup_submit_zio_async(struct zgroup_io *zio,
>>                                    struct zram_group *zgroup)
>> {
>>         struct zgroup_req req = {
>>                 .zio = zio,
>>         };
>>
>
> zgroup... That certainly looks like a lot of downstream code ;)
>
> Do you use any strategies for writeback? Compressed writeback
> is supposed to be used for apps for which latency is not critical
> or sensitive, because of on-demand decompression costs.
>
Hi Sergey,

Sorry for the delayed reply. I had some urgent matters come up and only
got back to this now ;)

Yes, we do use writeback strategies on our side. The current
implementation focuses on batched writeback of compressed data from
zram, managed on a per-app / per-memcg basis. We track and control how
much data from each app is written back to the backing storage, with the
same assumption you mentioned: compressed writeback is primarily
intended for workloads where latency is not critical.
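
For illustration only, per-app accounting of this kind could be sketched
in plain userspace C as below. This is not the downstream implementation
and not zram code: struct wb_group, wb_group_charge() and
wb_group_uncharge() are hypothetical names, and the policy (a fixed byte
limit per group, no writeback for latency-sensitive groups) is an
assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-app (per-memcg) writeback accounting; not zram code. */
struct wb_group {
	size_t limit_bytes;     /* max compressed bytes allowed on backing storage */
	size_t written_bytes;   /* compressed bytes currently written back */
	bool latency_sensitive; /* assumed flag: never write this group back */
};

/*
 * Ask to write back 'want' compressed bytes for this group in one batch.
 * Returns how many bytes were granted (possibly 0) and charges them.
 */
size_t wb_group_charge(struct wb_group *g, size_t want)
{
	size_t room;

	assert(g != NULL);
	if (g->latency_sensitive)
		return 0;

	room = g->limit_bytes > g->written_bytes ?
	       g->limit_bytes - g->written_bytes : 0;
	if (want > room)
		want = room;

	g->written_bytes += want;
	return want;
}

/* Give budget back once data has been read back (swapped in) again. */
void wb_group_uncharge(struct wb_group *g, size_t bytes)
{
	assert(g != NULL);
	g->written_bytes -= bytes > g->written_bytes ? g->written_bytes : bytes;
}
```

A batch writer would call wb_group_charge() before submitting a batch
and write back only the granted amount, so a single app cannot push an
unbounded amount of compressed data to the backing device.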

Accurate prefetching on swap-in is still an open problem for us. As you
pointed out, both the I/O itself and on-demand decompression introduce
additional latency on the readback path, and minimizing their impact
remains challenging.

Regarding the workqueue choice: initially we used system_dfl_wq for the
read/decompression path. Later, based on observed scheduling latency
under memory pressure, we switched to a dedicated workqueue created with
WQ_HIGHPRI | WQ_UNBOUND. This change helped reduce scheduling
interference, but it also reinforced our concern that deferring
decompression to a worker still adds an extra scheduling hop on the
swap-in path.
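
The extra hop can be shown with a userspace analogy. The sketch below
uses POSIX threads, not the kernel workqueue API, and every name in it
(fake_decompress(), struct read_req, read_inline(), read_deferred()) is
hypothetical. It only contrasts the two shapes of the read path: in the
inline variant the caller decompresses in its own context, while in the
deferred variant the caller must wait for a worker to be scheduled and
then be woken up itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for decompressing one page; here just a copy. */
void fake_decompress(const char *src, char *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* One deferred request, loosely analogous to a work item plus a completion. */
struct read_req {
	const char *src;
	char *dst;
	size_t len;
	bool done;
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

/* Worker side: runs only after the scheduler has picked this thread. */
void *read_worker(void *arg)
{
	struct read_req *req = arg;

	assert(req != NULL);
	fake_decompress(req->src, req->dst, req->len);

	pthread_mutex_lock(&req->lock);
	req->done = true;
	pthread_cond_signal(&req->cond);
	pthread_mutex_unlock(&req->lock);
	return NULL;
}

/* Inline path: no hand-off, the caller does the work itself. */
int read_inline(const char *src, char *dst, size_t len)
{
	fake_decompress(src, dst, len);
	return 0;
}

/* Deferred path: hand-off, worker wakeup, then wakeup of the waiter. */
int read_deferred(const char *src, char *dst, size_t len)
{
	struct read_req req = { .src = src, .dst = dst, .len = len, .done = false };
	pthread_t tid;
	int err;

	pthread_mutex_init(&req.lock, NULL);
	pthread_cond_init(&req.cond, NULL);

	err = pthread_create(&tid, NULL, read_worker, &req);
	if (!err) {
		pthread_mutex_lock(&req.lock);
		while (!req.done)
			pthread_cond_wait(&req.cond, &req.lock);
		pthread_mutex_unlock(&req.lock);
		pthread_join(tid, NULL);
	}

	pthread_cond_destroy(&req.cond);
	pthread_mutex_destroy(&req.lock);
	return err;
}
```

Both functions produce the same data. The difference is that
read_deferred() adds two scheduler-dependent wakeups on the critical
path, which is where delay can accumulate when the system is busy.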

Best regards,
dongdong