Message-ID: <5D8BCC32-4932-4030-AE42-C0009D92A7CA@nvidia.com>
Date: Wed, 07 May 2025 11:12:27 -0400
From: Zi Yan <ziy@...dia.com>
To: Nhat Pham <nphamcs@...il.com>, Barry Song <21cnbao@...il.com>
Cc: Harry Yoo <harry.yoo@...cle.com>, Qun-Wei Lin <qun-wei.lin@...iatek.com>,
Andrew Morton <akpm@...ux-foundation.org>, Mike Rapoport <rppt@...nel.org>,
Matthias Brugger <matthias.bgg@...il.com>,
AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org, Casper Li <casper.li@...iatek.com>,
Chinwen Chang <chinwen.chang@...iatek.com>,
Andrew Yang <andrew.yang@...iatek.com>, James Hsu <james.hsu@...iatek.com>
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression

On 7 May 2025, at 11:00, Nhat Pham wrote:
> On Tue, May 6, 2025 at 7:04 PM Barry Song <21cnbao@...il.com> wrote:
>>
>> On Wed, May 7, 2025 at 1:50 PM Zi Yan <ziy@...dia.com> wrote:
>>>
>>> On 6 May 2025, at 21:12, Harry Yoo wrote:
>>>
>>>> On Wed, Apr 30, 2025 at 04:26:41PM +0800, Qun-Wei Lin wrote:
>>>>> This patch series introduces a new mechanism called kcompressd to
>>>>> improve the efficiency of memory reclaiming in the operating system.
>>>>>
>>>>> Problem:
>>>>> In the current system, the kswapd thread is responsible for both scanning
>>>>> the LRU pages and handling memory compression tasks (such as those
>>>>> involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
>>>>> to significant performance bottlenecks, especially under high memory
>>>>> pressure. The kswapd thread becomes a single point of contention, causing
>>>>> delays in memory reclaiming and overall system performance degradation.
>>>>>
>>>>> Solution:
>>>>> Introduced kcompressd to handle asynchronous compression during memory
>>>>> reclaim, improving efficiency by offloading compression tasks from
>>>>> kswapd. This allows kswapd to focus on its primary task of page reclaim
>>>>> without being burdened by the additional overhead of compression.
>>>>>
>>>>> In our handheld devices, we found that applying this mechanism under high
>>>>> memory pressure scenarios can increase the rate of pgsteal_anon per second
>>>>> by over 260% compared to the situation with only kswapd. Additionally, we
>>>>> observed a reduction of over 50% in page allocation stall occurrences,
>>>>> further demonstrating the effectiveness of kcompressd in alleviating memory
>>>>> pressure and improving system responsiveness.
>>>>>
>>>>> Co-developed-by: Barry Song <21cnbao@...il.com>
>>>>> Signed-off-by: Barry Song <21cnbao@...il.com>
>>>>> Signed-off-by: Qun-Wei Lin <qun-wei.lin@...iatek.com>
>>>>> Reference: Re: [PATCH 0/2] Improve Zram by separating compression context from kswapd - Barry Song
>>>>> https://lore.kernel.org/lkml/20250313093005.13998-1-21cnbao@gmail.com/
>>>>> ---
>>>>
>>>> +Cc Zi Yan, who might be interested in writing a framework (or improving
>>>> the existing one, padata) for parallelizing jobs (e.g. migration/compression)
>>>
>>> Thanks.
>>>
>>> I am currently looking into padata [1] to perform multithreaded page migration
>
> TIL about padata :)
>
>>> copy job. But based on this patch, it seems that kcompressd is just an additional
>>> kernel thread for executing zswap_store(). Is there any need for performing
>>> compression with multiple threads?
>>
>> The current focus is on enabling kswapd to perform asynchronous compression,
>> which can significantly reduce direct reclaim and allocstall events.
>> Therefore, the work begins with supporting a single thread. Supporting
>> multiple threads might be possible in the future, but it could be difficult
>> to control—especially on busy phones—since it consumes more power and may
>> interfere with other threads impacting user experience.
>
> Right, yeah.
>
>>
>>>
>>> BTW, I also noticed that the zswap IAA compress batching patchset [2] uses a
>>> hardware accelerator (Intel Analytics Accelerator) to speed up zswap.
>>> I wonder if handheld devices have similar hardware that would give a similar benefit.
>>
>> Usually, the answer is no. We use zRAM and CPU, but this patch aims to provide
>> a common capability that can be shared by both zRAM and zswap.
>>
>
> Also, not everyone and every setup has access to hardware compression
> accelerators :) This provides benefits for all users.

Got it. Thanks for the explanation.
--
Best Regards,
Yan, Zi