Message-ID: <24e77aad-08ca-41c4-8e64-301fcc9370b1@konsulko.se>
Date: Tue, 8 Apr 2025 23:38:57 +0200
From: Vitaly Wool <vitaly.wool@...sulko.se>
To: Johannes Weiner <hannes@...xchg.org>, Igor Belousov <igor.b@...dev.am>
Cc: Nhat Pham <nphamcs@...il.com>, linux-mm@...ck.org,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
Shakeel Butt <shakeel.butt@...ux.dev>, Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [PATCH v2] mm: add zblock allocator
On 4/8/25 21:55, Johannes Weiner wrote:
> On Tue, Apr 08, 2025 at 01:20:11PM +0400, Igor Belousov wrote:
>>>>>> Now what's funny is that when I tried to compare how 32 threaded build
>>>>>> would behave on a 8-core VM I couldn't do it because it OOMs with
>>>>>> zsmalloc as zswap backend. With zblock it doesn't, though, and the
>>>>>> results are:
>>>>>> real 12m14.012s
>>>>>> user 39m37.777s
>>>>>> sys 14m6.923s
>>>>>> Zswap: 440148 kB
>>>>>> Zswapped: 924452 kB
>>>>>> zswpin 594812
>>>>>> zswpout 2802454
>>>>>> zswpwb 10878
>>>>
>>>> It's LZ4 for all the test runs.
>>>
>>> Can you try zstd and let me know how it goes :)
>>
>> Sure. zstd/8 cores/make -j32:
>>
>> zsmalloc:
>> real 7m36.413s
>> user 38m0.481s
>> sys 7m19.108s
>> Zswap: 211028 kB
>> Zswapped: 925904 kB
>> zswpin 397851
>> zswpout 1625707
>> zswpwb 5126
>>
>> zblock:
>> real 7m55.009s
>> user 39m23.147s
>> sys 7m44.004s
>> Zswap: 253068 kB
>> Zswapped: 919956 kB
>> zswpin 456843
>> zswpout 2058963
>> zswpwb 3921
>
> So zstd results in nearly double the compression ratio, which in turn
> cuts total execution time *almost in half*.
>
> The numbers speak for themselves. Compression efficiency >>> allocator
> speed, because compression efficiency ultimately drives the continuous
> *rate* at which allocations need to occur. You're trying to optimize a
> constant coefficient at the expense of a higher-order one, which is a
> losing proposition.
Well, not really. This is an isolated use case with
a. significant computing power under the hood
b. relatively few cores
c. relatively short test
d. 4K pages
If any of these doesn't hold, zblock comes out ahead:
!a => zstd is too slow
!b => parallelization has a bigger effect
!c => zsmalloc starts losing because it has to deal with internal
fragmentation
!d => zblock's compression efficiency is better.
Even !d alone makes zblock the better choice for ARM64-based servers.
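For reference, here is a rough sketch of how the effective compression ratio (Zswapped / Zswap) and the wall-clock time can be derived from the figures quoted above. The numbers are copied from the three runs in this thread (8 cores, make -j32); the names RUNS, to_seconds, ratio and summarize are made up for illustration and are not part of any tool. Note that this ratio includes allocator overhead, since Zswap is the memory the backend actually consumes.

```python
# Figures copied from the runs quoted in this thread (8 cores, make -j32).
# Tuple layout: (real time as "XmY.Zs", Zswap in kB, Zswapped in kB).
RUNS = {
    "lz4/zblock": ("12m14.012s", 440148, 924452),
    "zstd/zsmalloc": ("7m36.413s", 211028, 925904),
    "zstd/zblock": ("7m55.009s", 253068, 919956),
}


def to_seconds(real):
    """Convert a `time`-style string such as '7m36.413s' to seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def ratio(zswap_kb, zswapped_kb):
    """Effective ratio: original data stored / memory used to store it."""
    return zswapped_kb / zswap_kb


def summarize(runs):
    """Map each run name to (wall-clock seconds, effective ratio)."""
    return {
        name: (to_seconds(real), ratio(zswap, zswapped))
        for name, (real, zswap, zswapped) in runs.items()
    }


if __name__ == "__main__":
    for name, (secs, r) in summarize(RUNS).items():
        print(f"{name:14s} real={secs:8.3f}s ratio={r:.2f}")
```

With the figures above the ratios come out at roughly 2.1 for lz4/zblock, 4.4 for zstd/zsmalloc and 3.6 for zstd/zblock. These are single runs on one VM, so treat them as a data point, not as a benchmark.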
~Vitaly