Message-ID: <20250416100634.GB13877@suse.cz>
Date: Wed, 16 Apr 2025 12:06:34 +0200
From: David Sterba <dsterba@...e.cz>
To: Integral <integral@...hlinuxcn.org>
Cc: Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: Maybe we can set default zstd compression level to 1 when SSD
detected?
On Sun, Apr 13, 2025 at 12:07:26PM +0800, Integral wrote:
> Hi,
>
> When SSD is detected, maybe we can set default zstd compression level to 1.
>
> Current default compression level for zstd is 3, which is not optimal
> for SSDs.
>
> This GitHub Gist [1] can serve as a reference.
Well, while the linked gist is thorough, I don't see zstd:1 clearly
winning against zstd:3. Compression brings overhead (more extents, CPU
cost), so the preferred criterion should be space savings. The read and
write runtimes seem to be roughly the same.
I haven't found any description or classification of the input data
(other than known to be incompressible). This is an important factor.
> An example is Fedora Workstation [2], which uses `zstd:1` as default
> compression option.
>
> [1] Link:
> https://gist.github.com/braindevices/fde49c6a8f6b9aaf563fb977562aafec
>
> [2] Link: https://fedoraproject.org/wiki/Changes/BtrfsTransparentCompression
Unfortunately the Fedora evaluation disqualifies itself because it uses
/dev/urandom (practically incompressible) and /dev/zero (trivially
compressible). I would not select the default based on that benchmark
for the whole distro; it's IMHO flawed or incomplete at best.
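
To illustrate why these two inputs say little about the choice of level,
here is a minimal sketch. It uses Python's zlib only as a stand-in
compressor (an assumption made so the example runs with the standard
library; it is not zstd and not the btrfs code path), and the helper
name compression_ratio is made up for the illustration.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


SIZE = 1 << 20  # 1 MiB per sample

# Stand-ins for the two inputs used in the Fedora evaluation.
samples = {
    "random (like /dev/urandom)": os.urandom(SIZE),
    "zeros (like /dev/zero)": bytes(SIZE),
}

for name, data in samples.items():
    for level in (1, 3):
        ratio = compression_ratio(data, level)
        print(f"{name:28s} level {level}: ratio {ratio:.4f}")
```

Random data should stay at a ratio of about 1 and zeros should shrink to
almost nothing at either level, so neither input can separate a low
level from a higher one.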
For the evaluation I recommend some commonly found data types, chosen by
their compressibility, like we did for the recent fast zstd levels
https://lore.kernel.org/linux-btrfs/20250128132235.1356769-1-neelx@suse.com/ :
binaries, documentation, enwik9 (commonly used for compression
benchmarks), and Linux sources.
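
A per-class comparison along these lines could be sketched as follows.
This is only a userspace approximation under stated assumptions: the
function name evaluate and the demo inputs are hypothetical, zlib is the
default compressor purely so the sketch runs with the standard library,
and a zstd binding would have to be passed in as the compress argument
to measure zstd itself. It does not capture filesystem effects such as
the number of extents.

```python
import time
import zlib
from typing import Callable, Dict, Iterable, List, Tuple

# (data class, level, compressed/original size, seconds spent compressing)
Row = Tuple[str, int, float, float]


def evaluate(
    samples: Dict[str, bytes],
    levels: Iterable[int] = (1, 3),
    compress: Callable[[bytes, int], bytes] = zlib.compress,
) -> List[Row]:
    """Compress every sample at every level and record ratio and time."""
    rows: List[Row] = []
    for name, data in samples.items():
        for level in levels:
            start = time.perf_counter()
            packed = compress(data, level)
            elapsed = time.perf_counter() - start
            rows.append((name, level, len(packed) / len(data), elapsed))
    return rows


if __name__ == "__main__":
    # Placeholder inputs only; a real run would load binaries,
    # documentation, enwik9 and Linux sources from disk.
    demo = {
        "text-like": b"The quick brown fox jumps over the lazy dog.\n" * 20000,
        "binary-like": bytes(range(256)) * 4000,
    }
    for name, level, ratio, seconds in evaluate(demo):
        print(f"{name:12s} level {level}: ratio {ratio:.3f}, {seconds * 1000:.1f} ms")
```

Reporting the ratio per data class and per level, next to the time
spent, makes it visible whether a lower level gives up meaningful space
savings for a given kind of data.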
The classes are not exact but should represent files that are common
and have a chance of being considered for compression. Already
compressed files or other hard to compress files like media are
detected and not considered by the heuristic.
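
As a rough illustration of how hard-to-compress data can be recognized
before spending CPU on it, the sketch below computes the Shannon entropy
of a byte buffer. It is not the kernel's heuristic, only one commonly
used signal; the function name byte_entropy is made up for the example.

```python
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    inputs = {
        "zeros": bytes(4096),
        "repeated text": b"btrfs compression heuristic " * 150,
        "random": os.urandom(4096),
    }
    for name, data in inputs.items():
        print(f"{name:14s}: {byte_entropy(data):.2f} bits/byte")
```

Values close to 8 bits per byte suggest the data is already compressed
or otherwise random-looking, while low values suggest compression is
likely to pay off.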
Changing defaults is possible, but it affects everybody and, from past
experience, breaks somebody's use case or negatively affects performance.
The evaluation from the gist could be enhanced with more input data
types and CPU strengths.