Message-ID: <CAPjX3FdaxfzULnRjN7TqyS9uK_ZJSk2PRzLgQCLVGBrR0yKLGw@mail.gmail.com>
Date: Tue, 28 Jan 2025 09:46:02 +0100
From: Daniel Vacek <neelx@...e.com>
To: dsterba@...e.cz
Cc: Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>, David Sterba <dsterba@...e.com>, 
	Nick Terrell <terrelln@...com>, linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] btrfs/zstd: enable negative compression levels mount option

On Mon, 27 Jan 2025 at 19:02, David Sterba <dsterba@...e.cz> wrote:
>
> On Fri, Jan 24, 2025 at 08:55:56AM +0100, Daniel Vacek wrote:
> > This patch allows using the fast modes (negative compression levels) of zstd.
> >
> > The performance benchmarks do not show any significant (positive or negative)
> > influence other than the lower compression ratio. The %system CPU usage
> > should also be lower, but that is not clearly visible in the results below
> > because with the fast modes the processing is IO-bound rather than CPU-bound.
> >
> > for level in {-15..-1} {1..15}; \
> > do printf "level %3d\n" $level; \
> >   mount -o compress=zstd:$level /dev/sdb /mnt/test/; \
> >   grep sdb /proc/mounts; \
> >   sync; time { time cp /dev/shm/linux-6.13.tar.xz /mnt/test/; sync; }; \
> >   compsize /mnt/test/linux-6.13.tar.xz; \
> >   sync; time { time cp /dev/shm/linux-6.13.tar /mnt/test/; sync; }; \
> >   compsize /mnt/test/linux-6.13.tar; \
> >   rm /mnt/test/linux-6.13.tar*; \
> >   umount /mnt/test/; \
> > done |& tee results | \
> > awk '/^level/{print}/^real/{print$2}/^TOTAL/{print$3"\t"$2"  |"}' | \
> > paste - - - - - - -
> >
> >                       linux-6.13.tar.xz       141M          |         linux-6.13.tar          1.4G
>
> It does not make much sense to compare it to a .xz type of compression,
> this will be detected by the heuristic as incompressible and skipped
> right away.

Yeah, the .xz results are mostly useless.

> The linux sources are highly compressible as it's a text-like source, so
> this is one category. It would be good to see benchmarks on file types
> commonly found on systems, with similar characteristics regarding
> compressibility.
>
> - document-like (structured binary), i.e. PDF and office-type documents
>
> - executable-like (/bin/*, libraries)
>
> - (maybe more)

I'll do some more tests with different data.

> Anything else can be considered incompressible: formats with internal
> compression, or very compact binary formats that are beyond the
> capabilities of the in-kernel implementation.
>
> >               copy wall time  sync wall time  usage   ratio | copy wall time  sync wall time  usage   ratio
> > ==============================================================+===============================================
> > level -15     0m0,261s        0m0,329s        141M    100%  | 0m2,511s        0m2,794s        598M    40%  |
> > level -14     0m0,145s        0m0,291s        141M    100%  | 0m1,829s        0m2,443s        581M    39%  |
> > level -13     0m0,141s        0m0,289s        141M    100%  | 0m1,832s        0m2,347s        566M    38%  |
> > level -12     0m0,140s        0m0,291s        141M    100%  | 0m1,879s        0m2,246s        548M    37%  |
> > level -11     0m0,133s        0m0,271s        141M    100%  | 0m1,789s        0m2,257s        530M    35%  |
>
> I found an old mail asking the ZSTD people which realtime levels are
> meaningful; -10 was mentioned as a good cut-off. The numbers above
> confirm that, although this is a small sample.

The limit is really arbitrary. We may as well not even set one and
leave it to the user. It's not like we allocate additional memory or
any other resources.
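
To illustrate how little depends on the exact cut-off, here is a minimal userspace sketch, not the btrfs option parser. The name `parse_level()` and its parameters are hypothetical. It extracts the level from a string such as "zstd:-5" and clamps it, so the lower bound is just one argument that could be -10, -15, or anything else.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not kernel code):
 * extract the level from an option string such as "zstd:-5" and clamp it
 * to [min_level, max_level]. Returns default_level when there is no
 * valid ":<number>" suffix.
 */
static int parse_level(const char *opt, int min_level, int max_level,
		       int default_level)
{
	const char *colon = strchr(opt, ':');
	char *end;
	long level;

	if (!colon)
		return default_level;

	level = strtol(colon + 1, &end, 10);
	if (end == colon + 1 || *end != '\0')
		return default_level;	/* no digits, or trailing garbage */

	/* The cut-off is a plain comparison, no extra resources involved. */
	if (level < min_level)
		return min_level;
	if (level > max_level)
		return max_level;
	return (int)level;
}
```

With bounds of -15 and 15 and a default of 3, "zstd:-5" yields -5 and "zstd:-100" is clamped to -15.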

> > level -10     0m0,146s        0m0,318s        141M    100%  | 0m1,769s        0m2,228s        512M    34%  |
> > level  -9     0m0,138s        0m0,288s        141M    100%  | 0m1,869s        0m2,304s        493M    33%  |
> > level  -8     0m0,146s        0m0,294s        141M    100%  | 0m1,846s        0m2,446s        475M    32%  |
> > level  -7     0m0,151s        0m0,298s        141M    100%  | 0m1,877s        0m2,319s        457M    30%  |
> > level  -6     0m0,134s        0m0,271s        141M    100%  | 0m1,918s        0m2,314s        437M    29%  |
> > level  -5     0m0,139s        0m0,307s        141M    100%  | 0m1,860s        0m2,254s        417M    28%  |
> > level  -4     0m0,153s        0m0,295s        141M    100%  | 0m1,916s        0m2,272s        391M    26%  |
> > level  -3     0m0,145s        0m0,308s        141M    100%  | 0m1,830s        0m2,369s        369M    24%  |
> > level  -2     0m0,150s        0m0,294s        141M    100%  | 0m1,841s        0m2,344s        349M    23%  |
> > level  -1     0m0,150s        0m0,312s        141M    100%  | 0m1,872s        0m2,487s        332M    22%  |
> > level   1     0m0,142s        0m0,310s        141M    100%  | 0m1,880s        0m2,331s        290M    19%  |
> > level   2     0m0,144s        0m0,286s        141M    100%  | 0m1,933s        0m2,266s        288M    19%  |
> > level   3     0m0,146s        0m0,304s        141M    100%  | 0m1,966s        0m2,300s        276M    18% *|
> > level   4     0m0,146s        0m0,287s        141M    100%  | 0m2,173s        0m2,496s        275M    18%  |
> > level   5     0m0,146s        0m0,304s        141M    100%  | 0m2,307s        0m2,728s        261M    17%  |
> > level   6     0m0,138s        0m0,267s        141M    100%  | 0m2,435s        0m3,151s        253M    17%  |
> > level   7     0m0,142s        0m0,301s        141M    100%  | 0m2,274s        0m3,617s        251M    16%  |
> > level   8     0m0,136s        0m0,291s        141M    100%  | 0m2,066s        0m3,913s        249M    16%  |
> > level   9     0m0,134s        0m0,283s        141M    100%  | 0m2,676s        0m4,496s        247M    16%  |
> > level  10     0m0,151s        0m0,297s        141M    100%  | 0m2,424s        0m5,102s        247M    16%  |
> > level  11     0m0,149s        0m0,296s        141M    100%  | 0m3,485s        0m7,803s        245M    16%  |
> > level  12     0m0,144s        0m0,304s        141M    100%  | 0m3,954s        0m9,067s        244M    16%  |
> > level  13     0m0,148s        0m0,319s        141M    100%  | 0m5,350s        0m13,307s       247M    16%  |
> > level  14     0m0,145s        0m0,296s        141M    100%  | 0m6,916s        0m18,218s       238M    16%  |
> > level  15     0m0,145s        0m0,293s        141M    100%  | 0m8,304s        0m24,675s       233M    15%  |
> >
> > Signed-off-by: Daniel Vacek <neelx@...e.com>
> > ---
> > Checking the ZSTD workspace memory sizes it looks like sharing
> > the level 1 workspace with all the fast modes should be safe.
> > From the debug printf output:
> >
> >                                  level_size  max_size
> > [   11.032659] btrfs zstd ws: -15  926969  926969
>
> Yeah, level 1 should have enough memory. I think there are some
> tricks inside ZSTD to reduce the requirements on the dictionary, so
> almost 1MiB is quite excessive (not only for the realtime levels), as we
> do only 128K at a time anyway.
>
> > [   11.032662] btrfs zstd ws: -14  926969  926969
> > [   11.032663] btrfs zstd ws: -13  926969  926969
> > [   11.032664] btrfs zstd ws: -12  926969  926969
> > [   11.032665] btrfs zstd ws: -11  926969  926969
> > [   11.032665] btrfs zstd ws: -10  926969  926969
> > [   11.032666] btrfs zstd ws:  -9  926969  926969
> > [   11.032666] btrfs zstd ws:  -8  926969  926969
> > [   11.032667] btrfs zstd ws:  -7  926969  926969
> > [   11.032668] btrfs zstd ws:  -6  926969  926969
> > [   11.032668] btrfs zstd ws:  -5  926969  926969
> > [   11.032669] btrfs zstd ws:  -4  926969  926969
> > [   11.032670] btrfs zstd ws:  -3  926969  926969
> > [   11.032670] btrfs zstd ws:  -2  926969  926969
> > [   11.032671] btrfs zstd ws:  -1  926969  926969
> > [   11.032672] btrfs zstd ws:   1  943353  943353
> > [   11.032673] btrfs zstd ws:   2 1041657 1041657
> > [   11.032674] btrfs zstd ws:   3 1303801 1303801
> > [   11.032674] btrfs zstd ws:   4 1959161 1959161
> > [   11.032675] btrfs zstd ws:   5 1697017 1959161
> > [   11.032676] btrfs zstd ws:   6 1697017 1959161
> > [   11.032676] btrfs zstd ws:   7 1697017 1959161
> > [   11.032677] btrfs zstd ws:   8 1697017 1959161
> > [   11.032678] btrfs zstd ws:   9 1697017 1959161
> > [   11.032679] btrfs zstd ws:  10 1697017 1959161
> > [   11.032679] btrfs zstd ws:  11 1959161 1959161
> > [   11.032680] btrfs zstd ws:  12 2483449 2483449
> > [   11.032681] btrfs zstd ws:  13 2632633 2632633
> > [   11.032681] btrfs zstd ws:  14 3277111 3277111
> > [   11.032682] btrfs zstd ws:  15 3277111 3277111
> >
> > Hence the implementation uses `zstd_ws_mem_sizes[0]` for all negative levels.
> >
> > I also plan to update the `btrfs fi defrag` interface to be able to use
> > these levels (or any levels at all).
> >
> > @@ -332,8 +335,9 @@ void zstd_put_workspace(struct list_head *ws)
> >               }
> >       }
> >
> > -     set_bit(workspace->level - 1, &wsm.active_map);
> > -     list_add(&workspace->list, &wsm.idle_ws[workspace->level - 1]);
> > +     level = max(0, workspace->level - 1);
>
> This seems to be quite a frequent pattern for adjusting the level;
> please create a helper for it so it's not a plain max() everywhere.

Ok.
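
A rough sketch of what such a helper could look like, written as standalone C rather than the actual patch. The names `level_to_ws_index()` and `ws_size_for_level()` are hypothetical, and the table is simply the max_size column for levels 1..15 copied from the debug output quoted above. It assumes the caller has already validated that the level does not exceed 15.

```c
#include <stddef.h>

#define SKETCH_MAX_LEVEL 15

/* max_size for levels 1..15, taken from the debug output quoted above. */
static const size_t ws_mem_sizes[SKETCH_MAX_LEVEL] = {
	 943353, 1041657, 1303801, 1959161, 1959161,
	1959161, 1959161, 1959161, 1959161, 1959161,
	1959161, 2483449, 2632633, 3277111, 3277111,
};

/*
 * Hypothetical helper: map a compression level to its workspace slot.
 * All negative (fast) levels share slot 0, the level 1 workspace.
 * Equivalent to max(0, level - 1).
 */
static inline unsigned int level_to_ws_index(int level)
{
	return level > 0 ? (unsigned int)(level - 1) : 0;
}

/* Workspace size used for a level (assumes level <= SKETCH_MAX_LEVEL). */
static inline size_t ws_size_for_level(int level)
{
	return ws_mem_sizes[level_to_ws_index(level)];
}
```

Since the fast modes reported needing 926969 bytes and slot 0 holds 943353, any negative level fits in the shared level 1 workspace.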

> > +     set_bit(level, &wsm.active_map);
> > +     list_add(&workspace->list, &wsm.idle_ws[level]);
> >       workspace->req_level = 0;
