linux-kernel - Re: [PATCH 00/11] btrfs: add zstd compression level support

Message-ID: <20190130174059.GA18660@dennisz-mbp>
Date:   Wed, 30 Jan 2019 12:40:59 -0500
From:   Dennis Zhou <dennis@...nel.org>
To:     dsterba@...e.cz, Dennis Zhou <dennis@...nel.org>,
        David Sterba <dsterba@...e.com>,
        Josef Bacik <josef@...icpanda.com>, Chris Mason <clm@...com>,
        Omar Sandoval <osandov@...ndov.com>,
        Nick Terrell <terrelln@...com>, kernel-team@...com,
        linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/11] btrfs: add zstd compression level support

Hi David,

On Tue, Jan 29, 2019 at 06:18:30PM +0100, David Sterba wrote:
> On Mon, Jan 28, 2019 at 04:24:26PM -0500, Dennis Zhou wrote:
> > As mentioned above, a requirement that differs zstd from zlib is that
> > higher levels of compression require more memory. To manage this, each
> > compression level has its own queue of workspaces. A global LRU is used
> > to help with reclaim. To guarantee forward progress, a max level
> > workspace is preallocated and hidden from the LRU.
> 
> Here I'd like to bring up what was mentioned in previous iteration, the
> workspace sizes.
> 
> Level   Compression Memory
> 1       0.8 MB
> 2       1.0 MB
> 3       1.3 MB
> 4       0.9 MB
> 5       1.4 MB
> 6       1.5 MB
> 7       1.4 MB
> 8       1.8 MB
> 9       1.8 MB
> 10      1.8 MB
> 11      1.8 MB
> 12      1.8 MB
> 13      2.3 MB
> 14      2.6 MB
> 15      2.6 MB
> 
> and decompression needs memory of level 1. The sizes can be grouped
> together to say 3 sizes, I'm not sure that we'd really need 15 distinct
> workspaces. The reclaim mechanism helps, but I'd rather keep a smaller
> number of workspaces that covers average use.
> 
> Default level is 3, that's 1.3 MiB, that also covers level 1, 2 and 4.
> For 5 to 12 it's 1.8 and the rest is 2.6 MiB.
> 

I realize the current implementation doesn't guarantee that memory
requirements increase monotonically with the level. I've added that
guarantee, and the updated memory requirements per level are below.
I've updated the commit to include this too.

Level     Memory (KB)
1            780
2           1004
3           1260
4           1260
5           1388
6           1516
7           1516
8           1772
9           1772
10          1772
11          1772
12          1772
13          2284
14          2547
15          2547
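
As a rough illustration of the monotonic guarantee (a sketch, not the
code from the patch): one way to get it is to take a running maximum
over the levels, so each level needs at least as much memory as every
lower level. The function name make_monotonic and the integer units are
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Make sizes[] non-decreasing in place: each entry becomes the maximum
 * of its own value and all entries before it. sizes[0] is level 1.
 * Hypothetical helper, not taken from the btrfs patches.
 */
void make_monotonic(unsigned int *sizes, size_t n)
{
	size_t i;

	assert(sizes != NULL);

	for (i = 1; i < n; i++) {
		/* A lower level must never need more memory than a higher one. */
		if (sizes[i] < sizes[i - 1])
			sizes[i] = sizes[i - 1];
	}
}
```

Applied to the rounded sizes from the quoted table, this raises level 4
to the level 3 size and level 7 to the level 6 size, which is consistent
with the updated table above.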

> > btrfs filesystem 10 times and then read back after dropping the caches.
> > The btrfs filesystem was on an SSD.
> > 
> > Level   Ratio   Compression (MB/s)  Decompression (MB/s)
> > 1       2.658        438.47                910.51
> > 2       2.744        364.86                886.55
> > 3       2.801        336.33                828.41
> > 4       2.858        286.71                886.55
> > 5       2.916        212.77                556.84
> > 6       2.363        119.82                990.85
> > 7       3.000        154.06                849.30
> > 8       3.011        159.54                875.03
> > 9       3.025        100.51                940.15
> > 10      3.033        118.97                616.26
> > 11      3.036         94.19                802.11
> > 12      3.037         73.45                931.49
> > 13      3.041         55.17                835.26
> > 14      3.087         44.70                716.78
> > 15      3.126         37.30                878.84
> 
> From my casual user's perspective, I'd use the level 1 for speed, 7 for
> better ratio and 15 for the best compression. Anything else does not
> look good, though the results would vary based on the data set. I
> assume that the silesia corpus serves as a good approximation of the
> worst case average.
> 
> The levels 7-14 strike particularly obvious pattern: same ratio but the
> speed gets worse with each level. Taking the default level into account,
> (my) recommended levels would be 1, 3, 7, 15.
> 

I do see why we'd want to limit the number of levels, since the memory
requirements do fall into buckets. But that means whenever zstd gets
updated, we'd have to reevaluate which compression levels btrfs
supports. I'm not sure it's a great idea to have that dependency.
I imagine we could offer some level of guidance, but it really would be
up to the user to figure out what works best for them.

The reclaim mechanism only keeps workspaces around if they are being
used by the appropriate level. So the memory overhead is memory that is
actively in use; a workspace that goes unused is reclaimed after at
most ~2 minutes. I also scan up through the higher levels before
allocating a new workspace, so that should help limit the number of
workspaces in circulation.
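
To make the scan-up and reclaim behavior concrete, here is a simplified
userspace sketch. It is not the patch code: the struct and function
names (ws_pool, ws_find, ws_put, ws_reclaim) are hypothetical, it uses a
per-level age check rather than the global LRU, it does no locking, and
it leaves out the preallocated max-level workspace. The 120 second
threshold stands in for the "~2 minutes" mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL        15
#define RECLAIM_AGE_SECS 120	/* assumption: stands in for "~2 minutes" */

struct workspace {
	int level;			/* level this workspace was sized for */
	long last_used;			/* time of last use, in seconds */
	struct workspace *next;		/* next idle workspace of this level */
};

struct ws_pool {
	struct workspace *idle[MAX_LEVEL + 1];	/* index 0 is unused */
};

/*
 * Scan up from the requested level and take the first idle workspace
 * found. A higher-level workspace is large enough for a lower level
 * because memory requirements are monotonic. Returns NULL if the
 * caller has to allocate a new one.
 */
struct workspace *ws_find(struct ws_pool *pool, int level)
{
	int l;

	assert(pool != NULL && level >= 1 && level <= MAX_LEVEL);

	for (l = level; l <= MAX_LEVEL; l++) {
		struct workspace *ws = pool->idle[l];

		if (ws) {
			pool->idle[l] = ws->next;
			ws->next = NULL;
			return ws;
		}
	}
	return NULL;
}

/* Return a workspace to the idle list of its own level. */
void ws_put(struct ws_pool *pool, struct workspace *ws, long now)
{
	ws->last_used = now;
	ws->next = pool->idle[ws->level];
	pool->idle[ws->level] = ws;
}

/*
 * Unlink every idle workspace not used for more than RECLAIM_AGE_SECS
 * and return how many were unlinked. Freeing is left to the caller.
 */
int ws_reclaim(struct ws_pool *pool, long now)
{
	int l, reclaimed = 0;

	for (l = 1; l <= MAX_LEVEL; l++) {
		struct workspace **pp = &pool->idle[l];

		while (*pp) {
			struct workspace *ws = *pp;

			if (now - ws->last_used > RECLAIM_AGE_SECS) {
				*pp = ws->next;
				ws->next = NULL;
				reclaimed++;
			} else {
				pp = &ws->next;
			}
		}
	}
	return reclaimed;
}
```

In this sketch a level 5 request is satisfied by an idle level 7
workspace if nothing is idle at levels 5 or 6, which is the effect the
scan-up is meant to have on the number of workspaces in circulation.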

> I went through the patches, looks mostly ok, I don't like the
> indirections but at the moment it's an implementation detail as I'd like
> to agree on the overall approach first.
> 
> We might need a few revisions or cleanup rounds to converge to an
> efficient solution, the advantage here is that it's all in-memory and
> without compatibility concerns once the level support for zstd is in and
> works.
> 
> For that reason, I'm not opposed to the current version of the patchset.
> Given the time in development schedule, it's really close to code
> freeze, but the functionality has a narrow scope so I'm tentatively
> counting with it for 5.1.

Thanks,
Dennis