linux-kernel - Re: [PATCH] zram: add zstd to the supported algorithms list

Message-ID: <315465C2-671F-4165-970E-B74ACFB9398D@fb.com>
Date:   Fri, 25 Aug 2017 19:31:14 +0000
From:   Nick Terrell <terrelln@...com>
To:     Minchan Kim <minchan@...nel.org>
CC:     Joonsoo Kim <iamjoonsoo.kim@....com>,
        "sergey.senozhatsky.work@...il.com" 
        <sergey.senozhatsky.work@...il.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Yann Collet <cyan@...com>
Subject: Re: [PATCH] zram: add zstd to the supported algorithms list

On 8/24/17, 10:19 PM, "Minchan Kim" <minchan@...nel.org> wrote:
> On Fri, Aug 25, 2017 at 01:35:35AM +0000, Nick Terrell wrote:
[..]
> > I think using dictionaries in zram could be very interesting. We could, for
> > example, take a random sample of the RAM and use that as the dictionary
> > for compression. E.g. take 32 512B samples from RAM and build a 16 KB
> > dictionary (sizes may vary).
> 
> For the static option, could we create the dictionary from data in zram
> and dump it into a file? Then rebuilding zram or the kernel would include
> the dictionary in the image.
> 
> For it, we would need some knob like
> 
>         cat /sys/block/zram/zstd_dict > dict.data
> 
>         CONFIG_ZSTD_DICT_DIR=
>         CONFIG_ZSTD_DICT_FILE= 

My guess is that a static dictionary won't cut it, since different
workloads will have drastically different RAM contents, so we won't be able
to construct a single dictionary that works for them all. I'd love to be
proven wrong though.
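
The sampling idea quoted above (32 samples of 512 B giving a 16 KB
dictionary) can be tried in userspace with a rough sketch like the one
below. It is only an illustration under stated assumptions: Python's zlib
preset-dictionary support (zdict) stands in for zstd dictionaries, the
"RAM" is a made-up synthetic workload, and the helper names
(build_sample_dictionary, compressed_size, synthetic_ram) are hypothetical.

```python
import random
import zlib

PAGE_SIZE = 4096
SAMPLE_SIZE = 512
NUM_SAMPLES = 32  # 32 samples * 512 B = 16 KB dictionary


def build_sample_dictionary(ram, rng):
    """Concatenate NUM_SAMPLES random SAMPLE_SIZE-byte slices of `ram`."""
    last_start = len(ram) - SAMPLE_SIZE
    starts = [rng.randrange(last_start + 1) for _ in range(NUM_SAMPLES)]
    return b"".join(ram[s:s + SAMPLE_SIZE] for s in starts)


def compressed_size(page, zdict=None):
    """Deflate one page, optionally primed with a preset dictionary."""
    # zlib is a stand-in here; zstd is not in the Python standard library.
    c = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return len(c.compress(page) + c.flush())


def synthetic_ram(rng, num_pages=64):
    """Made-up workload: pages assembled from a small pool of 64-byte records."""
    records = [bytes(rng.getrandbits(8) for _ in range(64)) for _ in range(64)]
    per_page = PAGE_SIZE // 64
    return b"".join(rng.choice(records) for _ in range(num_pages * per_page))


rng = random.Random(0)
ram = synthetic_ram(rng)
zdict = build_sample_dictionary(ram, rng)
page = ram[:PAGE_SIZE]
print("dictionary bytes:", len(zdict))
print("page without dictionary:", compressed_size(page))
print("page with dictionary:   ", compressed_size(page, zdict))
```

On this synthetic data the dictionary helps because every page reuses the
same record pool. Repeating the run with a dictionary built from one
workload and pages taken from another is one way to check whether a single
static dictionary holds up across workloads.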

> For the dynamic option, could we build the dictionary from data
> in zram dynamically? Upcoming pages would use the newly created
> dictionary, but old compressed pages would keep using their own dictionary.

Yeah, that's totally possible on the compression side; we would just need to
save which pages were compressed with which dictionary somewhere.
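
A minimal sketch of that bookkeeping, assuming a toy userspace page store
rather than anything resembling the real zram code: each compressed page is
stored together with the ID of the dictionary that was current when it was
written. The class and method names are hypothetical, and zlib's zdict
again stands in for zstd.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class DictTrackingStore:
    """Toy page store that remembers which dictionary compressed each page."""
    dictionaries: list = field(default_factory=list)  # dict_id -> dictionary bytes
    pages: dict = field(default_factory=dict)         # page index -> (dict_id, blob)

    def install_dictionary(self, zdict):
        """Register a new dictionary; later writes use it, old pages keep theirs."""
        self.dictionaries.append(zdict)
        return len(self.dictionaries) - 1

    def write_page(self, index, data):
        if not self.dictionaries:
            raise RuntimeError("no dictionary installed yet")
        dict_id = len(self.dictionaries) - 1  # always the newest dictionary
        c = zlib.compressobj(zdict=self.dictionaries[dict_id])
        self.pages[index] = (dict_id, c.compress(data) + c.flush())

    def read_page(self, index):
        dict_id, blob = self.pages[index]
        # Decompress with the dictionary recorded at write time.
        d = zlib.decompressobj(zdict=self.dictionaries[dict_id])
        return d.decompress(blob) + d.flush()
```

One consequence of this design is that an old dictionary cannot be freed
until no stored page references its ID any more, so a real implementation
would likely need reference counting or recompression of old pages.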

> I'm not sure it's possible. Anyway, if a predefined dict can help the
> comp ratio a lot on 4K data, I'd really love the feature and will support
> having it. ;)
> 
> > 
> > I'm not sure how you would pass a dictionary into the crypto compression
> > API, but I'm sure we can make something work if dictionary compression
> > proves to be beneficial enough.
> 
> Yes, it would be better to integrate the feature into crypto, but please don't
> tie it to the crypto API. If it's hard to support with the current crypto API
> in a short time, I really want to support it with zcomp_zstd.c.
> 
> Please look at old zcomp model.
> http://elixir.free-electrons.com/linux/v4.7/source/drivers/block/zram/zcomp_lz4.c

Thanks for the link, we could definitely make zcomp work with dictionaries.

> > What data have you, or anyone, used for benchmarking compression ratio and 
> > speed for RAM? Since it is such a specialized application, the standard
> > compression benchmarks aren't very applicable.
> 
> I have used an image dumped from my desktop swap device.
> Of course, it doesn't cover all the cases in the world, but it would be better
> to use IO benchmark buffer, IMHO. :)

Since adding dictionary support won't be quite as easy as adding zstd
support, I think the first step is building a set of benchmarks that
represent some common real world scenarios. We can easily test different
dictionary construction algorithms in userspace, and determine if the work
will pay off for some workloads. I'll collect some RAM samples from my
device and run some preliminary tests.
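
A userspace benchmark of this kind could start from a small harness like
the sketch below. It assumes a RAM or swap image is available as a byte
string, splits it into 4 KB pages the way zram would see them, and reports
compression ratio and throughput for any compress callable. The function
names are hypothetical, and zlib.compress is only a placeholder for the
compressor under test.

```python
import time
import zlib

PAGE_SIZE = 4096


def split_pages(image):
    """Split a memory/swap image into whole 4 KB pages, dropping any tail."""
    usable = len(image) - len(image) % PAGE_SIZE
    return [image[i:i + PAGE_SIZE] for i in range(0, usable, PAGE_SIZE)]


def benchmark(image, compress):
    """Compress every page independently and report ratio and speed."""
    pages = split_pages(image)
    start = time.perf_counter()
    compressed = sum(len(compress(p)) for p in pages)
    elapsed = time.perf_counter() - start
    original = len(pages) * PAGE_SIZE
    return {
        "pages": len(pages),
        "ratio": original / compressed if compressed else 0.0,
        "mb_per_s": original / 1e6 / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Placeholder input; a real run would read a dumped image file instead.
    image = bytes(PAGE_SIZE * 16)
    print(benchmark(image, zlib.compress))
```

Pages are compressed one at a time on purpose, since per-page compression
of small inputs is exactly the case where a dictionary is expected to
matter.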