Message-ID: <49ca7ddc-4ea7-4081-84ee-609a23b815e4@linux.alibaba.com>
Date: Thu, 22 May 2025 15:54:14 +0800
From: Gao Xiang <hsiangkao@...ux.alibaba.com>
To: Bo Liu <liubo03@...pur.com>, xiang@...nel.org, chao@...nel.org
Cc: linux-erofs@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5] erofs: support deflate decompress by using Intel QAT
Hi Bo,
On 2025/5/22 14:16, Bo Liu wrote:
> This patch introduces the use of Intel QAT to decompress compressed
> data in the EROFS filesystem, aiming to improve the decompression speed
> of compressed data.
>
> We created a 285MiB compressed file and then used the following command to
> create EROFS images with different cluster sizes.
> # mkfs.erofs -zdeflate,level=9 -C16384
>
> The fio command was used to test sequential read, random read and small
> random read (~5%) performance.
> # fio -filename=testfile -bs=4k -rw=read -name=job1
> # fio -filename=testfile -bs=4k -rw=randread -name=job1
> # fio -filename=testfile -bs=4k -rw=randread --io_size=14m -name=job1
>
> Here are some performance numbers for reference:
>
> Processors: Intel(R) Xeon(R) 6766E(144 core)
> Memory: 521 GiB
>
> |-----------------------------------------------------------------------------|
> | | Cluster size | sequential read | randread | small randread(5%) |
> |-----------|--------------|-----------------|-----------|--------------------|
> | Intel QAT | 4096 | 538 MiB/s | 112 MiB/s | 20.76 MiB/s |
> | Intel QAT | 16384 | 699 MiB/s | 158 MiB/s | 21.02 MiB/s |
> | Intel QAT | 65536 | 917 MiB/s | 278 MiB/s | 20.90 MiB/s |
> | Intel QAT | 131072 | 1056 MiB/s | 351 MiB/s | 23.36 MiB/s |
> | Intel QAT | 262144 | 1145 MiB/s | 431 MiB/s | 26.66 MiB/s |
> | deflate | 4096 | 499 MiB/s | 108 MiB/s | 21.50 MiB/s |
> | deflate | 16384 | 422 MiB/s | 125 MiB/s | 18.94 MiB/s |
> | deflate | 65536 | 452 MiB/s | 159 MiB/s | 13.02 MiB/s |
> | deflate | 131072 | 452 MiB/s | 177 MiB/s | 11.44 MiB/s |
> | deflate | 262144 | 466 MiB/s | 194 MiB/s | 10.60 MiB/s |
>
> Signed-off-by: Bo Liu <liubo03@...pur.com>
> ---
> v1: https://lore.kernel.org/linux-erofs/20250410042048.3044-1-liubo03@inspur.com/
> v2: https://lore.kernel.org/linux-erofs/20250410042048.3044-1-liubo03@inspur.com/T/#t
> v3: https://lore.kernel.org/linux-erofs/20250516082634.3801-1-liubo03@inspur.com/
> v4: https://lore.kernel.org/linux-erofs/20250521100326.2867828-1-hsiangkao@linux.alibaba.com/
> change since v4:
> - add sysfs documentation.
>
> Documentation/ABI/testing/sysfs-fs-erofs | 12 ++
> fs/erofs/Kconfig | 14 ++
> fs/erofs/Makefile | 1 +
> fs/erofs/compress.h | 10 ++
> fs/erofs/decompressor_crypto.c | 186 +++++++++++++++++++++++
> fs/erofs/decompressor_deflate.c | 17 ++-
> fs/erofs/sysfs.c | 34 ++++-
> fs/erofs/zdata.c | 1 +
> 8 files changed, 272 insertions(+), 3 deletions(-)
> create mode 100644 fs/erofs/decompressor_crypto.c
>
> diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/ABI/testing/sysfs-fs-erofs
> index b134146d735b..95201a62f704 100644
> --- a/Documentation/ABI/testing/sysfs-fs-erofs
> +++ b/Documentation/ABI/testing/sysfs-fs-erofs
> @@ -27,3 +27,15 @@ Description: Writing to this will drop compression-related caches,
> - 1 : invalidate cached compressed folios
> - 2 : drop in-memory pclusters
> - 3 : drop in-memory pclusters and cached compressed folios
> +
> +What: /sys/fs/erofs/accel
> +Date: May 2025
> +Contact: "Bo Liu" <liubo03@...pur.com>
> +Description: The accel file is read-write and allows to set or show
> + hardware decompression accelerators, and it supports writing
> + multiple accelerators separated by '\n'.
Used to set or show hardware accelerators in effect and multiple
accelerators are separated by '\n'.
Supported accelerator(s): qat_deflate
Disable all accelerators with an empty string (echo > accel).
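
To make the suggested semantics concrete, here is a minimal user-space sketch of how a '\n'-separated list could be turned into a set of enabled accelerators. It is not the code from the patch: the helper name parse_accel_list(), the known_accels[] table and the bitmask representation are all assumptions made for illustration. Only the accelerator name qat_deflate and the "empty string disables everything" rule come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table; qat_deflate is the only name mentioned in this thread. */
static const char *const known_accels[] = { "qat_deflate" };
#define N_ACCELS (sizeof(known_accels) / sizeof(known_accels[0]))

/*
 * Parse a '\n'-separated accelerator list into a bitmask
 * (bit i set means known_accels[i] is enabled).
 * Returns 0 on success and -1 on an unknown name, in which case *mask
 * is left untouched. Empty input or blank lines enable nothing, so
 * writing an empty string yields mask 0 (all accelerators disabled).
 */
static int parse_accel_list(const char *buf, size_t len, unsigned int *mask)
{
	unsigned int m = 0;
	size_t pos = 0;

	assert(buf != NULL && mask != NULL);
	while (pos < len) {
		const char *tok = buf + pos;
		const char *nl = memchr(tok, '\n', len - pos);
		size_t toklen = nl ? (size_t)(nl - tok) : len - pos;
		size_t i;

		pos += toklen + 1;	/* skip the token and its separator */
		if (toklen == 0)
			continue;	/* blank line, e.g. from "echo > accel" */
		for (i = 0; i < N_ACCELS; i++) {
			if (strlen(known_accels[i]) == toklen &&
			    memcmp(known_accels[i], tok, toklen) == 0) {
				m |= 1u << i;
				break;
			}
		}
		if (i == N_ACCELS)
			return -1;	/* unknown accelerator name */
	}
	*mask = m;
	return 0;
}
```

With this sketch, the input "qat_deflate\n" sets bit 0, while the single "\n" produced by "echo > accel" leaves the mask at 0. Rejecting unknown names outright, rather than silently ignoring them, is one possible design choice here and not something the thread prescribes.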
> + Currently supported accelerators:
...
> +
> +static int __z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
> + struct crypto_acomp *tfm)
> +{
> + struct sg_table st_src, st_dst;
> + struct acomp_req *req;
> + struct crypto_wait wait;
> + u8 *headpage;
> + int ret;
> +
> + headpage = kmap_local_page(*rq->in);
> + ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
> + min_t(unsigned int, rq->inputsize,
> + rq->sb->s_blocksize - rq->pageofs_in));
ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
min_t(unsigned int, rq->inputsize,
rq->sb->s_blocksize - rq->pageofs_in));
Otherwise it looks good to me.
Thanks,
Gao Xiang