linux-kernel - Re: [syzbot] [crypto?] general protection fault in scatterwalk

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4xknGt5Q6vTKHdnqvXi7fKMORwXywurdN09G2uOUJzCSg@mail.gmail.com>
Date: Wed, 27 Dec 2023 19:10:38 +0800
From: Barry Song <21cnbao@...il.com>
To: Chengming Zhou <zhouchengming@...edance.com>
Cc: Nhat Pham <nphamcs@...il.com>, Chris Li <chrisl@...nel.org>, 
	syzbot <syzbot+3eff5e51bf1db122a16e@...kaller.appspotmail.com>, 
	akpm@...ux-foundation.org, davem@...emloft.net, herbert@...dor.apana.org.au, 
	linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org, 
	syzkaller-bugs@...glegroups.com, yosryahmed@...gle.com
Subject: Re: [syzbot] [crypto?] general protection fault in
 scatterwalk_copychunks (5)

On Wed, Dec 27, 2023 at 5:16 PM Chengming Zhou
<zhouchengming@...edance.com> wrote:
>
> On 2023/12/27 15:01, Barry Song wrote:
> > On Wed, Dec 27, 2023 at 7:38 PM Chengming Zhou
> > <zhouchengming@...edance.com> wrote:
> >>
> >> On 2023/12/27 14:25, Barry Song wrote:
> >>> On Wed, Dec 27, 2023 at 4:51 PM Chengming Zhou
> >>> <zhouchengming@...edance.com> wrote:
> >>>>
> >>>> On 2023/12/27 08:23, Nhat Pham wrote:
> >>>>> On Tue, Dec 26, 2023 at 3:30 PM Chris Li <chrisl@...nel.org> wrote:
> >>>>>>
> >>>>>> Again, sorry I was looking at the decompression side rather than the
> >>>>>> compression side. The compression side does not even offer a safe
> >>>>>> version of the compression function.
> >>>>>> That seems to be dangerous. It seems for now we should make the zswap
> >>>>>> roll back to 2 page buffer until we have a safe way to do compression
> >>>>>> without overwriting the output buffers.
> >>>>>
> >>>>> Unfortunately, I think this is the way - at least until we rework the
> >>>>> crypto/compression API (if that's even possible?).
> >>>>> I still think the 2 page buffer is dumb, but it is what it is :(
> >>>>
> >>>> Hi,
> >>>>
> >>>> I think it's a bug in `scomp_acomp_comp_decomp()`, which doesn't use
> >>>> the caller passed "src" and "dst" scatterlist. Instead, it uses its own
> >>>> per-cpu "scomp_scratch", which have 128KB src and dst.
> >>>>
> >>>> When compression done, it uses the output req->dlen to copy scomp_scratch->dst
> >>>> to our dstmem, which has only one page now, so this problem happened.
> >>>>
> >>>> I still don't know why the alg->compress(src, slen, dst, &dlen) doesn't
> >>>> check the dlen? It seems an obvious bug, right?
> >>>>
> >>>> As for this problem in `scomp_acomp_comp_decomp()`, this patch below
> >>>> should fix it. I will set up a few tests to check later.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> diff --git a/crypto/scompress.c b/crypto/scompress.c
> >>>> index 442a82c9de7d..e654a120ae5a 100644
> >>>> --- a/crypto/scompress.c
> >>>> +++ b/crypto/scompress.c
> >>>> @@ -117,6 +117,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >>>>         struct crypto_scomp *scomp = *tfm_ctx;
> >>>>         void **ctx = acomp_request_ctx(req);
> >>>>         struct scomp_scratch *scratch;
> >>>> +       unsigned int dlen;
> >>>>         int ret;
> >>>>
> >>>>         if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE)
> >>>> @@ -128,6 +129,8 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >>>>         if (!req->dlen || req->dlen > SCOMP_SCRATCH_SIZE)
> >>>>                 req->dlen = SCOMP_SCRATCH_SIZE;
> >>>>
> >>>> +       dlen = req->dlen;
> >>>> +
> >>>>         scratch = raw_cpu_ptr(&scomp_scratch);
> >>>>         spin_lock(&scratch->lock);
> >>>>
> >>>> @@ -145,6 +148,9 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >>>>                                 ret = -ENOMEM;
> >>>>                                 goto out;
> >>>>                         }
> >>>> +               } else if (req->dlen > dlen) {
> >>>> +                       ret = -ENOMEM;
> >>>> +                       goto out;
> >>>>                 }
> >>>
> >>> This can't fix the problem as crypto_scomp_compress() has written overflow data.
> >>
> >> No, crypto_scomp_compress() writes to its own scomp_scratch->dst memory, then copy
> >> to our dstmem.
> >>
> >>>
> >>> BTW, in many cases, hardware-accelerators drivers/crypto can do compression and
> >>> decompression by off-loading CPU;
> >>> we won't have a chance to let hardware check the dst buffer size. so
> >>> giving the dst buffer
> >>> with enough length to the hardware's dma engine is the right way. I
> >>> mean, we shouldn't
> >>> change dst from 2pages to 1page.
> >>
> >> But how do we know 2 pages is enough for any compression algorithm?
> >>
> >
> > There is no guarrette 2 pages is enough.
> >
> > We can invent our own compression algorithm, in our algorithm, we can
> > add a lot of metadata, for example, longer than 4KB when the source data
> > is uncompress-able. then dst can be larger than 2 * PAGE_SIZE.  but this
> > is not the case :-) This kind of algorithm may never succeed.
> >
> > For those in-tree algorithms, we have a WORST_COMPR_FACTOR. in
> > ubifs, zram and zswap, we all use "2" as the worst comp factor.
>
> Thanks for your explanation! Maybe it's best for us to return to 2 pages
> if no other people's comments. And this really need more documentation :-)

I agree. we need some doc.

besides, i actually think we can skip zswap frontend if
over-compression is really
happening.

--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1318,7 +1318,7 @@ bool zswap_store(struct folio *folio)
        if (zpool_malloc_support_movable(zpool))
                gfp |= __GFP_HIGHMEM | __GFP_MOVABLE;
        ret = zpool_malloc(zpool, dlen, gfp, &handle);
-       if (ret == -ENOSPC) {
+       if (ret == -ENOSPC || dlen > PAGE_SIZE) {
                zswap_reject_compress_poor++;
                goto put_dstmem;
        }

since we are not saving but wasting space in this corner case?

> since there is no any comment or check in the acomp compress interface.
>
> /*
>  * @src:        Source Data
>  * @dst:        Destination data
>  * @slen:       Size of the input buffer
>  * @dlen:       Size of the output buffer and number of bytes produced
>  * @flags:      Internal flags
>  * @__ctx:      Start of private context data
>  */
> struct acomp_req {
>         struct crypto_async_request base;
>         struct scatterlist *src;
>         struct scatterlist *dst;
>         unsigned int slen;
>         unsigned int dlen;
>         u32 flags;
>         void *__ctx[] CRYPTO_MINALIGN_ATTR;
> };
>
> >
> > /*
> >  * Some compressors, like LZO, may end up with more data then the input buffer.
> >  * So UBIFS always allocates larger output buffer, to be sure the compressor
> >  * will not corrupt memory in case of worst case compression.
> >  */
> > #define WORST_COMPR_FACTOR 2
> >
> > #ifdef CONFIG_FS_ENCRYPTION
> > #define UBIFS_CIPHER_BLOCK_SIZE FSCRYPT_CONTENTS_ALIGNMENT
> > #else
> > #define UBIFS_CIPHER_BLOCK_SIZE 0
> > #endif
> >
> > /*
> >  * How much memory is needed for a buffer where we compress a data node.
> >  */
> > #define COMPRESSED_DATA_NODE_BUF_SZ \
> >         (UBIFS_DATA_NODE_SZ + UBIFS_BLOCK_SIZE * WORST_COMPR_FACTOR)
> >
> >
> > For years, we have never seen 2 pages that can be a problem. but 1
> > page is definitely
> > not enough, I remember I once saw many cases where accelerators' dmaengine
> > can write more than 1 page.
> >
> >> Thanks.
> >>
> >>>
> >>>>                 scatterwalk_map_and_copy(scratch->dst, req->dst, 0, req->dlen,
> >>>>                                          1);
> >>>
> >>>
>

Thanks
Barry