linux-kernel - Re: [syzbot] BUG: sleeping function called from invalid context in __fdget

Message-ID: <CAMj1kXFyJ9C4aqq9+DxOMCPOxQUApQK+Oa3V8F0H39wwoK9wxA@mail.gmail.com>
Date:   Wed, 30 Jun 2021 11:13:00 +0200
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Herbert Xu <herbert@...dor.apana.org.au>
Cc:     Dave Hansen <dave.hansen@...el.com>,
        syzbot <syzbot+5d1bad8042a8f0e8117a@...kaller.appspotmail.com>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, jpa@....mail.kapsi.fi,
        kan.liang@...ux.intel.com,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Thomas Gleixner <tglx@...utronix.de>, X86 ML <x86@...nel.org>
Subject: Re: [syzbot] BUG: sleeping function called from invalid context in __fdget_pos

On Wed, 30 Jun 2021 at 10:10, Herbert Xu <herbert@...dor.apana.org.au> wrote:
>
> Hi Ard:
>
> On Wed, Jun 30, 2021 at 09:42:14AM +0200, Ard Biesheuvel wrote:
> >
> > > There's one suspect-looking site in xts_crypt():
> > >
> > > >       kernel_fpu_begin();
> > > >
> > > >       /* calculate first value of T */
> > > >       aesni_enc(aes_ctx(ctx->raw_tweak_ctx), walk.iv, walk.iv);
> > > >
> > > >       while (walk.nbytes > 0) {
> > > >               int nbytes = walk.nbytes;
> > > >
> > > >               ...
> > > >
> > > >               err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
> > > >
> > > >               kernel_fpu_end();
> > > >
> > > >               if (walk.nbytes > 0)
> > > >                       kernel_fpu_begin();
> > > >       }
> > >
> > > I wonder if a slab allocation failure could leave us with walk.nbytes==0.
> >
> > The code is actually the other way around: kernel_fpu_end() comes
> > before the call to skcipher_walk_done().
> >
> > So IIUC, this code forces an allocation failure, and checks whether
> > the code deals with this gracefully, right?
> >
> > The skcipher walk API guarantees that walk.nbytes == 0 if an error is
> > returned, so the pairing of FPU begin/end looks correct to me. And
> > skcipher_walk_next() should not invoke anything that might sleep from
> > this particular context.
> >
> > Herbert, any ideas?
>
> xts_crypt looks buggy to me.  In particular, if the second
> skcipher_walk_virt call (the one in the if clause) fails, then
> we will return without calling kernel_fpu_end.
>
> Another issue, we are not checking for errors on the first
> skcipher_walk_virt call, this may cause a double-free with
> the subsequent skcipher_walk_abort inside the if clause.
>
> With skcipher_walk_virt, you must check for errors explicitly
> *unless* you use it in a loop construct which exits on !walk->nbytes.
>

So something like this, I suppose?

--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -849,6 +849,8 @@
                return -EINVAL;

        err = skcipher_walk_virt(&walk, req, false);
+       if (err)
+               return err;

        if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
                int blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
@@ -862,7 +864,10 @@
                skcipher_request_set_crypt(&subreq, req->src, req->dst,
                                           blocks * AES_BLOCK_SIZE, req->iv);
                req = &subreq;
+
                err = skcipher_walk_virt(&walk, req, false);
+               if (err)
+                       return err;
        } else {
                tail = 0;
        }
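
To make the rule above concrete, here is a rough user-space sketch of the
two error-handling patterns discussed in this thread: an explicit check
after a walk setup call that is not inside a loop, and a loop that exits
on walk.nbytes == 0 so that begin/end stay paired when a step fails.
Everything named mock_* is a hypothetical stand-in invented for this
illustration; it is not the kernel's skcipher walk API and only imitates
the "nbytes is 0 when an error is returned" contract described earlier.

```c
#include <assert.h>
#include <errno.h>

#define MOCK_BLOCK 16u

/* Hypothetical stand-in for the walk state; not the kernel struct. */
struct mock_walk {
	unsigned int nbytes;	/* bytes available in the current step */
	unsigned int total;	/* bytes still to be processed overall */
};

static int fpu_depth;		/* must be back at 0 whenever we return */

static void mock_fpu_begin(void) { fpu_depth++; }

static void mock_fpu_end(void)
{
	assert(fpu_depth > 0);	/* catches an unpaired end */
	fpu_depth--;
}

/* Assumed contract: on failure, nbytes is left at 0. */
static int mock_walk_virt(struct mock_walk *w, unsigned int len, int fail)
{
	if (fail) {
		w->nbytes = 0;
		w->total = 0;
		return -ENOMEM;
	}
	w->total = len;
	w->nbytes = len < MOCK_BLOCK ? len : MOCK_BLOCK;
	return 0;
}

/* Finishes the current step; same contract as above on failure. */
static int mock_walk_done(struct mock_walk *w, int fail)
{
	if (fail) {
		w->nbytes = 0;
		w->total = 0;
		return -ENOMEM;
	}
	w->total -= w->nbytes;
	w->nbytes = w->total < MOCK_BLOCK ? w->total : MOCK_BLOCK;
	return 0;
}

/*
 * fail_first: make the setup call fail.
 * fail_step:  make the Nth mock_walk_done() call fail (0 = never).
 */
static int mock_crypt(unsigned int len, int fail_first, int fail_step)
{
	struct mock_walk walk;
	int err, step = 0;

	if (len < MOCK_BLOCK)
		return -EINVAL;

	err = mock_walk_virt(&walk, len, fail_first);
	if (err)
		return err;	/* not in a loop yet, so check explicitly */

	mock_fpu_begin();
	while (walk.nbytes > 0) {
		/* ... process walk.nbytes bytes here ... */

		mock_fpu_end();	/* end comes before the done call */
		err = mock_walk_done(&walk, ++step == fail_step);

		if (walk.nbytes > 0)
			mock_fpu_begin();
	}
	return err;		/* loop exits on nbytes == 0, error or not */
}
```

Calling mock_crypt(48, 0, 2) makes the second step fail: the loop sees
nbytes == 0, skips the begin, and returns -ENOMEM with fpu_depth back
at 0. The setup call needs its own check because nothing before the
mock_fpu_begin() would otherwise notice the failure.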