Message-ID: <b7f0725f-2731-24af-f15d-1054d6398749@intel.com>
Date:   Tue, 29 Jun 2021 07:46:25 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     syzbot <syzbot+5d1bad8042a8f0e8117a@...kaller.appspotmail.com>,
        bp@...en8.de, hpa@...or.com, jpa@....mail.kapsi.fi,
        kan.liang@...ux.intel.com, linux-kernel@...r.kernel.org,
        luto@...nel.org, mingo@...hat.com, syzkaller-bugs@...glegroups.com,
        tglx@...utronix.de, x86@...nel.org,
        Ard Biesheuvel <ardb@...nel.org>,
        Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [syzbot] BUG: sleeping function called from invalid context in
 __fdget_pos

... adding Ard who was recently modifying some of the
kernel_fpu_begin/end() sites in the AESNI crypto code.

On 6/28/21 12:22 PM, syzbot wrote:
> console output: https://syzkaller.appspot.com/x/log.txt?x=170e6c94300000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=42ecca11b759d96c
> dashboard link: https://syzkaller.appspot.com/bug?extid=5d1bad8042a8f0e8117a
> 
> Unfortunately, I don't have any reproducer for this issue yet.
...
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 29652, name: syz-executor.0
> no locks held by syz-executor.0/29652.
> Preemption disabled at:
> [<ffffffff812aa454>] kernel_fpu_begin_mask+0x64/0x260 arch/x86/kernel/fpu/core.c:126
> CPU: 0 PID: 29652 Comm: syz-executor.0 Not tainted 5.13.0-rc7-syzkaller #0

There's a better backtrace in the log before the rather useless
backtrace from lockdep:

> [ 1341.360547][T29635] FAULT_INJECTION: forcing a failure.
> [ 1341.360547][T29635] name failslab, interval 1, probability 0, space 0, times 0
> [ 1341.374439][T29635] CPU: 1 PID: 29635 Comm: syz-executor.0 Not tainted 5.13.0-rc7-syzkaller #0
> [ 1341.374712][T29630] FAT-fs (loop2): bogus number of reserved sectors
> [ 1341.383571][T29635] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [ 1341.383591][T29635] Call Trace:
> [ 1341.383603][T29635]  dump_stack+0x141/0x1d7
> [ 1341.383630][T29635]  should_fail.cold+0x5/0xa
> [ 1341.383651][T29635]  ? skcipher_walk_next+0x6e2/0x1680
> [ 1341.383673][T29635]  should_failslab+0x5/0x10
> [ 1341.383691][T29635]  __kmalloc+0x72/0x330
> [ 1341.383720][T29635]  skcipher_walk_next+0x6e2/0x1680
> [ 1341.383744][T29635]  ? kfree+0xe5/0x7f0
> [ 1341.383776][T29635]  skcipher_walk_first+0xf8/0x3c0
> [ 1341.383805][T29635]  skcipher_walk_virt+0x523/0x760
> [ 1341.445438][T29635]  xts_crypt+0x137/0x7f0
> [ 1341.449689][T29635]  ? aesni_encrypt+0x80/0x80

There's one suspect-looking site in xts_crypt():

>	kernel_fpu_begin();
>
>	/* calculate first value of T */
>	aesni_enc(aes_ctx(ctx->raw_tweak_ctx), walk.iv, walk.iv);
>
>	while (walk.nbytes > 0) {
>		int nbytes = walk.nbytes;
>
>		...
>
>		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
>
>		kernel_fpu_end();
>
>		if (walk.nbytes > 0)
>			kernel_fpu_begin();
>	}

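I wonder if a slab allocation failure could leave us with walk.nbytes==0
before the loop.  In that case the loop body never runs, so the initial
kernel_fpu_begin() is never paired with a kernel_fpu_end() and we return
with preemption still disabled.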
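
To make that concrete, here is a small user-space model of the begin/end
pairing in the quoted loop.  It is only a sketch: fake_fpu_begin(),
fake_fpu_end(), fpu_depth and run_walk() are hypothetical stand-ins, not
kernel APIs, and the nbytes[] array merely plays the role of the
walk.nbytes values seen at entry and after each skcipher_walk_done().
It models the counting only; it says nothing about whether a failed
allocation really does leave walk.nbytes at zero.

```c
/* Hypothetical stand-in for the preempt-disable depth that
 * kernel_fpu_begin()/kernel_fpu_end() would adjust. */
static int fpu_depth;

static void fake_fpu_begin(void) { fpu_depth++; }
static void fake_fpu_end(void)   { fpu_depth--; }

/*
 * Mirrors the control flow of the quoted xts_crypt() fragment.
 * nbytes[0] is the value of walk.nbytes on entry; nbytes[i] is the
 * value after the i-th simulated skcipher_walk_done().  Values past
 * the end of the array are treated as 0.
 *
 * Returns the leftover depth: 0 means begin/end were balanced.
 */
static int run_walk(const unsigned int *nbytes, int n)
{
	unsigned int walk_nbytes = (n > 0) ? nbytes[0] : 0;
	int i = 0;

	fpu_depth = 0;

	fake_fpu_begin();	/* unconditional, as in the quoted code */

	while (walk_nbytes > 0) {
		/* ... a chunk would be processed here ... */

		/* simulated skcipher_walk_done(): advance the walk */
		i++;
		walk_nbytes = (i < n) ? nbytes[i] : 0;

		fake_fpu_end();

		if (walk_nbytes > 0)
			fake_fpu_begin();
	}

	return fpu_depth;
}
```

With the sequence { 64, 32, 0 } run_walk() returns 0, since every begin
is matched inside the loop.  With { 0 } the loop is skipped and it
returns 1, which is the imbalance that would make a later mutex_lock()
trip the "sleeping function called from invalid context" check.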