Message-ID: <b7f0725f-2731-24af-f15d-1054d6398749@intel.com>
Date: Tue, 29 Jun 2021 07:46:25 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: syzbot <syzbot+5d1bad8042a8f0e8117a@...kaller.appspotmail.com>,
bp@...en8.de, hpa@...or.com, jpa@....mail.kapsi.fi,
kan.liang@...ux.intel.com, linux-kernel@...r.kernel.org,
luto@...nel.org, mingo@...hat.com, syzkaller-bugs@...glegroups.com,
tglx@...utronix.de, x86@...nel.org,
Ard Biesheuvel <ardb@...nel.org>,
Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [syzbot] BUG: sleeping function called from invalid context in
__fdget_pos
... adding Ard who was recently modifying some of the
kernel_fpu_begin/end() sites in the AESNI crypto code.
On 6/28/21 12:22 PM, syzbot wrote:
> console output: https://syzkaller.appspot.com/x/log.txt?x=170e6c94300000
> kernel config: https://syzkaller.appspot.com/x/.config?x=42ecca11b759d96c
> dashboard link: https://syzkaller.appspot.com/bug?extid=5d1bad8042a8f0e8117a
>
> Unfortunately, I don't have any reproducer for this issue yet.
...
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:938
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 29652, name: syz-executor.0
> no locks held by syz-executor.0/29652.
> Preemption disabled at:
> [<ffffffff812aa454>] kernel_fpu_begin_mask+0x64/0x260 arch/x86/kernel/fpu/core.c:126
> CPU: 0 PID: 29652 Comm: syz-executor.0 Not tainted 5.13.0-rc7-syzkaller #0
There's a better backtrace in the log before the rather useless
backtrace from lockdep:
> [ 1341.360547][T29635] FAULT_INJECTION: forcing a failure.
> [ 1341.360547][T29635] name failslab, interval 1, probability 0, space 0, times 0
> [ 1341.374439][T29635] CPU: 1 PID: 29635 Comm: syz-executor.0 Not tainted 5.13.0-rc7-syzkaller #0
> [ 1341.374712][T29630] FAT-fs (loop2): bogus number of reserved sectors
> [ 1341.383571][T29635] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [ 1341.383591][T29635] Call Trace:
> [ 1341.383603][T29635] dump_stack+0x141/0x1d7
> [ 1341.383630][T29635] should_fail.cold+0x5/0xa
> [ 1341.383651][T29635] ? skcipher_walk_next+0x6e2/0x1680
> [ 1341.383673][T29635] should_failslab+0x5/0x10
> [ 1341.383691][T29635] __kmalloc+0x72/0x330
> [ 1341.383720][T29635] skcipher_walk_next+0x6e2/0x1680
> [ 1341.383744][T29635] ? kfree+0xe5/0x7f0
> [ 1341.383776][T29635] skcipher_walk_first+0xf8/0x3c0
> [ 1341.383805][T29635] skcipher_walk_virt+0x523/0x760
> [ 1341.445438][T29635] xts_crypt+0x137/0x7f0
> [ 1341.449689][T29635] ? aesni_encrypt+0x80/0x80
There's one suspect-looking site in xts_crypt():
> kernel_fpu_begin();
>
> /* calculate first value of T */
> aesni_enc(aes_ctx(ctx->raw_tweak_ctx), walk.iv, walk.iv);
>
> while (walk.nbytes > 0) {
> int nbytes = walk.nbytes;
>
> ...
>
> err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
>
> kernel_fpu_end();
>
> if (walk.nbytes > 0)
> kernel_fpu_begin();
> }
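I wonder if a slab allocation failure could leave us with walk.nbytes==0
on entry to the loop. If so, the loop body never runs, the initial
kernel_fpu_begin() is never paired with a kernel_fpu_end(), and we return
with preemption still disabled.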
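To make the suspected imbalance concrete, here is a minimal user-space sketch of the control flow quoted above. It is only a model: model_fpu_begin(), model_fpu_end(), struct model_walk and model_xts_loop() are hypothetical stand-ins, not kernel APIs, and it simply assumes that a failed walk setup shows up as nbytes == 0, which is exactly the open question here.

```c
#include <assert.h>

/*
 * User-space model only. The counter stands in for the preempt count that
 * kernel_fpu_begin()/kernel_fpu_end() raise and lower; none of these names
 * exist in the kernel.
 */
static int model_preempt_count;

static void model_fpu_begin(void)
{
	model_preempt_count++;
}

static void model_fpu_end(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Hypothetical stand-in for struct skcipher_walk: only nbytes is modeled. */
struct model_walk {
	unsigned int nbytes;
};

/*
 * Mirrors the begin/loop/end structure quoted from xts_crypt().
 * initial_nbytes == 0 models the ASSUMED outcome of a failed walk setup.
 * Returns the count left behind on exit; 0 means begin/end were balanced.
 */
static int model_xts_loop(unsigned int initial_nbytes, unsigned int chunk)
{
	struct model_walk walk = { .nbytes = initial_nbytes };

	if (chunk == 0)
		chunk = 1;	/* keep the model loop finite */

	model_preempt_count = 0;
	model_fpu_begin();

	while (walk.nbytes > 0) {
		unsigned int n = walk.nbytes < chunk ? walk.nbytes : chunk;

		walk.nbytes -= n;	/* stands in for skcipher_walk_done() */

		model_fpu_end();

		if (walk.nbytes > 0)
			model_fpu_begin();
	}

	return model_preempt_count;
}
```

In this model, model_xts_loop(64, 16) comes back balanced, while model_xts_loop(0, 16) returns with the count still raised, which would match the "Preemption disabled at: kernel_fpu_begin_mask" line in the report.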