Message-ID: <alpine.DEB.2.20.1712141614320.4998@nanos>
Date:   Thu, 14 Dec 2017 16:31:27 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     syzbot 
        <bot+4c9cbf73a47b663c6a6364f432b3e183b6896f25@...kaller.appspotmail.com>
cc:     Darren Hart <dvhart@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        syzkaller-bugs@...glegroups.com,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: BUG: unable to handle kernel paging request in do_futex

On Thu, 30 Nov 2017, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on 11fed7829beff10184503fd65e5919926464601a
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> Unfortunately, I don't have any reproducer for this bug yet.
> 
> 
> BUG: unable to handle kernel paging request at 00000000c314149f

That's a user space address which appears nowhere in the registers. Is this
kernel perhaps from before commit 328b4ed93b69a?

> IP: arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
> IP: futex_atomic_op_inuser kernel/futex.c:1588 [inline]
> IP: futex_wake_op kernel/futex.c:1637 [inline]
> IP: do_futex+0x14c8/0x2280 kernel/futex.c:3483
> PGD 5e28067 P4D 5e28067 PUD 5e2a067 PMD 0
> Oops: 0002 [#1] SMP KASAN

  	^^^^ X86_PF_WRITE
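
For reference, the error code printed after "Oops:" is the x86 page fault error code. The following is a minimal sketch of how its low bits can be decoded; the helper name pf_describe is hypothetical and only the five architectural low bits are handled.

```c
#include <stdio.h>
#include <stddef.h>

/* Low bits of the x86 page fault error code. */
enum {
    X86_PF_PROT  = 1 << 0, /* 0: page not present, 1: protection violation */
    X86_PF_WRITE = 1 << 1, /* 0: read access,      1: write access */
    X86_PF_USER  = 1 << 2, /* 0: kernel mode,      1: user mode */
    X86_PF_RSVD  = 1 << 3, /* reserved bit set in a paging entry */
    X86_PF_INSTR = 1 << 4, /* fault was an instruction fetch */
};

/* Hypothetical helper: render an error code as readable text in buf. */
const char *pf_describe(unsigned long code, char *buf, size_t len)
{
    snprintf(buf, len, "%s, %s, %s mode%s%s",
             (code & X86_PF_PROT)  ? "protection violation" : "not-present page",
             (code & X86_PF_WRITE) ? "write" : "read",
             (code & X86_PF_USER)  ? "user" : "kernel",
             (code & X86_PF_RSVD)  ? ", reserved bit set" : "",
             (code & X86_PF_INSTR) ? ", instruction fetch" : "");
    return buf;
}
```

For the value 0002 in this report only the write bit is set, so the fault is a write to a not-present page taken in kernel mode.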

> Dumping ftrace buffer:
>   (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 14626 Comm: syz-executor6 Not tainted 4.15.0-rc1-next-20171130+
> #56
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
> 01/01/2011
> task: 000000005f17dad6 task.stack: 000000005af7607c
> RIP: 0010:arch_futex_atomic_op_inuser arch/x86/include/asm/futex.h:67 [inline]
> RIP: 0010:futex_atomic_op_inuser kernel/futex.c:1588 [inline]
> RIP: 0010:futex_wake_op kernel/futex.c:1637 [inline]
> RIP: 0010:do_futex+0x14c8/0x2280 kernel/futex.c:3483
> RSP: 0018:ffff8801cffafa18 EFLAGS: 00010246
> RAX: 000000007fffffff RBX: 0000000040000002 RCX: ffffffff8164e3d9
> RDX: 0000000000000000 RSI: ffffc900034e8000 RDI: 0000000000000000
> RBP: ffff8801cffafe38 R08: 1ffffffff0d31367 R09: 0000000000000004
> R10: 0000000000000000 R11: ffffffff8748cd60 R12: ffff8801d0f30180
> R13: 0000000020000000 R14: dffffc0000000000 R15: ffff8801cffafe10
> FS:  00007f66305e0700(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: fffffffffffffff8 CR3: 00000001ccc2e000 CR4: 00000000001426f0

       ^^^^^^^^^^^^^^^^ is a totally different address, so it's either
       completely bogus or the above address is a hashed pointer, because
       that printk used to be %p and was changed to %px in 328b4ed93b69a.

> DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff3 DR7: 0000000000bb060a
> Call Trace:
> SYSC_futex kernel/futex.c:3533 [inline]
> SyS_futex+0x260/0x390 kernel/futex.c:3501
> entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x4529d9
> RSP: 002b:00007f66305dfc58 EFLAGS: 00000212 ORIG_RAX: 00000000000000ca
> RAX: ffffffffffffffda RBX: 00007f66305e0700 RCX: 00000000004529d9
> RDX: 0000000000000007 RSI: 0000000000000085 RDI: 0000000020062000
> RBP: 0000000000000000 R08: 0000000020000000 R09: 0000000040000002
> R10: 000000002085fff0 R11: 0000000000000212 R12: 0000000000000000
> R13: 0000000000a6f7ff R14: 00007f66305e09c0 R15: 0000000000000000

The arguments are:

RDI uaddr   0000000020062000
RSI op      0000000000000085
RDX val     0000000000000007
R10 utime   000000002085fff0
R8  uaddr2  0000000020000000
R9  val3    0000000040000002
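
The op value 0x85 and the R9 value 0x40000002 can be unpacked further. The sketch below uses the constants from the futex uapi header and mirrors the bit layout of the FUTEX_OP() encoding; the struct and function names are hypothetical and this is an illustration, not the kernel's own decoder.

```c
#include <stdint.h>

/* Constants as defined in the futex uapi header. */
#define FUTEX_WAKE_OP         5
#define FUTEX_PRIVATE_FLAG    128
#define FUTEX_CLOCK_REALTIME  256
#define FUTEX_CMD_MASK        (~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))
#define FUTEX_OP_XOR          4
#define FUTEX_OP_CMP_EQ       0

/* Hypothetical container for the fields packed into the wake-op word. */
struct wake_op {
    int op;      /* bits 28-30: operation applied to *uaddr2 */
    int shift;   /* bit 31: use (1 << oparg) instead of oparg */
    int cmp;     /* bits 24-27: comparison applied to the old value */
    int oparg;   /* bits 12-23: signed 12-bit operand */
    int cmparg;  /* bits 0-11:  signed 12-bit comparison operand */
};

/* Sign-extend a 12-bit field to int. */
static int sext12(uint32_t v)
{
    v &= 0xfff;
    return (v & 0x800) ? (int)v - 0x1000 : (int)v;
}

struct wake_op decode_wake_op(uint32_t encoded)
{
    struct wake_op w;
    w.op     = (encoded >> 28) & 0x7;
    w.shift  = (encoded >> 31) & 0x1;
    w.cmp    = (encoded >> 24) & 0xf;
    w.oparg  = sext12(encoded >> 12);
    w.cmparg = sext12(encoded);
    return w;
}

/* Strip the flag bits from the op argument to get the futex command. */
int futex_cmd(int op)
{
    return op & FUTEX_CMD_MASK;
}
```

With these definitions, 0x85 is FUTEX_WAKE_OP with FUTEX_PRIVATE_FLAG set, and 0x40000002 decodes to FUTEX_OP_XOR with oparg 0, compared with FUTEX_OP_CMP_EQ against 2. An XOR with operand 0 matches RDI = 0 in the kernel register dump and the xor %edi,%ecx in the faulting code.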

> Code: 31 d2 0f 1f 00 45 87 65 00 0f 1f 00 89 95 30 fc ff ff e9 1d ff ff ff e8
> 67 56 0b 00 31 d2 8b bd 00 fc ff ff 0f 1f 00 41 8b 45 00 <89> c1 31 f9 f0 41
> 0f b1 4d 00 75 f0 0f 1f 00 41 89 c4 89 95 30

and the code is:

27:   41 8b 45 00             mov    0x0(%r13),%eax
2b:*  89 c1                   mov    %eax,%ecx                <-- trapping instruction
2d:   31 f9                   xor    %edi,%ecx
2f:   f0 41 0f b1 4d 00       lock cmpxchg %ecx,0x0(%r13)
35:   75 f0                   jne    0x27

The trapping instruction cannot trap :), since it is a register to register
move. Assuming it's the move before that, the accessed location is
R13 + 0 = 0000000020000000, which is uaddr2 and entirely correct.
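
In C terms, the disassembled loop is an atomic read-modify-write of the word at uaddr2. The following user space sketch expresses the same load, xor, lock cmpxchg, retry sequence with the GCC/Clang __atomic builtins; the function name is hypothetical, and it does not model the fact that in the kernel both memory accesses target user memory and may fault.

```c
#include <stdint.h>

/*
 * Hypothetical user space analogue of the loop above:
 * atomically replace *uaddr with (*uaddr ^ oparg) and return the old value.
 */
uint32_t xor_and_fetch_old(uint32_t *uaddr, uint32_t oparg)
{
    /* mov 0x0(%r13),%eax : load the current value */
    uint32_t old = __atomic_load_n(uaddr, __ATOMIC_RELAXED);
    uint32_t new_val;

    do {
        /* mov %eax,%ecx ; xor %edi,%ecx : compute the new value */
        new_val = old ^ oparg;
        /* lock cmpxchg %ecx,0x0(%r13) ; jne : retry if *uaddr changed.
         * On failure the builtin refreshes 'old' with the current value. */
    } while (!__atomic_compare_exchange_n(uaddr, &old, new_val, 0,
                                          __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));
    return old;
}
```

Only the load and the cmpxchg touch memory, which is why the register move flagged in the oops cannot be the faulting instruction.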

What I completely fail to understand is why this triggers at all. That
code section is guarded by an extable fixup, so we should never get here.
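
As a reminder of what the extable fixup is expected to do, here is a deliberately simplified model: a sorted table maps the address of each instruction that is allowed to fault to a recovery address, and the fault handler redirects execution there instead of oopsing. All names and the absolute-address layout are assumptions made for illustration; the real kernel table layout differs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified exception table entry. */
struct ex_entry {
    uintptr_t insn;   /* address of an instruction that may fault */
    uintptr_t fixup;  /* where to resume if it does */
};

/* Binary search over a table sorted by insn; NULL if ip has no entry. */
const struct ex_entry *ex_search(const struct ex_entry *tbl, size_t n,
                                 uintptr_t ip)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tbl[mid].insn == ip)
            return &tbl[mid];
        if (tbl[mid].insn < ip)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/*
 * Model of the fault path: returns 1 and redirects *ip to the fixup
 * if the faulting instruction is covered, 0 if the fault is fatal.
 */
int ex_fixup(const struct ex_entry *tbl, size_t n, uintptr_t *ip)
{
    const struct ex_entry *e = ex_search(tbl, n, *ip);

    if (!e)
        return 0;
    *ip = e->fixup;
    return 1;
}
```

In this model the load at offset 0x27 and the cmpxchg at 0x2f would each have an entry, so a fault on either would be redirected to the fixup rather than reported as an oops.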

Is this a KASAN artifact?

Thanks,

	tglx