[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACT4Y+Zs5YCRgpsu6v781HvEUU6EZEuz=zj=D2VXQxrO3_L4aA@mail.gmail.com>
Date: Sun, 22 Mar 2020 08:03:17 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: syzbot <syzbot+3f29ca2efb056a761e38@...kaller.appspotmail.com>,
clang-built-linux <clang-built-linux@...glegroups.com>
Cc: Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, KVM list <kvm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>,
"Christopherson, Sean J" <sean.j.christopherson@...el.com>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Thomas Gleixner <tglx@...utronix.de>,
Vitaly Kuznetsov <vkuznets@...hat.com>, wanpengli@...cent.com,
"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
On Sun, Mar 22, 2020 at 7:59 AM Dmitry Vyukov <dvyukov@...gle.com> wrote:
>
> On Sun, Mar 22, 2020 at 7:43 AM syzbot
> <syzbot+3f29ca2efb056a761e38@...kaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16403223e00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
> > dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
> > compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+3f29ca2efb056a761e38@...kaller.appspotmail.com
>
> +clang-built-linux
>
> This only happens on the instance that uses clang. So potentially this
> is related to clang. The instance also uses smack lsm, but it's less
> likely to be involved I think.
> This actually started happening around Mar 6, but the ORC unwinder
> somehow fails to unwind stack and prints only questionable frames, so
> the reports were classified as "corrupted" and all thrown in the
> "corrupted reports" bucket:
> https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452
>
> There is already some discussion about this on the clang-built-linux list:
> https://groups.google.com/d/msg/clang-built-linux/Cm3VojRK69I/cfDGxIlTAwAJ
>
> The handle_external_interrupt_irqoff has some inline asm and the
> special STACK_FRAME_NON_STANDARD. So it has some potential for bad
> interaction with compilers...
>
> The commit range is presumably
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> But I don't see anything that says "it's me". The only commit that
> does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> descriptor table exits on instruction emulation":
>
> $ git log --oneline
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> virt/kvm/ arch/x86/kvm/
> 86f7e90ce840a KVM: VMX: check descriptor table exits on instruction emulation
> e951445f4d3b5 Merge tag 'kvmarm-fixes-5.6-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
> ef935c25fd648 kvm: x86: Limit the number of "kvm: disabled by bios" messages
> aaec7c03de92c KVM: x86: avoid useless copy of cpufreq policy
> 4f337faf1c55e KVM: allow disabling -Werror
> 575b255c1663c KVM: x86: allow compiling as non-module with W=1
> 7943f4acea3ca KVM: SVM: allocate AVIC data structures based on kvm_amd
> module parameter
> b3f15ec3d809c kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
> 51b2569402a38 KVM: arm/arm64: Fix up includes for trace.h
And the problem with this crash is that it happens all the time,
basically the only crash that now happens on the instance. So
effectively all kernel testing of all subsystems has stalled due to
this.
> > BUG: kernel NULL pointer dereference, address: 0000000000000086
> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
> > PGD a63a4067 P4D a63a4067 PUD a7627067 PMD 0
> > Oops: 0010 [#1] PREEMPT SMP KASAN
> > CPU: 0 PID: 9785 Comm: syz-executor.2 Not tainted 5.6.0-rc6-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:0x86
> > Code: Bad RIP value.
> > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > handle_external_interrupt_irqoff+0x154/0x280 arch/x86/kvm/vmx/vmx.c:6274
> > kvm_before_interrupt arch/x86/kvm/x86.h:343 [inline]
> > handle_external_interrupt_irqoff+0x132/0x280 arch/x86/kvm/vmx/vmx.c:6272
> > __irqentry_text_start+0x8/0x8
> > vcpu_enter_guest+0x6c77/0x9290 arch/x86/kvm/x86.c:8405
> > save_stack mm/kasan/common.c:72 [inline]
> > set_track mm/kasan/common.c:80 [inline]
> > kasan_set_free_info mm/kasan/common.c:337 [inline]
> > __kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476
> > __cache_free mm/slab.c:3426 [inline]
> > kfree+0x10a/0x220 mm/slab.c:3757
> > tomoyo_path_number_perm+0x525/0x690 security/tomoyo/file.c:736
> > security_file_ioctl+0x55/0xb0 security/security.c:1441
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > __lock_acquire+0xc5a/0x1bc0 kernel/locking/lockdep.c:3954
> > test_bit include/asm-generic/bitops/instrumented-non-atomic.h:110 [inline]
> > hlock_class kernel/locking/lockdep.c:163 [inline]
> > mark_lock+0x107/0x1650 kernel/locking/lockdep.c:3642
> > lock_acquire+0x154/0x250 kernel/locking/lockdep.c:4484
> > rcu_lock_acquire+0x9/0x30 include/linux/rcupdate.h:208
> > kvm_check_async_pf_completion+0x34e/0x360 arch/x86/kvm/../../../virt/kvm/async_pf.c:137
> > vcpu_run+0x3a3/0xd50 arch/x86/kvm/x86.c:8513
> > kvm_arch_vcpu_ioctl_run+0x419/0x880 arch/x86/kvm/x86.c:8735
> > kvm_vcpu_ioctl+0x67c/0xa80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2932
> > kvm_vm_release+0x50/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:858
> > vfs_ioctl fs/ioctl.c:47 [inline]
> > ksys_ioctl fs/ioctl.c:763 [inline]
> > __do_sys_ioctl fs/ioctl.c:772 [inline]
> > __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:770
> > do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:294
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > Modules linked in:
> > CR2: 0000000000000086
> > ---[ end trace 4da75c292cd7e3e8 ]---
> > RIP: 0010:0x86
> > Code: Bad RIP value.
> > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@...glegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
> > --
> > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@...glegroups.com.
> > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000277a0405a16bd5c9%40google.com.
Powered by blists - more mailing lists