linux-kernel - Re: unexpected kernel reboot (3)

Date:   Mon, 16 Jul 2018 12:09:46 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     syzbot <syzbot+cce9ef2dd25246f815ee@...kaller.appspotmail.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Gargi Sharma <gs051095@...il.com>, jhugo@...eaurora.org,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Laura Abbott <lauraa@...eaurora.org>,
        LKML <linux-kernel@...r.kernel.org>, linux@...inikbrodowski.net,
        Ingo Molnar <mingo@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Thomas Gleixner <tglx@...utronix.de>, thomas.lendacky@....com,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        KVM list <kvm@...r.kernel.org>,
        Jim Mattson <jmattson@...gle.com>
Subject: Re: unexpected kernel reboot (3)

On Fri, Jul 13, 2018 at 11:58 PM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Fri, 13 Jul 2018 14:39:02 -0700 syzbot <syzbot+cce9ef2dd25246f815ee@...kaller.appspotmail.com> wrote:
>
>> Hello,
>>
>> syzbot found the following crash on:
>
> hm, I don't think I've seen an "unexpected reboot" report before.
>
> Can you expand on specifically what happened here?  Did the machine
> simply magically reboot itself?  Or did an external monitor whack it,
> or...

We ran a user-space workload (not involving the reboot syscall), and
the machine suddenly rebooted. We don't know what triggered the
reboot; we only see the consequences. We've seen a few such bugs before,
e.g.:
https://syzkaller.appspot.com/bug?id=4f1db8b5e7dfcca55e20931aec0ee707c5cafc99
Usually it involves KVM. Potentially it's a bug in the outer
kernel/VMM; it may or may not be present in the tip kernel.


> Does this test distinguish from a kernel which simply locks up?

Yes. If you look at the log:

https://syzkaller.appspot.com/x/log.txt?x=17c6a6d0400000

We booted the machine, started running a program, and then boom! It
reboots without any other diagnostics. It's not a hang.
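
To illustrate the distinction, here is a minimal sketch of how a console
log could be classified. This is not how syzbot does it; the function
names, the crash marker list and the category strings are all
hypothetical. The only thing taken from the log above is the "Linux
version" banner printed at timestamp 0.000000. A second banner with no
crash report before it reads as an unexpected reboot, whereas a hang
would show a single boot and then silence.

```python
import re

# Assumption: a boot is recognized by the "Linux version" banner at
# timestamp 0.000000, as seen in the console log quoted in this thread.
BOOT_BANNER = re.compile(r"^\[\s*0\.000000\] Linux version ", re.MULTILINE)

# Assumption: a small, non-exhaustive set of kernel crash markers.
CRASH_MARKERS = ("Kernel panic", "BUG:", "WARNING:", "general protection fault")


def classify_console_log(log: str) -> str:
    """Classify a raw console log by how many boots it contains."""
    boots = list(BOOT_BANNER.finditer(log))
    if len(boots) <= 1:
        # One boot (or none): no reboot is visible in the log.
        # A hang would look like this, with no further output.
        return "single boot"
    # Look only at the output between the first and the second banner.
    first_run = log[boots[0].start():boots[1].start()]
    if any(marker in first_run for marker in CRASH_MARKERS):
        return "reboot after crash report"
    return "unexpected reboot"


if __name__ == "__main__":
    sample = (
        "[    0.000000] Linux version 4.18.0-rc4+\n"
        "[    4.836364] md: ... autorun DONE.\n"
        "[    0.000000] Linux version 4.18.0-rc4+\n"
    )
    print(classify_console_log(sample))
```

The sketch expects the raw console output, without the ">> " quoting
prefixes that appear in this email.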



>> HEAD commit:    1e4b044d2251 Linux 4.18-rc4
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=17c6a6d0400000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=25856fac4e580aa7
>> dashboard link: https://syzkaller.appspot.com/bug?extid=cce9ef2dd25246f815ee
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=165012c2400000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1571462c400000
>
> I assume the "C reproducer" is irrelevant here.
>
> Is it reproducible?

Yes, it is reproducible and the C reproducer is relevant.
If syzbot provides a reproducer, it means that it booted a clean
machine, ran the provided program (nothing else besides typical init
code and ssh/scp invocation) and that's the kernel output it observed
running this exact program.
However, in this case the exact setup can be relevant. syzbot uses GCE
VMs; it may or may not reproduce with other VMMs or physical hardware,
and sometimes such bugs depend on the exact CPU type.


>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+cce9ef2dd25246f815ee@...kaller.appspotmail.com
>>
>> output_len: 0x00000000092459b0
>> kernel_total_size: 0x000000000a505000
>> trampoline_32bit: 0x000000000009d000
>>
>> Decompressing Linux... Parsing ELF... done.
>> Booting the kernel.
>> [    0.000000] Linux version 4.18.0-rc4+ (syzkaller@ci) (gcc version 8.0.1
>> 20180413 (experimental) (GCC)) #138 SMP Mon Jul 9 10:45:11 UTC 2018
>> [    0.000000] Command line: BOOT_IMAGE=/vmlinuz root=/dev/sda1
>> console=ttyS0 earlyprintk=serial vsyscall=native rodata=n
>> ftrace_dump_on_oops=orig_cpu oops=panic panic_on_warn=1 nmi_watchdog=panic
>> panic=86400 workqueue.watchdog_thresh=140 kvm-intel.nested=1
>>
>> ...
>>
>> regulatory database
>> [    4.519364] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
>> [    4.520839] platform regulatory.0: Direct firmware load for
>> regulatory.db failed with error -2
>> [    4.522155] cfg80211: failed to load regulatory.db
>> [    4.522185] ALSA device list:
>> [    4.523499]   #0: Dummy 1
>> [    4.523951]   #1: Loopback 1
>> [    4.524389]   #2: Virtual MIDI Card 1
>> [    4.825991] input: ImExPS/2 Generic Explorer Mouse as
>> /devices/platform/i8042/serio1/input/input4
>> [    4.829533] md: Waiting for all devices to be available before autodetect
>> [    4.830562] md: If you don't use raid, use raid=noautodetect
>> [    4.835237] md: Autodetecting RAID arrays.
>> [    4.835882] md: autorun ...
>> [    4.836364] md: ... autorun DONE.
>
> Can we assume that the failure occurred in or immediately after the MD code,
> or might some output have been truncated?
>
> It would be useful to know what the kernel was initializing immediately
> after MD.  Do you have a kernel log for the same config when the kernel
> didn't fail?  Or maybe enable initcall_debug?