Message-ID: <20190626081522.GX24419@MiWiFi-R3L-srv>
Date:   Wed, 26 Jun 2019 16:15:22 +0800
From:   Baoquan He <bhe@...hat.com>
To:     airlied@...hat.com
Cc:     kexec@...ts.infradead.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, dyoung@...hat.com
Subject: mgag200 fails kdump kernel booting

Hi Dave,

We hit a kdump kernel boot failure on a Lenovo system. The kdump kernel
fails to boot; the machine just resets to firmware and reboots, and
nothing is printed out.

The machine is a big server with 6T of memory and many CPUs; its
graphics driver module is mgag200.

After adding 'earlyprintk=ttyS0' to the kernel command line, only one
line was printed to the console during kdump kernel boot:
     KASLR disabled: 'nokaslr' on cmdline.

Then the machine reset to firmware and rebooted.

Further debugging showed that the failure happens in
arch/x86/boot/compressed/misc.c, during the kernel decompression stage.
It's triggered by the VGA printing. As you can see, __putstr() in
arch/x86/boot/compressed/misc.c checks whether earlyprintk= is
specified and, if so, prints to that target. Then, whether or not
earlyprintk= is given, it also prints to VGA, and printing to VGA is
what causes the reset to firmware. That's why we see nothing when
earlyprintk= is not specified, and only the single 'KASLR disabled'
line when it is.
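
The __putstr() behaviour described above can be sketched roughly as
follows. This is a user-space model for illustration, not the kernel
source: struct sink, struct console_state and sketch_putstr() are
hypothetical names, the sinks only record characters instead of touching
hardware, and the lines/cols guard on the VGA path is an assumption of
the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for an output device: it only records text. */
struct sink {
    char buf[256];
    size_t len;
};

static void sink_put(struct sink *s, char c)
{
    /* Keep the buffer NUL-terminated; silently drop overflow. */
    if (s->len + 1 < sizeof(s->buf)) {
        s->buf[s->len++] = c;
        s->buf[s->len] = '\0';
    }
}

struct console_state {
    int serial_enabled;  /* models "earlyprintk=ttyS0 was given" */
    int lines, cols;     /* models the VGA text geometry (assumption) */
    struct sink serial;
    struct sink vga;
};

static void sketch_putstr(struct console_state *cs, const char *s)
{
    const char *p;

    /* Serial output happens only when earlyprintk= selected a port. */
    if (cs->serial_enabled) {
        for (p = s; *p; p++) {
            if (*p == '\n')
                sink_put(&cs->serial, '\r');
            sink_put(&cs->serial, *p);
        }
    }

    /* Assumed guard: no text geometry means no VGA output. */
    if (cs->lines == 0 || cs->cols == 0)
        return;

    /* The VGA path does not depend on earlyprintk= at all. */
    for (p = s; *p; p++)
        sink_put(&cs->vga, *p);
}
```

In this model the serial sink receives the message before the VGA path
is entered, which matches the observation that one line reaches ttyS0
before the reset.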

To confirm it's caused by VGA printing, I blacklisted mgag200 by
writing it into /etc/modprobe.d/blacklist.conf. With that, the kdump
kernel boots successfully. Adding 'nomodeset' also makes it work. So
it's almost certain that the mgag200 driver or related code does
something wrong when the boot code tries to re-init it.

This is the only case we have ever seen, so we tend to pursue a fix on
the mgag200 driver side. Any ideas or suggestions? We have two machines
on which we can reproduce it reliably.

Thanks
Baoquan
