linux-kernel - Re: [PATCH v1] panic: push panic() messages to the console even from the MCE nmi handler

Message-ID: <X/hUfBaYBCPqek5T@alley>
Date:   Fri, 8 Jan 2021 13:47:56 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     William Roche <william.roche@...cle.com>
Cc:     linux-kernel@...r.kernel.org,
        John Ogness <john.ogness@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH v1] panic: push panic() messages to the console even from
 the MCE nmi handler

On Mon 2021-01-04 16:15:55, William Roche wrote:
> From: William Roche <william.roche@...cle.com>
> 
> Force push panic messages to the console as panic() can be called from NMI
> interrupt handler functions where printed messages can't always reach the
> console without an explicit push provided by printk_safe_flush_on_panic()
> and console_flush_on_panic().
> This is the case with the MCE handler that can lead to a system panic
> giving information on the fatal MCE root cause that must reach the console.
> 
> Signed-off-by: William Roche <william.roche@...cle.com>
> ---
> 
> Notes:
>     	While testing MCE injection and kernel reaction, we discovered a bug
>     in the way the kernel provides the panic reason information: When dealing
>     with fatal MCE, the machine (physical or virtual) can reboot without
>     leaving any message on the console.
>     
>     	This behavior can be reproduced on Intel with the mce-inject tool
>     with a simple:
>     	# modprobe mce-inject
>     	# mce-inject test/uncorrected
>     
>     	The investigations showed that the MCE panic can be totally message-less
>     or can give a small set of messages. This behavior depends on the use of the
>     crash_kexec mechanism (using the "crashkernel" parameter). Not using this
>     parameter, we get a partial [Hardware Error] information on panic, but some
>     important notifications can be missing. And when using it, a fatal MCE can
>     panic the system without leaving any information.
>     
>     . Without "crashkernel", a Fatal MCE injection shows:
>     
>     [  212.153928] mce: Machine check injector initialized
>     [  236.730682] mce: Triggering MCE exception on CPU 0
>     [  236.731304] Disabling lock debugging due to kernel taint
>     [  236.731947] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000000000000000
>     [  236.731948] mce: [Hardware Error]: TSC 78418fb4a83f
>     [  236.731949] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1605312952 SOCKET 0 APIC 0 microcode 1
>     [  236.731949] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
>     [  236.731950] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler
>     [  236.731950] Kernel panic - not syncing: Fatal machine check
>     [  236.732047] Kernel Offset: disabled
>     
>     	The system hangs 30 seconds without any additional message, and finally
>     reboots.
>     
>     . With the use of "crashkernel", a Fatal MCE injection shows only the
>     injection message
>     
>     [   80.811708] mce: Machine check injector initialized
>     [   92.298755] mce: Triggering MCE exception on CPU 0
>     [   92.299362] Disabling lock debugging due to kernel taint
>     
>     	No other messages is displayed and the system reboots immediately.

But you can find the messages in the crashdump, can't you?

It works this way by "design". The idea is the following:

Taking any locks from NMI context might lead to a deadlock.
Re-initializing the locks might lead to a deadlock as well
because of a possible double unlock. Ignoring the locks might
lead to problems too.
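
To illustrate why trylock is the only safe way to take a lock here, this
is a minimal userspace sketch, not kernel code. It assumes a C11
atomic_flag as a stand-in for a console lock; all demo_* names are
hypothetical. If the interrupted code already holds the lock, a blocking
lock in the handler would spin forever, while trylock fails and returns.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a console lock; not the kernel's implementation. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock without waiting. Returns true on success. */
static bool demo_trylock(void)
{
	/* test_and_set returns the previous state: false means it was free. */
	return !atomic_flag_test_and_set(&demo_lock);
}

static void demo_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/*
 * Simulates a handler that interrupts arbitrary code and wants to emit
 * a message. Returns true if the message could be emitted.
 */
static bool demo_nmi_emit(void)
{
	if (!demo_trylock()) {
		/*
		 * The owner may be the very code we interrupted. It cannot
		 * run until we return, so waiting here would never finish.
		 */
		return false;
	}
	/* ... the message would be written to the console here ... */
	demo_unlock();
	return true;
}
```

When the lock is free, demo_nmi_emit() succeeds. When the caller holds the
lock before invoking it, it gives up instead of deadlocking, and the
message is lost unless somebody flushes it later.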

A compromise is needed:

1. crashdump disabled:

   console_flush_on_panic() is called. It tries hard to get the
   messages on the console because it is the only chance.

   It does console_trylock(). It is called after
   bust_spinlocks(1) so that even the console-specific locks
   are taken only with trylock, see oops_in_progress.

   BTW: There are people who do not like this because there
        is still a risk of a deadlock. Some code paths
        take locks without checking oops_in_progress.
        For these people, a more reliable reboot is more
        important because they want to have the system
        back ASAP (cloud people).


2. crashdump enabled:

  Only printk_safe_flush_on_panic() is called. It makes a best effort
  to flush messages from the per-CPU buffers into the main log buffer
  so that they can be found easily in the core.

  It is pretty reliable. It should not be needed at all once the new
  lockless ringbuffer gets fully integrated.

  It does not try to flush the messages to the console. Getting
  the crash dump is more important than risking a deadlock with
  consoles.
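
The two cases can be summarized with a small userspace model. This is only
a sketch of the decision described above, not the code in panic(); the
sketch_* names and flags are hypothetical. It also assumes that the
per-CPU buffers are flushed into the main log buffer in case 1 as well,
which is not spelled out above.

```c
#include <stdbool.h>

/* Hypothetical flags describing what the panic path attempts. */
enum sketch_flush {
	SKETCH_FLUSH_TO_LOG_BUFFER = 1 << 0, /* cf. printk_safe_flush_on_panic() */
	SKETCH_FLUSH_TO_CONSOLE    = 1 << 1, /* cf. console_flush_on_panic() */
};

/* Returns the set of flush steps attempted for the given configuration. */
static unsigned int sketch_panic_flush_plan(bool crashdump_enabled)
{
	/* Assumed for both cases: move per-CPU messages to the main log. */
	unsigned int plan = SKETCH_FLUSH_TO_LOG_BUFFER;

	/*
	 * Without a crash dump the console is the only chance to see the
	 * messages, so the risk of touching console locks is accepted.
	 * With a crash dump, getting the dump is more important.
	 */
	if (!crashdump_enabled)
		plan |= SKETCH_FLUSH_TO_CONSOLE;

	return plan;
}
```

With crashdump enabled, the plan contains only the log buffer flush, which
matches the silent console in the report.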


Best Regards,
Petr