Message-ID: <X/hUfBaYBCPqek5T@alley>
Date: Fri, 8 Jan 2021 13:47:56 +0100
From: Petr Mladek <pmladek@...e.com>
To: William Roche <william.roche@...cle.com>
Cc: linux-kernel@...r.kernel.org,
John Ogness <john.ogness@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH v1] panic: push panic() messages to the console even from
the MCE nmi handler
On Mon 2021-01-04 16:15:55, William Roche wrote:
> From: William Roche <william.roche@...cle.com>
>
> Force push panic messages to the console as panic() can be called from NMI
> interrupt handler functions where printed messages can't always reach the
> console without an explicit push provided by printk_safe_flush_on_panic()
> and console_flush_on_panic().
> This is the case with the MCE handler that can lead to a system panic
> giving information on the fatal MCE root cause that must reach the console.
>
> Signed-off-by: William Roche <william.roche@...cle.com>
> ---
>
> Notes:
> While testing MCE injection and kernel reaction, we discovered a bug
> in the way the kernel provides the panic reason information: When dealing
> with fatal MCE, the machine (physical or virtual) can reboot without
> leaving any message on the console.
>
> This behavior can be reproduced on Intel with the mce-inject tool
> with a simple:
> # modprobe mce-inject
> # mce-inject test/uncorrected
>
> The investigations showed that the MCE panic can be totally message-less
> or can give a small set of messages. This behavior depends on the use of the
> crash_kexec mechanism (using the "crashkernel" parameter). Not using this
> parameter, we get a partial [Hardware Error] information on panic, but some
> important notifications can be missing. And when using it, a fatal MCE can
> panic the system without leaving any information.
>
> . Without "crashkernel", a Fatal MCE injection shows:
>
> [ 212.153928] mce: Machine check injector initialized
> [ 236.730682] mce: Triggering MCE exception on CPU 0
> [ 236.731304] Disabling lock debugging due to kernel taint
> [ 236.731947] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 1: b000000000000000
> [ 236.731948] mce: [Hardware Error]: TSC 78418fb4a83f
> [ 236.731949] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1605312952 SOCKET 0 APIC 0 microcode 1
> [ 236.731949] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [ 236.731950] mce: [Hardware Error]: Machine check: MCIP not set in MCA handler
> [ 236.731950] Kernel panic - not syncing: Fatal machine check
> [ 236.732047] Kernel Offset: disabled
>
> The system hangs 30 seconds without any additional message, and finally
> reboots.
>
> . With the use of "crashkernel", a Fatal MCE injection shows only the
> injection message
>
> [ 80.811708] mce: Machine check injector initialized
> [ 92.298755] mce: Triggering MCE exception on CPU 0
> [ 92.299362] Disabling lock debugging due to kernel taint
>
> No other messages is displayed and the system reboots immediately.

But you can find the messages in the crashdump, can't you?

It works this way by "design". The idea is the following:

Taking any locks from NMI context might lead to a deadlock.
Re-initializing the locks might lead to a deadlock as well
because of a possible double unlock. Ignoring the locks might
lead to problems as well.

A compromise is needed:
1. crashdump disabled:
console_flush_on_panic() is called. It tries hard to get the
messages on the console because it is the only chance.
It does console_trylock(). It is called after
bust_spinlocks(1) so that even the console-specific locks
are taken only with trylock, see oops_in_progress.
BTW: There are people that do not like this because there
is still a risk of a deadlock. Some code paths
take locks without checking oops_in_progress.
For these people, more reliable reboot is more
important because they want to have the system
back ASAP (cloud people).
2. crashdump enabled:
Only printk_safe_flush_on_panic() is called. It makes a best effort
to flush messages from the per-CPU buffers into the main log buffer
so that they can be found easily in the core.
It is pretty reliable. It should not be needed at all once the new
lockless ringbuffer gets fully integrated.
It does not try to flush the messages to the console. Getting
the crash dump is more important than risking a deadlock with
consoles.
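
To illustrate the two cases above, here is a small userspace model in C.
It is only a sketch, not kernel code: the struct, the model_* function
names and the message counters are made up for illustration. It also
assumes that a failed trylock simply skips the console flush, which is
a simplification of what the real code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the two panic paths; not kernel code. */
struct panic_model {
	bool crashdump_enabled;  /* a crash kernel is loaded ("crashkernel") */
	bool console_lock_held;  /* console lock owned by the interrupted code */
	int percpu_msgs;         /* messages still in the per-CPU buffers */
	int log_msgs;            /* messages in the main log buffer */
	int console_msgs;        /* messages that reached the console */
};

/* Models printk_safe_flush_on_panic(): per-CPU buffers -> main log buffer. */
static void model_printk_safe_flush(struct panic_model *m)
{
	m->log_msgs += m->percpu_msgs;
	m->percpu_msgs = 0;
}

/*
 * Models console_flush_on_panic(): the lock is only tried, never waited for.
 * Assumption of this model: when the trylock fails, nothing is printed.
 */
static void model_console_flush(struct panic_model *m)
{
	if (m->console_lock_held)
		return;	/* trylock failed: give up rather than deadlock */
	m->console_msgs = m->log_msgs;
}

/* Models the compromise: with crashdump enabled, consoles are not touched. */
static void model_panic(struct panic_model *m)
{
	assert(m != NULL);
	model_printk_safe_flush(m);
	if (m->crashdump_enabled)
		return;	/* the messages are expected to be read from the core */
	model_console_flush(m);
}
```

With crashdump_enabled set, all messages end up in log_msgs and
console_msgs stays at zero, which matches the message-less console
seen in the report.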
Best Regards,
Petr