lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1544176279-77828-1-git-send-email-feng.tang@intel.com>
Date:   Fri,  7 Dec 2018 17:51:19 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org
Cc:     Feng Tang <feng.tang@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Kees Cook <keescook@...omium.org>,
        Borislav Petkov <bp@...e.de>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Petr Mladek <pmladek@...e.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Sasha Levin <sashal@...nel.org>, stable@...nel.org
Subject: [PATCH v4] panic: Avoid the extra noise dmesg

When kernel panic happens, it will first print the panic call stack,
then the ending msg like:

[   35.743249] ---[ end Kernel panic - not syncing: Fatal exception
[   35.749975] ------------[ cut here ]------------

The above message are very useful for debugging.

But if system is configured to not reboot on panic, say the "panic_timeout"
parameter equals 0, it will likely print out many noisy message like
WARN() call stack for each and every CPU except the panic one, messages
like below:

	WARNING: CPU: 1 PID: 280 at kernel/sched/core.c:1198 set_task_cpu+0x183/0x190
	Call Trace:
	<IRQ>
	try_to_wake_up
	default_wake_function
	autoremove_wake_function
	__wake_up_common
	__wake_up_common_lock
	__wake_up
	wake_up_klogd_work_func
	irq_work_run_list
	irq_work_tick
	update_process_times
	tick_sched_timer
	__hrtimer_run_queues
	hrtimer_interrupt
	smp_apic_timer_interrupt
	apic_timer_interrupt

For people working in console mode, the screen will first show the panic
call stack, but immediately overridded by these noisy extra messages, which
makes debugging much more difficult, as the original context gets lost on
screen.

Also these noisy messages will confuse some users, as I have seen many bug
reporters posted the noisy message into bugzilla, instead of the real panic
call stack and context.

Keeping the interrupt disabled will avoid the noisy message.

When code runs to this point, it means user has chosed to not reboot, or
do any special handling by using the panic notifier method, the only reason
to enable the interrupt may be sysrq migic key and panic_blink function
(though may not work even with irq enabled). 

So make the irq disabled by default and add a cmdline parameter
"panic_keep_irq_on" to turn it on when needed.

Signed-off-by: Feng Tang <feng.tang@...el.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Kees Cook <keescook@...omium.org>
Cc: Borislav Petkov <bp@...e.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc: Petr Mladek <pmladek@...e.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Sasha Levin <sashal@...nel.org>
Cc: stable@...nel.org
---
Changelog:

  v4:
     - make the local_irq_enable conditional and default off
       to cover possible use of interrupt/scheduling, as
       mentioned by Sergey and Petr
  v3:
     - Make the change log clearer as suggested by Andrew Morton
  v2:
     - Move the solution from hacking arch/scheduler code back
       to panic.c

 Documentation/admin-guide/kernel-parameters.txt | 6 ++++++
 kernel/panic.c                                  | 9 ++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ad8dc61..45621e9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3093,6 +3093,12 @@
 	panic_on_warn	panic() instead of WARN().  Useful to cause kdump
 			on a WARN().
 
+	panic_keep_irq_on
+			To keep the panic call stack info clean and suppress
+			noisy kernel message, the interrupt on panic cpu is
+			disabled by default. Add this to cmdline to enable
+			interrupt for panic cpu.
+
 	crash_kexec_post_notifiers
 			Run kdump after running panic-notifiers and dumping
 			kmsg. This only for the users who doubt kdump always
diff --git a/kernel/panic.c b/kernel/panic.c
index 85c4d14..04ace9b 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -40,6 +40,7 @@ static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
 bool crash_kexec_post_notifiers;
+bool panic_keep_irq_on;
 int panic_on_warn __read_mostly;
 
 int panic_timeout = CONFIG_PANIC_TIMEOUT;
@@ -322,7 +323,12 @@ void panic(const char *fmt, ...)
 	}
 #endif
 	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
-	local_irq_enable();
+
+	if (panic_keep_irq_on)
+		local_irq_enable();
+	else
+		pr_emerg("Please add panic_keep_irq_on to cmdline if you want blink/sysrq to work.");
+
 	for (i = 0; ; i += PANIC_TIMER_STEP) {
 		touch_softlockup_watchdog();
 		if (i >= i_next) {
@@ -684,6 +690,7 @@ core_param(panic, panic_timeout, int, 0644);
 core_param(panic_print, panic_print, ulong, 0644);
 core_param(pause_on_oops, pause_on_oops, int, 0644);
 core_param(panic_on_warn, panic_on_warn, int, 0644);
+core_param(panic_keep_irq_on, panic_keep_irq_on, bool, 0644);
 core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
 
 static int __init oops_setup(char *s)
-- 
2.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ