lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220113005425.74988-1-stephen.s.brennan@oracle.com>
Date:   Wed, 12 Jan 2022 16:54:25 -0800
From:   Stephen Brennan <stephen.s.brennan@...cle.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        John Ogness <john.ogness@...utronix.de>
Cc:     Stephen Brennan <stephen.s.brennan@...cle.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-kernel@...r.kernel.org
Subject: [PATCH] panic: disable optimistic spin after halting CPUs

A CPU executing with console lock spinning enabled might be halted
during a panic. Before console_flush_on_panic(), it's possible for
console_trylock() to attempt optimistic spinning, deadlocking the panic
CPU:

CPU 0 (panic CPU)             CPU 1
-----------------             ------
                              printk() {
                                vprintk_func() {
                                  vprintk_default() {
                                    vprintk_emit() {
                                      console_unlock() {
                                        console_lock_spinning_enable();
                                        ... printing to console ...
panic() {
  crash_smp_send_stop() {
    NMI  -------------------> HALT
  }
  atomic_notifier_call_chain() {
    printk() {
      ...
      console_trylock_spinnning() {
        // optimistic spin infinitely

This hang during panic can be induced when a kdump kernel is loaded, and
crash_kexec_post_notifiers=1 is present on the kernel command line. The
following script which concurrently writes to /dev/kmsg, and triggers a
panic, can result in this hang:

    #!/bin/bash
    date
    # 991 chars (based on log buffer size):
    chars="$(printf 'a%.0s' {1..991})"
    while :; do
        echo $chars > /dev/kmsg
    done &
    echo c > /proc/sysrq-trigger &
    date
    exit

Below are the hang rates for the above test case, based on v5.16-rc8
before and after this patch:

before:  197 hangs / 340 trials - 57.9%
after :    0 hangs / 245 trials -  0.0%

Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")

Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@...cle.com>
---
 include/linux/console.h |  1 +
 kernel/panic.c          |  7 +++++++
 kernel/printk/printk.c  | 17 +++++++++++++++++
 3 files changed, 25 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index a97f277cfdfa..4eeb46927d96 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -179,6 +179,7 @@ extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
 extern void console_flush_on_panic(enum con_flush_mode mode);
+extern void console_lock_spinning_disable_on_panic(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..a9340e580b20 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -265,6 +265,13 @@ void panic(const char *fmt, ...)
 		crash_smp_send_stop();
 	}
 
+	/*
+	 * Now that we've halted other CPUs, disable optimistic spinning in
+	 * printk(). This avoids deadlocking in console_trylock(), until we call
+	 * console_flush_on_panic().
+	 */
+	console_lock_spinning_disable_on_panic();
+
 	/*
 	 * Run any panic handlers, including those that might need to
 	 * add information to the kmsg dump output.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 57b132b658e1..d824fdf7d3fd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1823,6 +1823,23 @@ static int console_lock_spinning_disable_and_check(void)
 	return 1;
 }
 
+/**
+ * console_lock_spinning_disable_on_panic - disable spinning so that
+ *	a panic CPU does not enter an infinite loop
+ *
+ * This is called once all CPUs are halted. A CPU halted during a section which
+ * allowed spinning, could trigger an infinite loop in console_trylock. To avoid
+ * this, mark console_owner as NULL.
+ */
+void console_lock_spinning_disable_on_panic(void)
+{
+	WRITE_ONCE(console_owner, NULL);
+	if (raw_spin_is_locked(&console_owner_lock)) {
+		debug_locks_off();
+		raw_spin_lock_init(&console_owner_lock);
+	}
+}
+
 /**
  * console_trylock_spinning - try to get console_lock by busy waiting
  *
-- 
2.30.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ