Message-Id: <1445835169-8203-4-git-send-email-jack@suse.com>
Date:	Mon, 26 Oct 2015 05:52:46 +0100
From:	Jan Kara <jack@...e.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	LKML <linux-kernel@...r.kernel.org>, pmladek@...e.com,
	KY Srinivasan <kys@...rosoft.com>, rostedt@...dmis.org,
	Jan Kara <jack@...e.cz>
Subject: [PATCH 3/7] kernel: Avoid softlockups in stop_machine() during heavy printing

From: Jan Kara <jack@...e.cz>

When a lot of messages have accumulated in the printk buffer, printing
them (especially over a serial console) can take a long time (tens of
seconds). stop_machine() effectively makes all CPUs spin in
multi_cpu_stop(), waiting for the CPU doing the printing to print all
the messages. This triggers the NMI softlockup watchdog and the RCU
stall detector, which add even more messages to print. Since the machine
does nothing (except serving interrupts) during this time, network
connections are also dropped and other disturbances may happen.

Paper over the problem by waiting for the printk buffer to be empty
before starting to stop CPUs. In theory, a burst of new messages can be
appended to the printk buffer before the CPUs enter multi_cpu_stop(), so
this isn't a 100% solution, but it works OK in practice and I'm not
aware of a reasonably simple better solution.

Signed-off-by: Jan Kara <jack@...e.cz>
---
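Not part of the commit: a minimal userspace sketch of the drain loop
described above, for readers who want to step through the logic outside
the kernel. Everything prefixed model_ is hypothetical and exists only
for this illustration. The locking is omitted, and the batch limit is an
assumption added so that the loop visibly iterates; it does not claim to
reflect how console_unlock() behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the printk state; this is not kernel code. */
struct model_log {
	unsigned long log_next_seq;	/* seq of the next record to store */
	unsigned long console_seq;	/* seq of the next record to print */
	bool console_suspended;
};

/* Append count records to the modelled buffer. */
void model_log_store(struct model_log *log, unsigned long count)
{
	log->log_next_seq += count;
}

/*
 * Stand-in for the console_lock()/console_unlock() cycle: print at most
 * batch pending records (the batch limit is an assumption of this model).
 */
void model_console_cycle(struct model_log *log, unsigned long batch)
{
	unsigned long pending = log->log_next_seq - log->console_seq;

	log->console_seq += pending < batch ? pending : batch;
}

/*
 * Mirror of the loop in printk_log_buf_drain(): keep cycling the console
 * until nothing is pending or the console is suspended. Returns the
 * number of cycles that were needed.
 */
unsigned long model_log_buf_drain(struct model_log *log, unsigned long batch)
{
	unsigned long cycles = 0;

	assert(batch > 0);	/* a zero batch would never make progress */
	for (;;) {
		bool retry = log->console_seq != log->log_next_seq;

		if (!retry || log->console_suspended)
			break;
		model_console_cycle(log, batch);
		cycles++;
	}
	return cycles;
}
```

In this model, records stored after model_log_buf_drain() returns are
pending again right away, which is the same window the changelog admits
to for the real function. With the console marked suspended, the loop
exits immediately and leaves the pending records in place.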
 include/linux/console.h | 11 +++++++++++
 kernel/printk/printk.c  | 25 +++++++++++++++++++++++++
 kernel/stop_machine.c   |  9 +++++++++
 3 files changed, 45 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index bd194343c346..96da462cdfeb 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -150,6 +150,17 @@ extern int console_trylock(void);
 extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
+#ifdef CONFIG_SMP
+extern void printk_log_buf_drain(void);
+#else
+/*
+ * In non-SMP kernels there won't be much to drain so save some code for tiny
+ * kernels.
+ */
+static inline void printk_log_buf_drain(void)
+{
+}
+#endif
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b9bb4a7a6dff..8dc6c146d022 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2488,6 +2488,31 @@ struct tty_driver *console_device(int *index)
 	return driver;
 }
 
+/* For non-SMP kernels this function isn't used and would be pointless anyway */
+#ifdef CONFIG_SMP
+/*
+ * Wait until all messages accumulated in the printk buffer are printed to
+ * console. Note that as soon as this function returns, new messages may be
+ * added to the printk buffer by other CPUs.
+ */
+void printk_log_buf_drain(void)
+{
+	bool retry;
+	unsigned long flags;
+
+	while (1) {
+		raw_spin_lock_irqsave(&logbuf_lock, flags);
+		retry = console_seq != log_next_seq;
+		raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+		if (!retry || console_suspended)
+			break;
+		/* Cycle console_sem to wait for outstanding printing */
+		console_lock();
+		console_unlock();
+	}
+}
+#endif
+
 /*
  * Prevent further output on the passed console device so that (for example)
  * serial drivers can disable console output before suspending a port, and can
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 12484e5d5c88..e9496b4a3825 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -21,6 +21,7 @@
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
 #include <linux/lglock.h>
+#include <linux/console.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -543,6 +544,14 @@ static int __stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cp
 		return ret;
 	}
 
+	/*
+	 * If there are lots of outstanding messages, printing them can take a
+	 * long time and all cpus would be spinning waiting for the printing to
+	 * finish thus triggering NMI watchdog, RCU lockups etc. Wait for the
+	 * printing here to avoid these.
+	 */
+	printk_log_buf_drain();
+
 	/* Set the initial state and stop all online cpus. */
 	set_state(&msdata, MULTI_STOP_PREPARE);
 	return stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
-- 
2.1.4
