linux-kernel - [PATCH 02/10] printk: Try harder to get logbuf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Mon, 25 May 2015 14:46:25 +0200
From:	Petr Mladek <pmladek@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dave Anderson <anderson@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Kay Sievers <kay@...y.org>, Jiri Kosina <jkosina@...e.cz>,
	Michal Hocko <mhocko@...e.cz>, Jan Kara <jack@...e.cz>,
	linux-kernel@...r.kernel.org, Wang Long <long.wanglong@...wei.com>,
	peifeiyue@...wei.com, dzickus@...hat.com, morgan.wang@...wei.com,
	sasha.levin@...cle.com, Petr Mladek <pmladek@...e.cz>
Subject: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI

If the logbuf_lock is not available immediately, it does not mean
that there is a deadlock. We should try harder and wait a bit.

On the other hand, we must not forget that we are in NMI and the timeout
has to be rather small. It must not cause dangerous stalls.

I even got full system freeze when the timeout was 10ms and I printed
backtraces from all CPUs. In this case, all CPUs were blocked for
too long.

Signed-off-by: Petr Mladek <pmladek@...e.cz>
---
 kernel/printk/printk.c | 38 +++++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 94fcf6f0b542..e6c00d6ee8dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -231,6 +231,8 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
 
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
+/* cpu currently holding logbuf_lock */
+static unsigned int logbuf_cpu = UINT_MAX;
 /* the next printk record to read by syslog(READ) or /proc/kmsg */
 static u64 syslog_seq;
 static u32 syslog_idx;
@@ -1610,6 +1612,38 @@ static size_t cont_print_text(char *text, size_t size)
 	return textlen;
 }
 
+/*
+ * This value defines the maximum delay that we spend waiting for logbuf_lock
+ * in NMI context. 100us looks like a good compromise. Note that, for example,
+ * syslog_print_all() might hold the lock for quite some time. On the other
+ * hand, waiting 10ms caused system freeze when many backtraces were printed
+ * in NMI.
+ */
+#define TRY_LOCKBUF_LOCK_MAX_DELAY_NS 100000UL
+
+/* We must be careful in NMI when we managed to preempt a running printk */
+static int try_logbuf_lock_in_nmi(void)
+{
+	u64 start_time, current_time;
+	int this_cpu = smp_processor_id();
+
+	/* no way if we are already locked on this CPU */
+	if (logbuf_cpu == this_cpu)
+		return 0;
+
+	/* try hard to get the lock but do not wait forever */
+	start_time = cpu_clock(this_cpu);
+	current_time = start_time;
+	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
+		if (raw_spin_trylock(&logbuf_lock))
+			return 1;
+		cpu_relax();
+		current_time = cpu_clock(this_cpu);
+	}
+
+	return 0;
+}
+
 asmlinkage int vprintk_emit(int facility, int level,
 			    const char *dict, size_t dictlen,
 			    const char *fmt, va_list args)
@@ -1624,8 +1658,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 	int this_cpu;
 	int printed_len = 0;
 	bool in_sched = false;
-	/* cpu currently holding logbuf_lock in this function */
-	static unsigned int logbuf_cpu = UINT_MAX;
 
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
@@ -1672,7 +1704,7 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 */
 	if (!in_nmi()) {
 		raw_spin_lock(&logbuf_lock);
-	} else if (!raw_spin_trylock(&logbuf_lock)) {
+	} else if (!try_logbuf_lock_in_nmi()) {
 		atomic_inc(&nmi_message_lost);
 		lockdep_on();
 		local_irq_restore(flags);
-- 
1.8.5.6

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/