Message-Id: <1432557993-20458-3-git-send-email-pmladek@suse.cz>
Date: Mon, 25 May 2015 14:46:25 +0200
From: Petr Mladek <pmladek@...e.cz>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Dave Anderson <anderson@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Kay Sievers <kay@...y.org>, Jiri Kosina <jkosina@...e.cz>,
Michal Hocko <mhocko@...e.cz>, Jan Kara <jack@...e.cz>,
linux-kernel@...r.kernel.org, Wang Long <long.wanglong@...wei.com>,
peifeiyue@...wei.com, dzickus@...hat.com, morgan.wang@...wei.com,
sasha.levin@...cle.com, Petr Mladek <pmladek@...e.cz>
Subject: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI
If logbuf_lock is not available immediately, it does not necessarily
mean that there is a deadlock. We should try harder and wait a bit.

On the other hand, we must not forget that we are in NMI context, so the
timeout has to be rather small. It must not cause dangerous stalls.

I even got a full system freeze when the timeout was 10ms and I printed
backtraces from all CPUs. In that case, all CPUs were blocked for
too long.

Signed-off-by: Petr Mladek <pmladek@...e.cz>
---
kernel/printk/printk.c | 38 +++++++++++++++++++++++++++++++++++---
1 file changed, 35 insertions(+), 3 deletions(-)
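
The bounded wait described in the changelog can be tried outside the
kernel. What follows is only a rough userspace sketch under stated
assumptions, not the kernel code: it assumes C11 atomics and POSIX
clock_gettime(), and struct owned_lock, owned_lock_init(),
try_lock_bounded(), unlock(), now_ns() and the integer "self" id are
hypothetical stand-ins for logbuf_lock, logbuf_cpu,
try_logbuf_lock_in_nmi(), cpu_clock() and the CPU number.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define MAX_DELAY_NS 100000ULL	/* 100us, the budget chosen in the patch */
#define NO_OWNER     (-1)

struct owned_lock {
	atomic_flag locked;	/* plays the role of logbuf_lock */
	atomic_int  owner;	/* plays the role of logbuf_cpu */
};

static void owned_lock_init(struct owned_lock *l)
{
	atomic_flag_clear(&l->locked);
	atomic_init(&l->owner, NO_OWNER);
}

/* Monotonic time in nanoseconds; the patch uses cpu_clock() instead. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Returns true with the lock held, false if we gave up. */
static bool try_lock_bounded(struct owned_lock *l, int self)
{
	uint64_t start;

	/* Spinning can never succeed if we interrupted our own lock holder. */
	if (atomic_load(&l->owner) == self)
		return false;

	/* Retry until the budget is used up; a real spin loop would also
	 * issue a pause hint here, as cpu_relax() does in the patch. */
	start = now_ns();
	do {
		if (!atomic_flag_test_and_set_explicit(&l->locked,
						       memory_order_acquire)) {
			atomic_store(&l->owner, self);
			return true;
		}
	} while (now_ns() - start < MAX_DELAY_NS);

	return false;
}

static void unlock(struct owned_lock *l)
{
	atomic_store(&l->owner, NO_OWNER);
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller that gets false back should drop or defer its message rather
than retry, which corresponds to the nmi_message_lost accounting in the
hunk below. The owner check is what makes the same-context case return
immediately instead of burning the whole 100us budget.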
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 94fcf6f0b542..e6c00d6ee8dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -231,6 +231,8 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
#ifdef CONFIG_PRINTK
DECLARE_WAIT_QUEUE_HEAD(log_wait);
+/* cpu currently holding logbuf_lock */
+static unsigned int logbuf_cpu = UINT_MAX;
/* the next printk record to read by syslog(READ) or /proc/kmsg */
static u64 syslog_seq;
static u32 syslog_idx;
@@ -1610,6 +1612,38 @@ static size_t cont_print_text(char *text, size_t size)
return textlen;
}
+/*
+ * This value defines the maximum delay that we spend waiting for logbuf_lock
+ * in NMI context. 100us looks like a good compromise. Note that, for example,
+ * syslog_print_all() might hold the lock for quite some time. On the other
+ * hand, waiting 10ms caused a system freeze when many backtraces were
+ * printed in NMI.
+ */
+#define TRY_LOCKBUF_LOCK_MAX_DELAY_NS 100000UL
+
+/* We must be careful in NMI when we managed to preempt a running printk */
+static int try_logbuf_lock_in_nmi(void)
+{
+ u64 start_time, current_time;
+ int this_cpu = smp_processor_id();
+
+ /* no way if we are already locked on this CPU */
+ if (logbuf_cpu == this_cpu)
+ return 0;
+
+ /* try hard to get the lock but do not wait forever */
+ start_time = cpu_clock(this_cpu);
+ current_time = start_time;
+ while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
+ if (raw_spin_trylock(&logbuf_lock))
+ return 1;
+ cpu_relax();
+ current_time = cpu_clock(this_cpu);
+ }
+
+ return 0;
+}
+
asmlinkage int vprintk_emit(int facility, int level,
const char *dict, size_t dictlen,
const char *fmt, va_list args)
@@ -1624,8 +1658,6 @@ asmlinkage int vprintk_emit(int facility, int level,
int this_cpu;
int printed_len = 0;
bool in_sched = false;
- /* cpu currently holding logbuf_lock in this function */
- static unsigned int logbuf_cpu = UINT_MAX;
if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
@@ -1672,7 +1704,7 @@ asmlinkage int vprintk_emit(int facility, int level,
*/
if (!in_nmi()) {
raw_spin_lock(&logbuf_lock);
- } else if (!raw_spin_trylock(&logbuf_lock)) {
+ } else if (!try_logbuf_lock_in_nmi()) {
atomic_inc(&nmi_message_lost);
lockdep_on();
local_irq_restore(flags);
--
1.8.5.6