linux-kernel - [PATCH] printk: fixing the deadlock when calling printk in nmi handle


Message-ID: <27240C0AC20F114CBF8149A2696CBE4A10C7F6@SHSMSX101.ccr.corp.intel.com>
Date:	Wed, 4 Jul 2012 13:00:49 +0000
From:	"Liu, Chuansheng" <chuansheng.liu@...el.com>
To:	"'linux-kernel@...r.kernel.org' (linux-kernel@...r.kernel.org)" 
	<linux-kernel@...r.kernel.org>
CC:	"a.p.zijlstra@...llo.nl" <a.p.zijlstra@...llo.nl>,
	"kay@...y.org" <kay@...y.org>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"mingo@...e.hu" <mingo@...e.hu>
Subject: [PATCH] printk: fixing the deadlock when calling printk in nmi
 handle

From: liu chuansheng <chuansheng.liu@...el.com>
Subject: [PATCH] printk: fixing the deadlock when calling printk in nmi handle

The current printk implementation cannot safely be called from an
NMI handler on SMP architectures.

A typical case is the NMI handler function arch_trigger_all_cpu_backtrace_handler().

My platform has 2 CPUs. When arch_trigger_all_cpu_backtrace() is called,
both CPUs receive the NMI, and arch_trigger_all_cpu_backtrace_handler()
is called on both CPUs:

case1:
CPU0                                            CPU1
calling arch_trigger_all_cpu_backtrace()        calling printk, already holding logbuf_lock
NMI received                                    NMI received
call arch_trigger_all_cpu_backtrace_handler()   call arch_trigger_all_cpu_backtrace_handler()
obtains arch_spin_lock(&lock)                   waits for arch_spin_lock(&lock)
continues to call printk()
CPU0 is blocked on logbuf_lock                  CPU1 is blocked on arch_spin_lock(&lock)

Each CPU now waits for a lock that the other one holds, so the system deadlocks.
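
The interleaving in case1 can be replayed in a small userspace model. This is
only a sketch, not kernel code: toy_lock, toy_trylock(), toy_unlock() and
case1_would_deadlock() are hypothetical names, and backtrace_lock stands in for
the handler's arch_spin_lock(&lock). Every acquisition is a trylock, so the
model reports where a CPU would spin instead of actually hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock: a hypothetical stand-in for the kernel's raw/arch spinlocks. */
typedef struct { atomic_flag held; } toy_lock;

static bool toy_trylock(toy_lock *l)
{
	/* test_and_set returns the previous value; false means we took the lock. */
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_unlock(toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static toy_lock logbuf_lock = { ATOMIC_FLAG_INIT };
static toy_lock backtrace_lock = { ATOMIC_FLAG_INIT };

/*
 * Replays case1 step by step on a single thread.
 * Returns true if both CPUs end up waiting for a lock the other one holds.
 */
static bool case1_would_deadlock(void)
{
	/* CPU1 is inside printk and holds logbuf_lock when the NMI arrives. */
	bool cpu1_has_logbuf = toy_trylock(&logbuf_lock);
	/* CPU0's NMI handler takes the backtrace lock first. */
	bool cpu0_has_bt = toy_trylock(&backtrace_lock);
	assert(cpu1_has_logbuf && cpu0_has_bt);

	/* CPU1's NMI handler wants the backtrace lock: it would spin here. */
	bool cpu1_gets_bt = toy_trylock(&backtrace_lock);
	/* CPU0 calls printk from the handler and wants logbuf_lock: it would spin too. */
	bool cpu0_gets_logbuf = toy_trylock(&logbuf_lock);

	/* Reset the model so it can be replayed. */
	toy_unlock(&backtrace_lock);
	toy_unlock(&logbuf_lock);

	return !cpu1_gets_bt && !cpu0_gets_logbuf;
}
```

Calling case1_would_deadlock() returns true: neither CPU can make progress,
because CPU1 cannot return from the NMI to release logbuf_lock until CPU0
releases the backtrace lock, and CPU0 cannot release it until printk returns.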

case2:
CPU0                                             CPU1 (running the dmesg command)
calling arch_trigger_all_cpu_backtrace()         calling do_syslog
                                                 Obtaining the logbuf_lock
nmi interrupt received                           nmi interrupt received
....
The deadlock can also happen in this case, though not every time.

I wrote a simple interface that runs arch_trigger_all_cpu_backtrace_handler() every 5s,
and it triggers the deadlock many times.

The solution is to use trylock instead of lock on logbuf_lock when printk is called in an
NMI handler; if the lock cannot be taken, the message is dropped. In addition, printk does
not call the console write function from an NMI handler, because the normal console write
path also takes many spinlocks. With this fix, traces printed from an NMI handler are
output successfully in almost all cases.
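
The locking policy described above can be sketched in userspace as follows.
This is an illustrative model under stated assumptions, not the patched kernel
code: toy_printk(), its in_nmi parameter (standing in for the kernel's
in_nmi() check), log_buf, and console_flushes are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Toy spinlock: a hypothetical stand-in for raw_spinlock_t. */
typedef struct { atomic_flag held; } toy_lock;

static bool toy_trylock(toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_lock_spin(toy_lock *l)
{
	while (!toy_trylock(l))
		;	/* busy-wait, like raw_spin_lock() */
}

static void toy_unlock(toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static toy_lock logbuf_lock = { ATOMIC_FLAG_INIT };
static char log_buf[256];
static size_t log_len;
static int console_flushes;	/* counts simulated console writes */

/* Returns the number of bytes stored, or 0 if the message was dropped. */
static int toy_printk(const char *msg, bool in_nmi)
{
	size_t n;

	assert(msg != NULL);
	n = strlen(msg);

	if (in_nmi) {
		/* Never spin in NMI context: the lock owner may be the CPU we interrupted. */
		if (!toy_trylock(&logbuf_lock))
			return 0;
	} else {
		toy_lock_spin(&logbuf_lock);
	}

	/* Truncate rather than overflow the toy buffer. */
	if (n > sizeof(log_buf) - log_len)
		n = sizeof(log_buf) - log_len;
	memcpy(log_buf + log_len, msg, n);
	log_len += n;
	toy_unlock(&logbuf_lock);

	/* Skip the console in NMI context; the text stays in the log buffer. */
	if (!in_nmi)
		console_flushes++;

	return (int)n;
}
```

In this model, a call with in_nmi set returns 0 immediately when logbuf_lock is
already held, which is the "almost" in the description: a trace can be lost,
but the CPU never hangs. Text stored from NMI context reaches the console only
when a later call from normal context flushes it.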

Signed-off-by: liu chuansheng <chuansheng.liu@...el.com>
---
 kernel/printk.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index dba1821..de68e24 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -1275,7 +1275,7 @@ static int console_trylock_for_printk(unsigned int cpu)
 {
        int retval = 0, wake = 0;
 
-       if (console_trylock()) {
+       if (!in_nmi() && console_trylock()) {
                retval = 1;
 
                /*
@@ -1432,7 +1432,13 @@ asmlinkage int vprintk_emit(int facility, int level,
        }
 
        lockdep_off();
-       raw_spin_lock(&logbuf_lock);
+       if(unlikely(in_nmi())) {
+               if(!raw_spin_trylock(&logbuf_lock))
+                       goto out_restore_lockdep_irqs;
+       } else {
+               raw_spin_lock(&logbuf_lock);
+       }
+
        logbuf_cpu = this_cpu;
 
        if (recursion_bug) {
@@ -1524,6 +1530,7 @@ asmlinkage int vprintk_emit(int facility, int level,
        if (console_trylock_for_printk(this_cpu))
                console_unlock();
 
+out_restore_lockdep_irqs:
        lockdep_on();
 out_restore_irqs:
        local_irq_restore(flags);
-- 
1.7.0.4