linux-kernel - x86: Complete data loss during NMI lockup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-ID: <8ac97a4a-619a-1c50-c290-3a4fc4693d00@redhat.com>
Date:   Mon, 18 Mar 2019 08:25:33 -0400
From:   Prarit Bhargava <prarit@...hat.com>
To:     pmladek@...e.com
Cc:     linux-kernel@...r.kernel.org, Don Zickus <dzickus@...hat.com>
Subject: x86: Complete data loss during NMI lockup

Kernel 4.18.0 has a partial data loss when an NMI occurs but 5.0.0 has
complete data loss.

I tested with this simple kernel lockup code on kernels 4.18.0 through
5.0.0.  On 4.18.0 (and after) the expected lockup messages

 Uhhuh. NMI received for unknown reason 25 on CPU 0.
 Do you have a strange power saving mode enabled?
 Kernel panic - not syncing: NMI: Not continuing

and other debug messages are no longer printed to any console after commit
719f6a7040f1 ("printk: Use the main logbuf in NMI when logbuf_lock is available").

Things get worse with commit 03fc7f9c99c1 ("printk/nmi: Prevent deadlock when
accessing the main log buffer in NMI") after which *NO* messages (including the
important stack trace) are output to the console at all.  There is a 100% data
loss with no clue as to what happened.

You can reproduce with a very simple spinlock lockup module:

static DEFINE_SPINLOCK(prarit_lock);

static void take_lock(void *data)
{
       unsigned long flags;

       pr_emerg("Taking lock on cpu %d\n", smp_processor_id());
       spin_lock_irqsave(&prarit_lock, flags);
}

static int init_dummy(void)
{
       on_each_cpu(take_lock, NULL, 0);
       return 0;
}

and the simple patch (sorry for the cut-and-paste)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 18bc9b51ac9b..d1800ecc16fc 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -312,6 +312,7 @@ static void default_do_nmi(struct pt_regs *regs)
        int handled;
        bool b2b = false;

+       pr_emerg("NMI on CPU %d.\n", smp_processor_id());
        /*
         * CPU-specific NMI must be processed before non-CPU-specific
         * NMI, otherwise we may lose it, because the CPU-specific

During run-time a few NMIs happen, which result in

[  683.127105] NMI on CPU 0.
[  689.502992] NMI on CPU 91.
[  713.586459] NMI on CPU 92.
[  719.708041] NMI on CPU 3.
[ 1428.664558] NMI on CPU 95.
[ 2646.795310] NMI on CPU 87.

but nothing is output during the actual kernel lockup itself.  If I revert back
to the pre-719f6a7040f1 behaviour I do see the messages
on the console as well as the stack trace.

The call sequence is 5.0.0 is

nmi_enter()
	-> printk_nmi_enter()
		-> this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK);
		-> printk()
		-> vprintk_func()
			-> vprintk_nmi()
				-> printk_safe_log_store()

P.