Message-ID: <20240402103414.KkkX5RuV@linutronix.de>
Date: Tue, 2 Apr 2024 12:34:14 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: "John B. Wyatt IV" <jwyatt@...hat.com>
Cc: John Ogness <john.ogness@...utronix.de>, Petr Mladek <pmladek@...e.com>,
	Clark Williams <williams@...hat.com>,
	Juri Lelli <jlelli@...hat.com>, Derek Barbosa <debarbos@...hat.com>,
	Bruno Goncalves <bgoncalv@...hat.com>,
	"John B. Wyatt IV" <sageofredondo@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: Re: NMIs reported by console_blast.sh with 6.6.20-rt25

On 2024-03-27 19:44:20 [-0400], John B. Wyatt IV wrote:
> > where is this output from? The `ret' opcode usually does not cause a
> > trap. My guess is that the machine has been interrupted by an external
> > user at this position.
> 
> Just before the sysrq that crashes the system.

so this is intentional.

…
> > Side note: This is using early_printk, correct?
> 
> I believe so, but it might be preempted? This is the part it stopped in.
> 
> static void io_serial_out(unsigned long addr, int offset, int value)
> {
> 	outb(value, addr + offset);
> }

The function is invoked in NMI context so it can't be preempted.

> > According to this, someone issued a `crash' via sysrq. Why?
> > 
> 
> This is part of the console_blast.sh script that John Ogness sent me.
> 
> Please see below:
…

Okay. Then everything works as it should…

> > > NMI Backtrace for 6.6.20-rt25 no forced preemption with tuned throughput-performance profile
> > > -----------------------------
> > 
> > This and the following backtrace shows the same picture: The CPU is
> > crashing due to proc/sysrq request and does CPU-backtraces via NMI and
> > polls in early_printk, waiting for the UART to become idle (probably).
> > 
> > I don't see an issue here so far.
> 
> Luis Goncalves discussed it with me after reading your response. Thank
> you for your help. The NMI was needed to flush the buffers upon the
> system crashing itself. Does this part about NMI watchdog need to be
> documented?

Not sure about that one. There is _a_ _lot_ to be printed from NMI
and the NMI watchdog might trigger if nothing is touching the
NMI watchdog during the print job. Also, the crash was requested.

Sebastian
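
For readers following the thread, the polling behaviour described above can
be illustrated with a small user-space simulation. This is only a sketch
under stated assumptions, not the kernel's early_printk code: struct
sim_uart, sim_read_lsr(), sim_putc(), sim_write() and the "busy for three
polls" timing are all hypothetical names and values made up for the example.
It shows a writer that spins until the transmitter reports idle before each
byte, and counts how often a watchdog would have to be touched while
spinning. The kernel does provide touch_nmi_watchdog() for that purpose;
whether the print path discussed here calls it is not stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simulated UART; not a real hardware or kernel interface. */
#define SIM_LSR_THRE 0x20   /* "transmitter idle" status bit (16550-style) */
#define SIM_BUSY_POLLS 3    /* assumed: UART stays busy for 3 polls per byte */

struct sim_uart {
	unsigned int busy_polls;     /* polls left until the transmitter is idle */
	unsigned long polls;         /* total status reads performed */
	unsigned long wd_touches;    /* times the simulated watchdog was touched */
	char out[256];               /* bytes "sent" so far, NUL-terminated */
	size_t out_len;
};

/* Read the simulated line status register; each call counts as one poll. */
unsigned char sim_read_lsr(struct sim_uart *u)
{
	u->polls++;
	if (u->busy_polls > 0) {
		u->busy_polls--;
		return 0;               /* still transmitting the previous byte */
	}
	return SIM_LSR_THRE;            /* idle, ready for the next byte */
}

/* Busy-wait until the transmitter is idle, then emit one byte. */
void sim_putc(struct sim_uart *u, char c)
{
	while (!(sim_read_lsr(u) & SIM_LSR_THRE))
		u->wd_touches++;        /* stand-in for touching a watchdog */

	if (u->out_len < sizeof(u->out) - 1)
		u->out[u->out_len++] = c;   /* drop bytes once the buffer is full */
	u->out[u->out_len] = '\0';
	u->busy_polls = SIM_BUSY_POLLS;     /* transmitter is busy again */
}

/* Write a whole string, one polled byte at a time. */
void sim_write(struct sim_uart *u, const char *s)
{
	assert(u != NULL && s != NULL);
	for (; *s; s++)
		sim_putc(u, *s);
}
```

Starting from a zero-initialised struct sim_uart, writing a three-byte
string spins before the second and third byte. With a large amount of
output, nearly all of the time goes into that wait loop, which is why a
watchdog that is never touched during the print job could fire.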