Message-ID: <20240301101105.eUPvRzUv@linutronix.de>
Date: Fri, 1 Mar 2024 11:11:05 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: "John B. Wyatt IV" <jwyatt@...hat.com>,
John Ogness <john.ogness@...utronix.de>,
Clark Williams <williams@...hat.com>, jlelli@...hat.com,
Derek Barbosa <debarbos@...hat.com>,
"John B. Wyatt IV" <sageofredondo@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: Re: NMI Reported with console_blast.sh
On 2024-02-29 12:15:28 [+0100], Petr Mladek wrote:
> > [ T2481] Call Trace:
> > [ T2477] Kernel panic - not syncing: sysrq triggered crash
> > [ C0] NMI backtrace for cpu 0
>
> This message seems to be printed by nmi_cpu_backtrace().
>
> I am surprised. I would expect to see the backtrace printed from panic().
> It calls dump_stack() directly on the panic-CPU. And this panic() should
> be called from sysrq_handle_crash(). IMHO, it should be (normal)
> interrupt context.
>
> Is it different on RT?
No, it is all okay and the same. panic() was triggered from sysrq, which
is a threaded IRQ, and you don't see that here. The panic then triggered
an NMI backtrace on CPU0, and what you see is the NMI stack. After that
the TASK stack follows, which could have shown panic() if the backtrace
had shown anything. The stack trace entries for the NMI are prefixed
with '?', so they are guesses derived from values found on the stack,
not an actual unwind.
And this part:
> > 15: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 1a: 0f b6 8f c1 00 00 00 movzbl 0xc1(%rdi),%ecx
> > 21: 0f b7 57 08 movzwl 0x8(%rdi),%edx
> > 25: d3 e6 shl %cl,%esi
> > 27: 01 f2 add %esi,%edx
> > 29: ec in (%dx),%al
> > 2a:* 0f b6 c0 movzbl %al,%eax <-- trapping instruction
> > 2d: c3 ret
this is where an error of some kind occurred, at 0x2a. But this
instruction is innocent: it simply zeroes the upper 24 bits of the eax
register. So this isn't a crash, it is just where the CPU was at the
time this was captured.
> > <NMI>
> > dump_stack_lvl (lib/dump_stack.c:107)
> > panic (kernel/panic.c:344)
> > nmi_panic (kernel/panic.c:203)
> > watchdog_hardlockup_check (kernel/watchdog.c:180)
> > __perf_event_overflow (kernel/events/core.c:9612 (discriminator 2))
> > handle_pmi_common (arch/x86/events/intel/core.c:3052 (discriminator 1))
> > ? set_pte_vaddr_p4d (arch/x86/mm/init_64.c:307 arch/x86/mm/init_64.c:315)
> > ? flush_tlb_one_kernel (./arch/x86/include/asm/paravirt.h:81 arch/x86/mm/tlb.c:1171 arch/x86/mm/tlb.c:1126)
> > ? native_set_fixmap (arch/x86/mm/pgtable.c:679 arch/x86/mm/pgtable.c:688)
> > ? ghes_copy_tofrom_phys (drivers/acpi/apei/ghes.c:330)
> > intel_pmu_handle_irq (./arch/x86/include/asm/msr.h:84 ./arch/x86/include/asm/msr.h:118 arch/x86/events/intel/core.c:2427 arch/x86/events/intel/core.c:3118)
> > perf_event_nmi_handler (arch/x86/events/core.c:1743 arch/x86/events/core.c:1729)
> > ? native_queued_spin_lock_slowpath (./arch/x86/include/asm/vdso/processor.h:13 ./arch/x86/include/asm/vdso/processor.h:18 kernel/locking/qspinlock.c:383)
> > nmi_handle (arch/x86/kernel/nmi.c:150)
> > ? native_queued_spin_lock_slowpath (./arch/x86/include/asm/vdso/processor.h:13 ./arch/x86/include/asm/vdso/processor.h:18 kernel/locking/qspinlock.c:383)
> > default_do_nmi (arch/x86/kernel/nmi.c:351)
> > exc_nmi (arch/x86/kernel/nmi.c:545)
> > end_repeat_nmi (arch/x86/entry/entry_64.S:1394)
>
> This actually seems to be from perf_event() used by the hardlockup
> detector. It triggers NMI.
And here the stack is properly unwound; this is the third backtrace. The
second one had <NMI> in it, suggesting it came from NMI, but there were
also stack traces from tasks due to sysrq-t.
Now this last backtrace looks fine: it appears that something tried to
acquire a raw_spinlock_t and took too long, at which point the hardware
watchdog triggered and this backtrace is all we got. Sadly we don't see
the other side, so we have no idea which lock it is stuck on.
> Best Regards,
> Petr
Sebastian