linux-kernel - Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler to avoid double list

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87wnervxb7.ffs@tglx>
Date:   Thu, 12 May 2022 12:51:08 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Sean Christopherson <seanjc@...gle.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc:     "H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
        "Guilherme G . Piccoli" <gpiccoli@...lia.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler
 to avoid double list_add

Sean,

On Wed, May 11 2022 at 23:43, Sean Christopherson wrote:
> @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
>  	unsigned long msecs;
>  	local_irq_disable();
>  
> +	/*
> +	 * Invoking multiple callbacks is not currently supported, registering
> +	 * the NMI handler twice will cause a list_add() double add BUG().
> +	 * The exception is the "nop" handler in the emergency reboot path,
> +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> +	 * handler has already run, i.e. has already prepared other CPUs, the
> +	 * reboot path doesn't have any work of its to do, it just needs to
> +	 * ensure all CPUs have prepared for reboot.

This is confusing at best. The double list add is just one part of the
problem, which would be trivial enough to fix.

The real point is that after the first shoot down all other CPUs are
stuck in crash_nmi_callback() and won't respond to the second NMI.

So trying to run this twice is completely pointless and guaranteed to
run into the timeout.

> +	 */
> +	if (shootdown_callback) {
> +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> +		return;

Instead of playing games with the callback pointer, I prefer to make
this all explicit. Delta patch below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -528,10 +528,7 @@ static inline void kb_wait(void)
 	}
 }
 
-static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
-{
-	/* Nothing to do, the NMI shootdown handler disables virtualization. */
-}
+static inline void nmi_shootdown_cpus_on_restart(void);
 
 /* Use NMIs as IPIs to tell all CPUs to disable virtualization */
 static void emergency_vmx_disable_all(void)
@@ -554,7 +551,7 @@ static void emergency_vmx_disable_all(vo
 		__cpu_emergency_vmxoff();
 
 		/* Halt and exit VMX root operation on the other CPUs. */
-		nmi_shootdown_cpus(nmi_shootdown_nop);
+		nmi_shootdown_cpus_on_restart();
 	}
 }
 
@@ -829,7 +826,8 @@ static int crash_nmi_callback(unsigned i
 		return NMI_HANDLED;
 	local_irq_disable();
 
-	shootdown_callback(cpu, regs);
+	if (shootdown_callback)
+		shootdown_callback(cpu, regs);
 
 	/*
 	 * Prepare the CPU for reboot _after_ invoking the callback so that the
@@ -846,31 +844,30 @@ static int crash_nmi_callback(unsigned i
 	return NMI_HANDLED;
 }
 
-/*
- * Halt all other CPUs, calling the specified function on each of them
+/**
+ * nmi_shootdown_cpus - Stop other CPUs via NMI
+ * @callback:	Optional callback to be invoked from the NMI handler
  *
- * This function can be used to halt all other CPUs on crash
- * or emergency reboot time. The function passed as parameter
- * will be called inside a NMI handler on all CPUs.
+ * The NMI handler on the remote CPUs invokes @callback, if not
+ * NULL, first and then disables virtualization to ensure that
+ * INIT is recognized during reboot.
+ *
+ * nmi_shootdown_cpus() can only be invoked once. After the first
+ * invocation all other CPUs are stuck in crash_nmi_callback() and
+ * cannot respond to a second NMI.
  */
 void nmi_shootdown_cpus(nmi_shootdown_cb callback)
 {
 	unsigned long msecs;
+
 	local_irq_disable();
 
 	/*
-	 * Invoking multiple callbacks is not currently supported, registering
-	 * the NMI handler twice will cause a list_add() double add BUG().
-	 * The exception is the "nop" handler in the emergency reboot path,
-	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
-	 * handler has already run, i.e. has already prepared other CPUs, the
-	 * reboot path doesn't have any work of its to do, it just needs to
-	 * ensure all CPUs have prepared for reboot.
+	 * Aside of being pointless this would register the NMI handler
+	 * twice causing list corruption.
 	 */
-	if (shootdown_callback) {
-		WARN_ON_ONCE(callback != nmi_shootdown_nop);
+	if (WARN_ON_ONCE(crash_ipi_issued))
 		return;
-	}
 
 	/* Make a note of crashing cpu. Will be used in NMI callback. */
 	crashing_cpu = safe_smp_processor_id();
@@ -902,6 +899,12 @@ void nmi_shootdown_cpus(nmi_shootdown_cb
 	/* Leave the nmi callback set */
 }
 
+static inline void nmi_shootdown_cpus_on_restart(void)
+{
+	if (!crash_ipi_issued)
+		nmi_shootdown_cpus(NULL);
+}
+
 /*
  * Check if the crash dumping IPI got issued and if so, call its callback
  * directly. This function is used when we have already been in NMI handler.
@@ -929,6 +932,8 @@ void nmi_shootdown_cpus(nmi_shootdown_cb
 	/* No other CPUs to shoot down */
 }
 
+static inline void nmi_shootdown_cpus_on_restart(void) { }
+
 void run_crash_ipi_callback(struct pt_regs *regs)
 {
 }