Message-ID: <87bkw3xgcl.fsf@redhat.com>
Date:   Thu, 12 May 2022 11:14:34 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     "H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
        "Guilherme G . Piccoli" <gpiccoli@...lia.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        Michael Kelley <mikelley@...rosoft.com>,
        Dexuan Cui <decui@...rosoft.com>
Subject: Re: [PATCH 1/2] x86/crash: Disable virt in core NMI crash handler
 to avoid double list_add

Sean Christopherson <seanjc@...gle.com> writes:

> Disable virtualization in crash_nmi_callback() and skip the requested NMI
> shootdown if a shootdown has already occurred, i.e. a callback has been
> registered.  The NMI crash shootdown path doesn't play nice with multiple
> invocations, e.g. attempting to register the NMI handler multiple times
> will trigger a double list_add() and hang the system (in addition to
> multiple other issues).  If "crash_kexec_post_notifiers" is specified on
> the kernel command line, panic() will invoke crash_smp_send_stop() and
> result in a second call to nmi_shootdown_cpus() during
> native_machine_emergency_restart().
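
If I'm reading the pre-patch flow right, the offending sequence is roughly
the following (call chain paraphrased from my reading of the code, not
verbatim, so take it as a sketch):

  panic()
    crash_smp_send_stop()                       <- crash_kexec_post_notifiers set
      kdump_nmi_shootdown_cpus()
        nmi_shootdown_cpus(kdump_nmi_callback)  <- 1st handler registration
    ...
    emergency_restart()
      native_machine_emergency_restart()
        emergency_vmx_disable_all()
          nmi_shootdown_cpus(vmxoff_nmi)        <- 2nd registration, double list_add()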

crash_kexec_post_notifiers is also enabled automatically on Hyper-V (see
hv_common_init()), which means the hang should be observable there as well.
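
For reference, the relevant bit of hv_common_init() is roughly the
following (paraphrased from memory rather than copied verbatim, so the
exact condition may differ in the current tree):

	/* drivers/hv/hv_common.c, hv_common_init() -- paraphrased sketch */
	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
		crash_kexec_post_notifiers = true;
		/* ... Hyper-V panic/die notifiers get registered as well ... */
	}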

Cc: Michael and Dexuan

>
> Invoke the callback _before_ disabling virtualization, as the current
> VMCS needs to be cleared before doing VMXOFF.  Note, this results in a
> subtle change in ordering between disabling virtualization and stopping
> Intel PT on the responding CPUs.  While VMX and Intel PT do interact,
> VMXOFF and writes to MSR_IA32_RTIT_CTL do not induce faults between one
> another, which is all that matters when panicking.
>
> WARN if nmi_shootdown_cpus() is called a second time with anything other
> than the reboot path's "nop" handler, as bailing means the requested
> callback isn't being invoked.  Punt true handling of multiple shootdown
> callbacks until there's an actual use case for doing so (beyond disabling
> virtualization).
>
> Extract the disabling logic to a common helper to deduplicate code, and
> to prepare for doing the shootdown in the emergency reboot path if SVM
> is supported.
>
> Note, prior to commit ed72736183c4 ("x86/reboot: Force all cpus to exit
> VMX root if VMX is supported"), nmi_shootdown_cpus() was subtly protected
> against a second invocation by a cpu_vmx_enabled() check as the kdump
> handler would disable VMX if it ran first.
>
> Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported")
> Cc: stable@...r.kernel.org
> Reported-and-tested-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
> Cc: Vitaly Kuznetsov <vkuznets@...hat.com>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> Link: https://lore.kernel.org/all/20220427224924.592546-2-gpiccoli@igalia.com
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>  arch/x86/include/asm/reboot.h |  1 +
>  arch/x86/kernel/crash.c       | 16 +--------------
>  arch/x86/kernel/reboot.c      | 38 ++++++++++++++++++++++++++++++++---
>  3 files changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
> index 04c17be9b5fd..8f2da36435a6 100644
> --- a/arch/x86/include/asm/reboot.h
> +++ b/arch/x86/include/asm/reboot.h
> @@ -25,6 +25,7 @@ void __noreturn machine_real_restart(unsigned int type);
>  #define MRR_BIOS	0
>  #define MRR_APM		1
>  
> +void cpu_crash_disable_virtualization(void);
>  typedef void (*nmi_shootdown_cb)(int, struct pt_regs*);
>  void nmi_panic_self_stop(struct pt_regs *regs);
>  void nmi_shootdown_cpus(nmi_shootdown_cb callback);
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index e8326a8d1c5d..fe0cf83843ba 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -81,15 +81,6 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
>  	 */
>  	cpu_crash_vmclear_loaded_vmcss();
>  
> -	/* Disable VMX or SVM if needed.
> -	 *
> -	 * We need to disable virtualization on all CPUs.
> -	 * Having VMX or SVM enabled on any CPU may break rebooting
> -	 * after the kdump kernel has finished its task.
> -	 */
> -	cpu_emergency_vmxoff();
> -	cpu_emergency_svm_disable();
> -
>  	/*
>  	 * Disable Intel PT to stop its logging
>  	 */
> @@ -148,12 +139,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>  	 */
>  	cpu_crash_vmclear_loaded_vmcss();
>  
> -	/* Booting kdump kernel with VMX or SVM enabled won't work,
> -	 * because (among other limitations) we can't disable paging
> -	 * with the virt flags.
> -	 */
> -	cpu_emergency_vmxoff();
> -	cpu_emergency_svm_disable();
> +	cpu_crash_disable_virtualization();
>  
>  	/*
>  	 * Disable Intel PT to stop its logging
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index fa700b46588e..f9543a4e9b09 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -528,9 +528,9 @@ static inline void kb_wait(void)
>  	}
>  }
>  
> -static void vmxoff_nmi(int cpu, struct pt_regs *regs)
> +static void nmi_shootdown_nop(int cpu, struct pt_regs *regs)
>  {
> -	cpu_emergency_vmxoff();
> +	/* Nothing to do, the NMI shootdown handler disables virtualization. */
>  }
>  
>  /* Use NMIs as IPIs to tell all CPUs to disable virtualization */
> @@ -554,7 +554,7 @@ static void emergency_vmx_disable_all(void)
>  		__cpu_emergency_vmxoff();
>  
>  		/* Halt and exit VMX root operation on the other CPUs. */
> -		nmi_shootdown_cpus(vmxoff_nmi);
> +		nmi_shootdown_cpus(nmi_shootdown_nop);
>  	}
>  }
>  
> @@ -802,6 +802,18 @@ static nmi_shootdown_cb shootdown_callback;
>  static atomic_t waiting_for_crash_ipi;
>  static int crash_ipi_issued;
>  
> +void cpu_crash_disable_virtualization(void)
> +{
> +	/*
> +	 * Disable virtualization, i.e. VMX or SVM, so that INIT is recognized
> +	 * during reboot.  VMX blocks INIT if the CPU is post-VMXON, and SVM
> +	 * blocks INIT if GIF=0.  Note, CLGI #UDs if SVM isn't enabled, so it's
> +	 * easier to just disable SVM unconditionally.
> +	 */
> +	cpu_emergency_vmxoff();
> +	cpu_emergency_svm_disable();
> +}
> +
>  static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
>  {
>  	int cpu;
> @@ -819,6 +831,12 @@ static int crash_nmi_callback(unsigned int val, struct pt_regs *regs)
>  
>  	shootdown_callback(cpu, regs);
>  
> +	/*
> +	 * Prepare the CPU for reboot _after_ invoking the callback so that the
> +	 * callback can safely use virtualization instructions, e.g. VMCLEAR.
> +	 */
> +	cpu_crash_disable_virtualization();
> +
>  	atomic_dec(&waiting_for_crash_ipi);
>  	/* Assume hlt works */
>  	halt();
> @@ -840,6 +858,20 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
>  	unsigned long msecs;
>  	local_irq_disable();
>  
> +	/*
> +	 * Invoking multiple callbacks is not currently supported; registering
> +	 * the NMI handler twice will cause a list_add() double add BUG().
> +	 * The exception is the "nop" handler in the emergency reboot path,
> +	 * which can run after e.g. kdump's shootdown.  Do nothing if the crash
> +	 * handler has already run, i.e. has already prepared the other CPUs;
> +	 * the reboot path doesn't have any work of its own to do, it just
> +	 * needs to ensure all CPUs have prepared for reboot.
> +	 */
> +	if (shootdown_callback) {
> +		WARN_ON_ONCE(callback != nmi_shootdown_nop);
> +		return;
> +	}
> +
>  	/* Make a note of crashing cpu. Will be used in NMI callback. */
>  	crashing_cpu = safe_smp_processor_id();

Assuming I didn't get lost in the (infinite) multiverse of possible
shutdown/reboot/crash/kexec-not-kexec paths,

Reviewed-by: Vitaly Kuznetsov <vkuznets@...hat.com>

-- 
Vitaly
