linux-kernel - RE: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi

Message-ID: <DM5PR03MB2490CAE8377A83A6BE43DC0CA08C0@DM5PR03MB2490.namprd03.prod.outlook.com>
Date:   Wed, 30 Nov 2016 19:30:06 +0000
From:   KY Srinivasan <kys@...rosoft.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "devel@...uxdriverproject.org" <devel@...uxdriverproject.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Haiyang Zhang" <haiyangz@...rosoft.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Ingo Molnar" <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>
Subject: RE: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when
 unknown_nmi_panic



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@...hat.com]
> Sent: Wednesday, November 30, 2016 9:55 AM
> To: x86@...nel.org; devel@...uxdriverproject.org
> Cc: linux-kernel@...r.kernel.org; KY Srinivasan <kys@...rosoft.com>;
> Haiyang Zhang <haiyangz@...rosoft.com>; Thomas Gleixner
> <tglx@...utronix.de>; Ingo Molnar <mingo@...hat.com>; H. Peter Anvin
> <hpa@...or.com>
> Subject: [PATCH] x86/hyperv: Handle unknown NMIs on one CPU when
> unknown_nmi_panic
> 
> There is a feature in Hyper-V (Debug-VM --InjectNonMaskableInterrupt)
> which injects an NMI into the guest. Prior to WS2016 the NMI is injected
> into all CPUs of the guest; WS2016 injects it into CPU0 only. When
> unknown_nmi_panic is enabled and we'd like to do kdump, we need to
> perform some minimal cleanup so the kdump kernel will be able to
> initialize VMBus devices. This cleanup includes sending
> CHANNELMSG_UNLOAD to the host and waiting for
> CHANNELMSG_UNLOAD_RESPONSE to arrive. WS2012R2 always sends the response
> to the CPU which was used to send CHANNELMSG_REQUESTOFFERS on VMBus
> module load and not to the CPU which is sending CHANNELMSG_UNLOAD. As we
> can't do any cross-CPU work reliably on crash, we have the
> vmbus_wait_for_unload() function, which tries to read
> CHANNELMSG_UNLOAD_RESPONSE on all CPUs' message pages, and this
> sometimes works. It was discovered that in case the host wants to send
> more than one message to a secondary CPU (not the CPU running
> vmbus_wait_for_unload()) we're unable to get it: after reading the first
> message we're supposed to signal EOM by doing
> wrmsrl(HV_X64_MSR_EOM, 0), but this is per-CPU. I have a feeling that
> this was working some time ago when I implemented
> vmbus_wait_for_unload(): the host was retrying to deliver a message even
> without wrmsrl(), but apparently this doesn't work any more.
> Unfortunately there is not much we can do when all CPUs get the NMI, as
> all but the first one are getting blocked with interrupts disabled. What
> we can do is limit processing of unknown NMIs to the first CPU which
> gets one in case we're about to crash.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
Thanks Vitaly.

Acked-by: K. Y. Srinivasan <kys@...rosoft.com>


> ---
>  arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 8f44c5a..6e4181ff 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -31,6 +31,7 @@
>  #include <asm/apic.h>
>  #include <asm/timer.h>
>  #include <asm/reboot.h>
> +#include <asm/nmi.h>
> 
>  struct ms_hyperv_info ms_hyperv;
>  EXPORT_SYMBOL_GPL(ms_hyperv);
> @@ -158,6 +159,24 @@ static unsigned char hv_get_nmi_reason(void)
>  	return 0;
>  }
> 
> +/*
> + * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
> + * it difficult to process CHANNELMSG_UNLOAD in case of crash. Handle
> + * unknown NMI on the first CPU which gets it.
> + */
> +static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
> +{
> +	static atomic_t nmi_cpu = ATOMIC_INIT(-1);
> +
> +	if (!unknown_nmi_panic)
> +		return NMI_DONE;
> +
> +	if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
> +		return NMI_HANDLED;
> +
> +	return NMI_DONE;
> +}
> +
>  static void __init ms_hyperv_init_platform(void)
>  {
>  	/*
> @@ -204,6 +223,9 @@ static void __init ms_hyperv_init_platform(void)
>  	 */
>  	if (efi_enabled(EFI_BOOT))
>  		x86_platform.get_nmi_reason = hv_get_nmi_reason;
> +
> +	register_nmi_handler(NMI_LOCAL, hv_nmi_unknown, NMI_FLAG_FIRST,
> +			     "hv_nmi_unknown");
>  }
> 
>  const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
> --
> 2.9.3
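
For readers who want to try the "first CPU wins" logic of hv_nmi_unknown()
outside the kernel, here is a minimal user-space sketch. It is only an
illustration under stated assumptions: C11 <stdatomic.h> stands in for the
kernel's atomic_t and atomic_cmpxchg(), the SKETCH_* constants and the
sketch_nmi_unknown() name are hypothetical and not part of the patch, and
the CPU id and the unknown_nmi_panic setting are passed in as plain
arguments instead of being read from the running system.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's NMI_DONE and NMI_HANDLED. */
enum { SKETCH_NMI_DONE = 0, SKETCH_NMI_HANDLED = 1 };

/* -1 means "no CPU has claimed the unknown NMI yet". */
static atomic_int nmi_cpu = -1;

/*
 * User-space analogue of the handler in the patch.
 * cpu:           id of the caller (assumed; the patch uses
 *                raw_smp_processor_id()).
 * panic_enabled: stand-in for the unknown_nmi_panic setting.
 */
static int sketch_nmi_unknown(int cpu, int panic_enabled)
{
	int expected = -1;

	/* Without unknown_nmi_panic nothing is filtered. */
	if (!panic_enabled)
		return SKETCH_NMI_DONE;

	/*
	 * Only the first caller swaps -1 for its own id. Every later
	 * caller sees a value other than -1, so the exchange fails and
	 * the NMI is reported as handled (swallowed).
	 */
	if (!atomic_compare_exchange_strong(&nmi_cpu, &expected, cpu))
		return SKETCH_NMI_HANDLED;

	/* The winner returns DONE so normal processing continues. */
	return SKETCH_NMI_DONE;
}
```

With panic_enabled set, the first call returns SKETCH_NMI_DONE and records
the caller's id; every later call, including one from the same id, returns
SKETCH_NMI_HANDLED. In the patch that is what keeps the remaining CPUs out
of the unknown-NMI panic path.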