Message-ID:
<SN6PR02MB415764759BCA5070B8303AF2D4CDA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 13 Nov 2025 19:42:10 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Praveen K Paladugu <prapal@...ux.microsoft.com>, "kys@...rosoft.com"
<kys@...rosoft.com>, "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, "decui@...rosoft.com"
<decui@...rosoft.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "x86@...nel.org"
<x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>, "arnd@...db.de"
<arnd@...db.de>
CC: "anbelski@...ux.microsoft.com" <anbelski@...ux.microsoft.com>,
"easwar.hariharan@...ux.microsoft.com"
<easwar.hariharan@...ux.microsoft.com>, "nunodasneves@...ux.microsoft.com"
<nunodasneves@...ux.microsoft.com>, "skinsburskii@...ux.microsoft.com"
<skinsburskii@...ux.microsoft.com>
Subject: RE: [PATCH v4 3/3] hyperv: Cleanly shutdown root partition with MSHV
From: Praveen K Paladugu <prapal@...ux.microsoft.com>
>
> When a root partition running on MSHV is powered off, the default
> behavior is to write ACPI registers to power-off. However, this ACPI
> write is intercepted by MSHV and will result in a Machine Check
> Exception(MCE).
>
> The root partition eventually panics with a trace similar to:
>
> [ 81.306348] reboot: Power down
> [ 81.314709] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 0: b2000000c0060001
> [ 81.314711] mce: [Hardware Error]: TSC 3b8cb60a66 PPIN 11d98332458e4ea9
> [ 81.314713] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1759339405 SOCKET 0 APIC 0 microcode ffffffff
> [ 81.314715] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [ 81.314716] mce: [Hardware Error]: Machine check: Processor context corrupt
> [ 81.314717] Kernel panic - not syncing: Fatal machine check
>
> To correctly shutdown a root partition running on MSHV, sleep state
> information has be configured within mshv. Later HVCALL_ENTER_SLEEP_STATE

s/has be/has to be/ --or-- s/has be/must be/

Nit: Be consistent in capitalizing "MSHV" (or not capitalizing it).

> should be invoked as the last step in the shutdown sequence.
>
> The previous patch configures the sleep state information and this patch
> invokes HVCALL_ENTER_SLEEP_STATE to cleanly shutdown the root partition.
>
> Signed-off-by: Praveen K Paladugu <prapal@...ux.microsoft.com>
> Co-developed-by: Anatol Belski <anbelski@...ux.microsoft.com>
> Signed-off-by: Anatol Belski <anbelski@...ux.microsoft.com>
> ---
> arch/x86/hyperv/hv_init.c | 2 ++
> arch/x86/include/asm/mshyperv.h | 2 ++
> drivers/hv/mshv_common.c | 19 +++++++++++++++++++
> 3 files changed, 23 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 645b52dd732e..24824534ff8d 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -34,6 +34,7 @@
> #include <clocksource/hyperv_timer.h>
> #include <linux/highmem.h>
> #include <linux/export.h>
> +#include <asm/reboot.h>
>
> void *hv_hypercall_pg;
>
> @@ -562,6 +563,7 @@ void __init hyperv_init(void)
> * failures here.
> */
> hv_sleep_notifiers_register();
> + machine_ops.power_off = hv_machine_power_off;
> } else {
> hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> wrmsrq(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index fbc1233175ce..9082d56103ce 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -182,9 +182,11 @@ void hv_apic_init(void);
> void __init hv_init_spinlocks(void);
> bool hv_vcpu_is_preempted(int vcpu);
> void hv_sleep_notifiers_register(void);
> +void hv_machine_power_off(void);
> #else
> static inline void hv_apic_init(void) {}
> static inline void hv_sleep_notifiers_register(void) {};
> +static inline void hv_machine_power_off(void) {};
> #endif
>
> struct irq_domain *hv_create_pci_msi_domain(void);
> diff --git a/drivers/hv/mshv_common.c b/drivers/hv/mshv_common.c
> index d1a1daa52b65..0588d293a92a 100644
> --- a/drivers/hv/mshv_common.c
> +++ b/drivers/hv/mshv_common.c
> @@ -217,4 +217,23 @@ void hv_sleep_notifiers_register(void)
> pr_err("%s: cannot register reboot notifier %d\n", __func__,
> ret);
> }
> +
> +/*
> + * Power off the machine by entering S5 sleep state via Hyper-V hypercall.
> + * This call does not return if successful.
> + */
> +void hv_machine_power_off(void)
> +{
> + u64 status;
> + unsigned long flags;
> + struct hv_input_enter_sleep_state *in;
> +
> + local_irq_save(flags);
> + in = *this_cpu_ptr(hyperv_pcpu_input_arg);
> + in->sleep_state = HV_SLEEP_STATE_S5;
> +
> + status = hv_do_hypercall(HVCALL_ENTER_SLEEP_STATE, in, NULL);

As flagged by the kernel test robot, this should be

	(void)hv_do_hypercall(HVCALL_ENTER_SLEEP_STATE, in, NULL);

so that the intent to ignore the return value is explicit. The local
variable "status" can then be removed.

> + local_irq_restore(flags);
> +
> +}
> #endif
> --
> 2.51.0
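
For reference, here is a rough sketch of how hv_machine_power_off() might
look with the hypercall result cast to void and "status" dropped. It is
not the actual follow-up patch. So that it compiles outside the kernel,
everything above the function is a hypothetical userspace stand-in: the
numeric values of HVCALL_ENTER_SLEEP_STATE and HV_SLEEP_STATE_S5, the
struct layout, the stub_* variables, and the stubbed this_cpu_ptr(),
local_irq_save(), local_irq_restore() and hv_do_hypercall() are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds in userspace. The values
 * and the struct layout are placeholders, not taken from the patch. */
#define HVCALL_ENTER_SLEEP_STATE 0x0084
#define HV_SLEEP_STATE_S5        5

struct hv_input_enter_sleep_state {
	uint32_t sleep_state;
};

/* Stub state standing in for the per-CPU input page and the IRQ flag. */
static struct hv_input_enter_sleep_state stub_input_page;
static void *hyperv_pcpu_input_arg = &stub_input_page;
static int stub_irqs_off;
static int stub_hypercall_count;
static uint64_t stub_last_code;

#define this_cpu_ptr(var) (&(var))
#define local_irq_save(flags) \
	do { (flags) = (unsigned long)stub_irqs_off; stub_irqs_off = 1; } while (0)
#define local_irq_restore(flags) \
	do { stub_irqs_off = (int)(flags); } while (0)

/* Stub hypercall: records the call code and reports success. */
static uint64_t hv_do_hypercall(uint64_t control, void *input, void *output)
{
	(void)input;
	(void)output;
	stub_last_code = control;
	stub_hypercall_count++;
	return 0;
}

void hv_machine_power_off(void);

/*
 * Power off the machine by entering S5 sleep state via Hyper-V hypercall.
 * This call does not return if successful.
 */
void hv_machine_power_off(void)
{
	unsigned long flags;
	struct hv_input_enter_sleep_state *in;

	local_irq_save(flags);
	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
	in->sleep_state = HV_SLEEP_STATE_S5;

	/* The return value is deliberately ignored, hence the cast. */
	(void)hv_do_hypercall(HVCALL_ENTER_SLEEP_STATE, in, NULL);
	local_irq_restore(flags);
}
```

Only the body of hv_machine_power_off() is meant to carry over to the
kernel version; the stubs exist solely to let the sketch build and be
exercised on its own.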