Message-Id: <1587104999-28927-1-git-send-email-decui@microsoft.com>
Date: Thu, 16 Apr 2020 23:29:59 -0700
From: Dexuan Cui <decui@...rosoft.com>
To: bp@...en8.de, haiyangz@...rosoft.com, hpa@...or.com,
kys@...rosoft.com, linux-hyperv@...r.kernel.org,
linux-kernel@...r.kernel.org, mingo@...hat.com,
sthemmin@...rosoft.com, tglx@...utronix.de, x86@...nel.org,
mikelley@...rosoft.com, vkuznets@...hat.com, wei.liu@...nel.org
Cc: Dexuan Cui <decui@...rosoft.com>
Subject: [PATCH] x86/hyperv: Suspend/resume the VP assist page for hibernation

Unlike the other CPUs, CPU0 is never offlined during hibernation. So in the
resume path, the "new" kernel's VP assist page is not suspended (i.e. not
disabled), and later when we jump to the "old" kernel, the page is not
properly re-enabled for CPU0 with the page allocated by the old kernel.

So far, the VP assist page is only used by hv_apic_eoi_write(). When the
page is not properly re-enabled, hvp->apic_assist is always 0, so the
HV_X64_MSR_EOI MSR is always written. This is not ideal with respect to
performance, but Hyper-V can still correctly handle this.
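
The EOI fast path described above can be modelled in user space. This is a
minimal sketch, not the kernel code: fake_vp_assist_page, msr_eoi_writes and
model_eoi_write() are hypothetical stand-ins, and the kernel's atomic
exchange is replaced here by a plain read-and-clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU VP assist page. */
struct fake_vp_assist_page {
	uint32_t apic_assist;	/* bit 0 set by the hypervisor: no EOI needed */
};

/* Counts simulated writes to HV_X64_MSR_EOI (assumption: a plain counter). */
static unsigned int msr_eoi_writes;

/* Model of the EOI path: skip the MSR write only if bit 0 was set. */
static void model_eoi_write(struct fake_vp_assist_page *hvp)
{
	if (hvp) {
		uint32_t old = hvp->apic_assist;

		hvp->apic_assist = 0;	/* read-and-clear, non-atomic in this model */
		if (old & 0x1)
			return;		/* fast path: no MSR write */
	}
	msr_eoi_writes++;		/* slow path: the EOI MSR is written */
}
```

If the flag is never set, as happens when the page is not registered
correctly, every call in this model takes the slow path.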

The real problem is that the new kernel's page stays registered with the
hypervisor, so the hypervisor can keep writing to it after the jump and
corrupt the old kernel's memory. This sometimes causes unexpected behavior:
e.g. when the old kernel's non-boot CPUs are being onlined in the resume
path, the VM can hang or be killed due to a virtual triple fault.

Fix the issue by calling hv_cpu_die()/hv_cpu_init() in the syscore ops.

Without the fix, hibernation fails roughly once in every 300 to 500 rounds.
With the fix, hibernation passes a long-haul test of 2000 rounds.
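
Why a stale registration matters can be illustrated with a toy model. This
is a sketch under stated assumptions, not kernel code: fake_hv,
fake_hv_deliver(), model_cpu_die() and model_cpu_init() are hypothetical
names that only mimic the disable/re-enable pairing that the patch adds via
hv_cpu_die()/hv_cpu_init().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the hypervisor's view of one virtual CPU. */
struct fake_hv {
	uint32_t *assist_page;	/* page registered by the guest, or NULL */
};

/* Model of disabling the assist page (the suspend side for CPU0). */
static void model_cpu_die(struct fake_hv *hv)
{
	hv->assist_page = NULL;
}

/* Model of registering a page (the resume side for CPU0). */
static void model_cpu_init(struct fake_hv *hv, uint32_t *page)
{
	hv->assist_page = page;
}

/* The model hypervisor writes to whichever page is currently registered. */
static void fake_hv_deliver(struct fake_hv *hv)
{
	if (hv->assist_page)
		*hv->assist_page |= 0x1;
}
```

Without the die/init pair, the model keeps writing to the page that was
registered before the jump; with it, only the newly registered page is
touched.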

Fixes: 05bd330a7fd8 ("x86/hyperv: Suspend/resume the hypercall page for hibernation")
Cc: stable@...r.kernel.org
Signed-off-by: Dexuan Cui <decui@...rosoft.com>
---
 arch/x86/hyperv/hv_init.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index b0da5320bcff..4d3ce86331a3 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -72,7 +72,8 @@ static int hv_cpu_init(unsigned int cpu)
 	struct page *pg;

 	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
-	pg = alloc_page(GFP_KERNEL);
+	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
+	pg = alloc_page(GFP_ATOMIC);
 	if (unlikely(!pg))
 		return -ENOMEM;
 	*input_arg = page_address(pg);
@@ -253,6 +254,7 @@ static int __init hv_pci_init(void)
 static int hv_suspend(void)
 {
 	union hv_x64_msr_hypercall_contents hypercall_msr;
+	int ret;

 	/*
 	 * Reset the hypercall page as it is going to be invalidated
@@ -269,12 +271,17 @@ static int hv_suspend(void)
 	hypercall_msr.enable = 0;
 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);

-	return 0;
+	ret = hv_cpu_die(0);
+	return ret;
 }

 static void hv_resume(void)
 {
 	union hv_x64_msr_hypercall_contents hypercall_msr;
+	int ret;
+
+	ret = hv_cpu_init(0);
+	WARN_ON(ret);

 	/* Re-enable the hypercall page */
 	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
@@ -287,6 +294,7 @@ static void hv_resume(void)
 	hv_hypercall_pg_saved = NULL;
 }

+/* Note: when the ops are called, only CPU0 is online and IRQs are disabled. */
 static struct syscore_ops hv_syscore_ops = {
 	.suspend = hv_suspend,
 	.resume = hv_resume,
--
2.19.1