linux-kernel - RE: [PATCH] x86/hyperv: Suspend/resume the VP assist page for hibernation

Message-ID: <HK0P153MB0273A04F0585524883C46B0FBFD90@HK0P153MB0273.APCP153.PROD.OUTLOOK.COM>
Date:   Fri, 17 Apr 2020 23:47:41 +0000
From:   Dexuan Cui <decui@...rosoft.com>
To:     Wei Liu <wei.liu@...nel.org>
CC:     "bp@...en8.de" <bp@...en8.de>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        "hpa@...or.com" <hpa@...or.com>, KY Srinivasan <kys@...rosoft.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "x86@...nel.org" <x86@...nel.org>,
        Michael Kelley <mikelley@...rosoft.com>,
        vkuznets <vkuznets@...hat.com>
Subject: RE: [PATCH] x86/hyperv: Suspend/resume the VP assist page for
 hibernation

> From: Wei Liu <wei.liu@...nel.org>
> Sent: Friday, April 17, 2020 4:00 AM
> To: Dexuan Cui <decui@...rosoft.com>
> 
> On Thu, Apr 16, 2020 at 11:29:59PM -0700, Dexuan Cui wrote:
> > Unlike the other CPUs, CPU0 is never offlined during hibernation. So in the
> > resume path, the "new" kernel's VP assist page is not suspended (i.e.
> > disabled), and later when we jump to the "old" kernel, the page is not
> > properly re-enabled for CPU0 with the allocated page from the old kernel.
> >
> > So far, the VP assist page is only used by hv_apic_eoi_write(). When the
> > page is not properly re-enabled, hvp->apic_assist is always 0, so the
> > HV_X64_MSR_EOI MSR is always written. This is not ideal with respect to
> > performance, but Hyper-V can still correctly handle this.
> >
> > The issue is: the hypervisor can corrupt the old kernel memory, and hence
> > sometimes cause unexpected behaviors, e.g. when the old kernel's non-boot
> > CPUs are being onlined in the resume path, the VM can hang or be killed
> > due to virtual triple fault.
> 
> I don't quite follow here.
> 
> The first sentence is rather alarming -- why would Hyper-V corrupt
> guest's memory (kernel or not)?

Without this patch, after the VM resumes from hibernation, the hypervisor
still thinks the assist page of vCPU0 is the physical page allocated by the
"new" kernel (the "new" kernel started up freshly, loaded the saved state
of the "old" kernel from disk into memory, and jumped to the "old" kernel).
However, in the "old" kernel (which is the currently running kernel, since the
VM has resumed), that same physical page may be allocated to store something
entirely different.

Conceptually, it looks like Hyper-V writes into the assist page from time to
time, e.g. for the EOI optimization. This "corrupts" the page for the "old"
kernel.
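
To make the failure mode concrete, here is a minimal user-space sketch (not
kernel code) of a hypervisor that keeps a stale assist-page address across the
jump to the "old" kernel. Everything prefixed with sim_ is hypothetical:
physical memory is modelled as a small array of pages, and the hypervisor's
occasional write is modelled as flipping one bit in the page it believes is
the assist page. The real Hyper-V behavior is only assumed to be similar.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_NPAGES    4
#define SIM_PAGE_SIZE 64
#define SIM_NO_PAGE   (-1)

/* Hypothetical model of guest physical memory. */
static uint8_t sim_mem[SIM_NPAGES][SIM_PAGE_SIZE];

/* Hypothetical hypervisor-side state: vCPU0's assist page, if enabled. */
static int sim_hv_assist_pfn = SIM_NO_PAGE;

/* Guest enables (pfn >= 0) or disables (SIM_NO_PAGE) the assist page. */
static void sim_set_assist_page(int pfn)
{
	assert(pfn == SIM_NO_PAGE || (pfn >= 0 && pfn < SIM_NPAGES));
	sim_hv_assist_pfn = pfn;
}

/* Hypervisor writes into the assist page, e.g. for the EOI optimization. */
static void sim_hv_touch_assist_page(void)
{
	if (sim_hv_assist_pfn != SIM_NO_PAGE)
		sim_mem[sim_hv_assist_pfn][0] ^= 1;
}

/*
 * Replays the resume path and returns 1 if data that the "old" kernel
 * keeps in the page formerly used by the "new" kernel gets damaged.
 */
static int sim_resume_corrupts(int with_patch)
{
	const int new_kernel_pfn = 1;	/* assist page of the "new" kernel */
	const int old_kernel_pfn = 2;	/* assist page of the "old" kernel */
	uint8_t expected[SIM_PAGE_SIZE];

	memset(sim_mem, 0, sizeof(sim_mem));
	sim_set_assist_page(new_kernel_pfn);	/* "new" kernel boots */

	if (with_patch)
		sim_set_assist_page(SIM_NO_PAGE);	/* like hv_cpu_die(0) */

	/* Image restore: the page now holds unrelated "old" kernel data. */
	memset(sim_mem[new_kernel_pfn], 0xAB, SIM_PAGE_SIZE);
	memcpy(expected, sim_mem[new_kernel_pfn], SIM_PAGE_SIZE);

	if (with_patch)
		sim_set_assist_page(old_kernel_pfn);	/* like hv_cpu_init(0) */

	sim_hv_touch_assist_page();	/* hypervisor writes "from time to time" */

	return memcmp(expected, sim_mem[new_kernel_pfn], SIM_PAGE_SIZE) != 0;
}
```

In this model, sim_resume_corrupts(0) reports damage because the hypervisor
still writes to the page chosen by the "new" kernel, while
sim_resume_corrupts(1) does not, because the page is disabled before the jump
and re-enabled with the "old" kernel's own page afterwards.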

I'm not absolutely sure whether this explains the strange hang or triple fault
I occasionally saw in my long-haul hibernation test, but with this patch
I have not been able to reproduce the hang/triple fault again, so I think this
patch works.

> Secondly, code below only specifies cpu0. What does it do with non-boot
> cpus on the resume path?
> 
> Wei.

hyperv_init() registers hv_cpu_init()/hv_cpu_die() with the cpuhp framework:

cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online",
                       hv_cpu_init, hv_cpu_die);

In the hibernation procedure, the non-boot CPUs are automatically disabled
and reenabled, so hv_cpu_init()/hv_cpu_die() are automatically called for them,
e.g. in the resume path, see:
    hibernation_restore()
        resume_target_kernel()
            hibernate_resume_nonboot_cpu_disable()
                disable_nonboot_cpus()
            syscore_suspend()
                hv_cpu_die(0)  // Added by this patch
            swsusp_arch_resume()
                relocate_restore_code()
                restore_image()
                    jump to the old kernel and we return from
                    the swsusp_arch_suspend() in create_image()
                        syscore_resume()
                            hv_cpu_init(0) // Added by this patch.
                        suspend_enable_secondary_cpus()
                        dpm_resume_start()
                        ...
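
The ordering in the call tree above can be sketched in user space as follows.
This is only an illustration, not the patch itself: struct sim_syscore_ops and
all other sim_ names are hypothetical stand-ins for the kernel's syscore
suspend/resume callbacks, and the hv_cpu_die(0)/hv_cpu_init(0) calls are
reduced to trace entries. Suspend callbacks run in reverse registration order
and resume callbacks in registration order, which is assumed to mirror what
the kernel does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct syscore_ops. */
struct sim_syscore_ops {
	int (*suspend)(void);
	void (*resume)(void);
};

#define SIM_MAX_OPS 8
static struct sim_syscore_ops *sim_ops[SIM_MAX_OPS];
static size_t sim_nr_ops;
static char sim_trace[128];

/* Appends "event;" to the trace so the call order can be inspected. */
static void sim_log(const char *event)
{
	assert(strlen(sim_trace) + strlen(event) + 2 <= sizeof(sim_trace));
	strcat(sim_trace, event);
	strcat(sim_trace, ";");
}

static void sim_register_syscore_ops(struct sim_syscore_ops *ops)
{
	assert(sim_nr_ops < SIM_MAX_OPS);
	sim_ops[sim_nr_ops++] = ops;
}

/* Stand-ins for the callbacks that handle CPU0's VP assist page. */
static int sim_hv_suspend(void) { sim_log("hv_cpu_die(0)"); return 0; }
static void sim_hv_resume(void) { sim_log("hv_cpu_init(0)"); }

static struct sim_syscore_ops sim_hv_ops = {
	.suspend = sim_hv_suspend,
	.resume = sim_hv_resume,
};

/* Runs suspend callbacks in reverse registration order. */
static int sim_syscore_suspend(void)
{
	for (size_t i = sim_nr_ops; i-- > 0; ) {
		if (sim_ops[i]->suspend) {
			int ret = sim_ops[i]->suspend();
			if (ret)
				return ret;
		}
	}
	return 0;
}

/* Runs resume callbacks in registration order. */
static void sim_syscore_resume(void)
{
	for (size_t i = 0; i < sim_nr_ops; i++)
		if (sim_ops[i]->resume)
			sim_ops[i]->resume();
}

/* Replays the resume path and returns the recorded trace. */
static const char *sim_resume_path(void)
{
	sim_trace[0] = '\0';
	sim_nr_ops = 0;
	sim_register_syscore_ops(&sim_hv_ops);

	sim_log("disable_nonboot_cpus");	/* only CPU0 stays online */
	if (sim_syscore_suspend())
		return sim_trace;
	sim_log("jump_to_old_kernel");
	sim_syscore_resume();			/* runs in the "old" kernel */
	sim_log("enable_secondary_cpus");
	return sim_trace;
}
```

The point of the sketch is that both hooks run while CPU0 is the only online
CPU: the page is disabled before the jump and re-enabled right after it,
before the non-boot CPUs come back through the cpuhp callbacks.
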
Thanks,
-- Dexuan