[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <PH0PR21MB302564FA43E1402AD13CC706D7D49@PH0PR21MB3025.namprd21.prod.outlook.com>
Date:   Mon, 23 May 2022 23:47:30 +0000
From:   "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To:     Vit Kabele <vit.kabele@...go.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Dexuan Cui <decui@...rosoft.com>,
        KY Srinivasan <kys@...rosoft.com>,
        "rudolf.marek@...go.com" <rudolf.marek@...go.com>,
        "vit@...ele.me" <vit@...ele.me>,
        "wei.liu@...nel.org" <wei.liu@...nel.org>
Subject: RE: Hyper-V: Question about initializing hypercall interface
From: Vit Kabele <vit.kabele@...go.com> Sent: Thursday, May 19, 2022 6:11 AM
> 
> Hello,
> 
> I'm playing with the Hyper-V interface described in the documentation
> here [1] (version 6.0b) and I noticed inconsistency between the
> document and the actual code in arch/x86/hyperv/hv_init.c.
> 
> Section 3.13 Establishing the Hypercall Interface states:
> 
> > 5. The guest checks the Enable Hypercall Page bit. If it is set, the
> > interface is already active, and steps 6 and 7 should be omitted.
> > 6.  The guest finds a page within its GPA space, preferably one that
> > is not occupied by RAM, MMIO, and so on. If the page is occupied, the
> > guest should avoid using the underlying page for other purposes.
> > 7.  The guest writes a new value to the Hypercall MSR
> > (HV_X64_MSR_HYPERCALL) that includes the GPA from step 6 and sets the
> > Enable Hypercall Page bit to enable the interface.
> 
> Yet the code in hv_init.c seems to skip the step 5. and performs the
> steps 6. and 7. unconditionally. Snippet below.
> 
> ```
> rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> hypercall_msr.enable = 1;
> 
> if (hv_root_partition) {
>         ...
> } else {
>     hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
>     wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> }
> ```
> 
> 1/ I thought that the specification is written in a way that allows
> hypervisor to locate the hypercall page on its own (for whatever reason)
> and just announce the location to the guest by setting the Enable bit in
> the MSR on initial read. A guest should then not attempt to remap the
> page (point 5. above), but instead create kernel mapping to the page
> reported by the hypervisor.
Arguably, the TLFS is a bit unclear here, but the first paragraph
in Section 3.13 makes clear that the guest must select the location
for the hypercall page.  I'm certain that the hypervisor can not
pick the location and just inform the guest.  The intent of Step 5 is
to handle the case where the guest might have already set up
the hypercall page.
> 
> 2/ The Lock bit (bit 1) is ignored in the Linux implementation. If the
> hypervisor starts with Lock bit set, the init function allocates the
> hv_hypercall_pg and writes the value to the MSR, then:
>         a/ If the hypervisor ignores the write, the MSR remains unchanged,
>                 but the global variable is already set. Attempt to do a
>                 hypercall ends with call to undefined memory, because the code
>                 in hv_do_hypercall() checks the global variable against NULL,
>                 which will pass.
>         b/ The hypervisor injects #GP, in which case the guest crashes.
I would need to confirm with the Hyper-V team, but I think the Lock
bit would only be set *after* the guest OS has provided a guest page
to be used as the hypercall page.
There is code in Linux to clear the MSR and disable the hypercall
page when doing a kexec or kdump.   This is done so that the new
kernel can start "fresh" and establish its own hypercall page. That
kexec/kdump code does not check the Lock bit, and I'm not sure of
the implications if the Lock bit were found to be set in such a case.
I'll check with the Hyper-V team to get clarity on the handling
of the Lock bit in the case of trying to disable the hypercall page.
Michael  
> 
> Do I understand the specification correctly? If yes, then the issues
> here are real issues. If my understanding is wrong, what do I miss?
> 
> --
> Best regards,
> Vit Kabele
> 
> [1]: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
Powered by blists - more mailing lists
 
