Message-Id: <20241218205421.319969-1-romank@linux.microsoft.com>
Date: Wed, 18 Dec 2024 12:54:19 -0800
From: Roman Kisel <romank@...ux.microsoft.com>
To: hpa@...or.com,
kys@...rosoft.com,
bp@...en8.de,
dave.hansen@...ux.intel.com,
decui@...rosoft.com,
eahariha@...ux.microsoft.com,
haiyangz@...rosoft.com,
mingo@...hat.com,
mhklinux@...look.com,
nunodasneves@...ux.microsoft.com,
tglx@...utronix.de,
tiala@...rosoft.com,
wei.liu@...nel.org,
linux-hyperv@...r.kernel.org,
linux-kernel@...r.kernel.org,
x86@...nel.org
Cc: apais@...rosoft.com,
benhill@...rosoft.com,
ssengar@...rosoft.com,
sunilmut@...rosoft.com,
vdso@...bites.dev
Subject: [PATCH 0/2] hyperv: Fixes for get_vtl(void)
The get_vtl(void) function
* has a bug introduced by refactoring: the code started using the wrong
pointer type for the hypercall output, and
* doesn't adhere to the requirements of the Hypervisor Top-Level Functional
Specification[1, 2], as the code overlaps the input and output areas for a hypercall.
The first issue leads to a wrong computation that reproduces 100% of the time,
because a byte of data is read at the wrong offset. That in turn leads to using a
nonsensical value for the current VTL when initiating VMBus communications
("fortunately", that made it easy to catch!). As a result, the system doesn't boot.
The fix is straightforward: use the correct pointer type.
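
To illustrate how a wrong pointer type turns into a read at the wrong offset,
here is a minimal user-space sketch. The struct layouts, the VTL_MASK value and
all function names are hypothetical stand-ins chosen for the illustration; they
are not the kernel's definitions and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct reg_value {              /* assumed output: a bare 128-bit value */
    uint64_t low;
    uint64_t high;
};

struct reg_assoc {              /* assumed input-style entry: header, then value */
    uint32_t name;
    uint32_t reserved1;
    uint64_t reserved2;
    struct reg_value value;     /* lives 16 bytes into the struct */
};

#define VTL_MASK 0xFu           /* assumed: the VTL is in the low 4 bits */

/* Correct view: the output buffer starts with the value itself. */
static uint8_t vtl_from_value(const void *out)
{
    struct reg_value v;

    memcpy(&v, out, sizeof(v));
    return (uint8_t)(v.low & VTL_MASK);
}

/* Wrong view: the buffer is treated as header + value, so the read
 * lands 16 bytes past the data that was actually written. */
static uint8_t vtl_from_assoc(const void *out)
{
    struct reg_assoc a;

    memcpy(&a, out, sizeof(a));
    return (uint8_t)(a.value.low & VTL_MASK);
}

/* Fills a buffer the way this sketch assumes the hypervisor would:
 * the value at offset 0, stale bytes everywhere else. */
static void fake_hypercall_output(void *buf, size_t len, uint8_t vtl)
{
    struct reg_value v = { .low = vtl, .high = 0 };

    assert(len >= sizeof(struct reg_assoc));
    memset(buf, 0xAB, len);
    memcpy(buf, &v, sizeof(v));
}
```

With a buffer filled for VTL 2, vtl_from_value() returns 2 while
vtl_from_assoc() returns whatever the stale bytes at the shifted offset
happen to contain, which is the kind of nonsensical value described above.
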
The second issue doesn't seem to lead to any reproducible breakage just yet. It is
fixed by using the hypercall output pages allocated per-CPU. That isn't the only
or the most obvious choice, so let me elaborate on why that fix appears to be the
best of the options I could conceive of.
An alternative approach could be to use an appropriately aligned space within the
input page that doesn't overlap with the input data, as a memory optimization.
Indeed, when considering a 1,000+ vCPU VM, allocating one page per CPU makes the
system spend more time when booting and more space during its lifetime, not to
mention that the kernel running in VTL2 is expected to be CPU and memory frugal.
However, even if we saved on the hypercall output page allocation in just that
function, we'd still need to allocate the output pages later for other hypercalls
that provide services for the VTL0 guest (VTL2 works as an exclave of the
hypervisor), so we wouldn't really save any memory in the long run.
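
For reference, the constraints on the two areas can be written down as a small
check. This is a sketch under my reading of the TLFS hypercall interface
chapter[1] (8-byte alignment, no page crossing, no overlap); PAGE_SZ and the
function names are hypothetical, and addresses are plain integers so the sketch
runs in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u   /* assumed hypercall page size */

/* True if [addr, addr + len) spans more than one page. */
static bool crosses_page(uintptr_t addr, size_t len)
{
    return len != 0 && (addr / PAGE_SZ) != ((addr + len - 1) / PAGE_SZ);
}

/* True if [a, a + a_len) and [b, b + b_len) share at least one byte. */
static bool areas_overlap(uintptr_t a, size_t a_len,
                          uintptr_t b, size_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* Checks the assumed requirements for a memory-based hypercall. */
static bool hypercall_areas_ok(uintptr_t in, size_t in_len,
                               uintptr_t out, size_t out_len)
{
    if ((in % 8) != 0 || (out % 8) != 0)
        return false;
    if (crosses_page(in, in_len) || crosses_page(out, out_len))
        return false;
    return !areas_overlap(in, in_len, out, out_len);
}
```

Output placed at the start of the input page fails the check; a separate
output page passes, and so does an aligned, disjoint region later in the input
page, which is the memory-saving alternative discussed above.
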
One could also consider passing the input and output parameters in registers
to potentially make the function faster, and/or to avoid allocations altogether so
that it can be called from any context at any stage of booting the system. This
function is not in a hot path though. If it ever is, adding support for the extended
fast hypercalls (which pass input and output in the XMM registers; needed here due
to the size of the output parameters of the GetVpRegisters hypercall) would make
the function both faster and allocation-free. Then again, that is only worth doing
if either of those ever becomes a concern.
I have validated the fixes by booting the fixed kernel in VTL2 using OpenVMM and
OpenHCL[3, 4].
[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/hypercall-interface
[2] https://github.com/MicrosoftDocs/Virtualization-Documentation/tree/main/tlfs
[3] https://openvmm.dev/guide/user_guide/openhcl.html
[4] https://github.com/microsoft/OpenVMM
Roman Kisel (2):
  hyperv: Fix pointer type for the output of the hypercall in
    get_vtl(void)
  hyperv: Do not overlap the input and output hypercall areas in
    get_vtl(void)

 arch/x86/hyperv/hv_init.c   | 6 +++---
 drivers/hv/hv_common.c      | 6 +++---
 include/hyperv/hvgdk_mini.h | 3 ---
 3 files changed, 6 insertions(+), 9 deletions(-)
base-commit: 4d4ace979a3066e5c940331571e6c1c3f280d1d3
--
2.34.1