linux-kernel - RE: [PATCH v3 1/7] Drivers: hv: Introduce hv_hvcall

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB4157376DD06C1DC2E28A76B7D432A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 21 Aug 2025 19:24:15 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Mukesh R <mrathor@...ux.microsoft.com>, "kys@...rosoft.com"
	<kys@...rosoft.com>, "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
	"wei.liu@...nel.org" <wei.liu@...nel.org>, "decui@...rosoft.com"
	<decui@...rosoft.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com"
	<hpa@...or.com>, "lpieralisi@...nel.org" <lpieralisi@...nel.org>,
	"kw@...ux.com" <kw@...ux.com>, "manivannan.sadhasivam@...aro.org"
	<manivannan.sadhasivam@...aro.org>, "robh@...nel.org" <robh@...nel.org>,
	"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "arnd@...db.de" <arnd@...db.de>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-hyperv@...r.kernel.org"
	<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "linux-pci@...r.kernel.org"
	<linux-pci@...r.kernel.org>, "linux-arch@...r.kernel.org"
	<linux-arch@...r.kernel.org>
Subject: RE: [PATCH v3 1/7] Drivers: hv: Introduce hv_hvcall_*() functions for
 hypercall arguments

From: Mukesh R <mrathor@...ux.microsoft.com> Sent: Wednesday, August 20, 2025 7:58 PM
> 
> On 8/20/25 17:31, Mukesh R wrote:
> > On 4/15/25 11:07, mhkelley58@...il.com wrote:
> >> From: Michael Kelley <mhklinux@...look.com>
> >>
> >> Current code allocates the "hyperv_pcpu_input_arg", and in
> >> some configurations, the "hyperv_pcpu_output_arg". Each is a 4 KiB
> >> page of memory allocated per-vCPU. A hypercall call site disables
> >> interrupts, then uses this memory to set up the input parameters for
> >> the hypercall, read the output results after hypercall execution, and
> >> re-enable interrupts. The open coding of these steps leads to
> >> inconsistencies, and in some cases, violation of the generic
> >> requirements for the hypercall input and output as described in the
> >> Hyper-V Top Level Functional Spec (TLFS)[1].
> >>
> > <snip>
> >
> >> The new functions are realized as a single inline function that
> >> handles the most complex case, which is a hypercall with input
> >> and output, both of which contain arrays. Simpler cases are mapped to
> >> this most complex case with #define wrappers that provide zero or NULL
> >> for some arguments. Several of the arguments to this new function
> >> must be compile-time constants generated by "sizeof()"
> >> expressions. As such, most of the code in the new function can be
> >> evaluated by the compiler, with the result that the code paths are
> >> no longer than with the current open coding. The one exception is
> >> new code generated to zero the fixed-size portion of the input area
> >> in cases where it is not currently done.
> >
> > IMHO, this is unnecessary change that just obfuscates code. With status quo
> > one has the advantage of seeing what exactly is going on, one can use the
> > args any which way, change batch size any which way, and is thus flexible.

I started this patch set in response to some errors in open coding the
use of hyperv_pcpu_input/output_arg, to see if helper functions could
regularize the usage and reduce the likelihood of future errors. Balancing
the pluses and minuses of the result, in my view the helper functions are
an improvement, though not overwhelmingly so. Others may see the
tradeoffs differently, and as such I would not go to the mat in arguing the
patches must be taken. But if we don't take them, we need to go back and
clean up minor errors and inconsistencies in the open coding at some
existing hypercall call sites.

> > With time these functions only get more complicated and error prone. The
> > saving of ram is very minimal, this makes analyzing crash dumps harder,
> > and in some cases like in your patch 3/7 disables unnecessarily in error case:
> >
> > - if (count > HV_MAX_MODIFY_GPA_REP_COUNT) {
> > -  pr_err("Hyper-V: GPA count:%d exceeds supported:%lu\n", count,
> > -   HV_MAX_MODIFY_GPA_REP_COUNT);
> > + local_irq_save(flags);      <<<<<<<
> > ...

FWIW, this error case is not disabled. It is checked a few lines further down as:

+       if (count > batch_size) {
+               pr_err("Hyper-V: GPA count:%d exceeds supported:%u\n", count,
+                      batch_size);

> >
> > So, this is a nak from me. sorry.
> >
> 
> Furthermore, this makes us lose the ability to permanently map
> input/output pages in the hypervisor. So, Wei kindly undo.
> 

Could you elaborate on "lose the ability to permanently map
input/output pages in the hypervisor"? What specifically can't be
done and why?

<snip>

> >
> >> +/*
> >> + * Allocate one page that is shared between input and output args, which is
> >> + * sufficient for all current hypercalls. If a future hypercall requires
> >
> > That is incorrect. We've iommu map hypercalls that will use up entire page
> > for input. More coming as we implement ram withdrawl from the hypervisor.

At least some form of ram withdrawal is already implemented upstream as
hv_call_withdraw_memory(). The hypercall has a very small input using the
hv_setup_in() helper, but the output list of PFNs must go to a separately
allocated page so it can be retained with interrupts enabled while
__free_page() is called. The use of this separate output page predates the
introduction of the hv_setup_in() helper.

For iommu map hypercalls, what do the input and output look like? Is the
paradigm different from the typical small fixed portion plus a variable size
array of values that are fed into a rep hypercall? Is there also a large amount
of output from the hypercall? Just curious if there's a case that's fundamentally
different from the current set of hypercalls.

Michael