[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1033ff35-850c-e2f0-e2f3-1d5bf4b96a76@linux.microsoft.com>
Date: Fri, 12 Sep 2025 19:18:12 -0700
From: Mukesh R <mrathor@...ux.microsoft.com>
To: Michael Kelley <mhklinux@...look.com>,
"kys@...rosoft.com" <kys@...rosoft.com>,
"haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"hpa@...or.com" <hpa@...or.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>, "kw@...ux.com"
<kw@...ux.com>,
"manivannan.sadhasivam@...aro.org" <manivannan.sadhasivam@...aro.org>,
"robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "arnd@...db.de" <arnd@...db.de>
Cc: "x86@...nel.org" <x86@...nel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [PATCH v3 1/7] Drivers: hv: Introduce hv_hvcall_*() functions for
hypercall arguments
On 9/12/25 08:25, Michael Kelley wrote:
> From: Michael Kelley <mhklinux@...look.com> Sent: Monday, August 25, 2025 2:01 PM
>>
>> From: Mukesh R <mrathor@...ux.microsoft.com> Sent: Friday, August 22, 2025 7:25 PM
>>>
>>> On 8/21/25 19:10, Michael Kelley wrote:
>>>> From: Mukesh R <mrathor@...ux.microsoft.com> Sent: Thursday, August 21, 2025 1:50 PM
>>>>>
>>>>> On 8/21/25 12:24, Michael Kelley wrote:
>>>>>> From: Mukesh R <mrathor@...ux.microsoft.com> Sent: Wednesday, August 20, 2025 7:58 PM
>>>>>>>
>>>>>>> On 8/20/25 17:31, Mukesh R wrote:
>>>>>>>> With time these functions only get more complicated and error prone. The
>>>>>>>> saving of ram is very minimal, this makes analyzing crash dumps harder,
>>>>>>>> and in some cases like in your patch 3/7 disables unnecessarily in error case:
>>>>>>>>
>>>>>>>> - if (count > HV_MAX_MODIFY_GPA_REP_COUNT) {
>>>>>>>> - pr_err("Hyper-V: GPA count:%d exceeds supported:%lu\n", count,
>>>>>>>> - HV_MAX_MODIFY_GPA_REP_COUNT);
>>>>>>>> + local_irq_save(flags); <<<<<<<
>>>>>>>> ...
>>>>>>
>>>>>> FWIW, this error case is not disabled. It is checked a few lines further down as:
>>>>>
>>>>> I meant disabled interrupts. The check moves after disabling interrupts, so
>>>>> it runs "disabled" in traditional OS terminology :).
>>>>
>>>> Got it. But why is it problem to make this check with interrupts disabled?
>>>
>>> You are creating disabling overhead where that overhead previously
>>> did not exist.
>>
>> I'm not clear on what you mean by "disabling overhead". The existing code
>> does the following:
>>
>> 1) Validate that "count" is not too big, and return an error if it is.
>> 2) Disable interrupts
>> 3) Populate the per-cpu hypercall input arg
>> 4) Make the hypercall
>> 5) Re-enable interrupts
>>
>> With the patch, steps 1 and 2 are done in a different order:
>>
>> 2) Disable interrupts
>> 1) Validate that "count" is not too big. Re-enable interrupts and return an error if it is.
>> 3) Populate the per-cpu hypercall input arg
>> 4) Make the hypercall
>> 5) Re-enable interrupts
>>
>> Validating "count" with interrupts disabled is probably an additional
>> 2 or 3 instructions executed with interrupts disabled, which is negligible
>> compared to the thousands (or more) of instructions the hypercall will
>> execute with interrupts disabled.
>>
>> Or are you referring to something else as "disabling overhead"?
>
> Mukesh -- anything further on what you see as the problem here?
> I'm just not getting what your concern is.
It increases the interrupts disabled window, does a print from
interrupts disabled (not a great idea unless it is pr_emerg and system
is crashing), and in case of actual error of (count > batch_size)
interrupts are getting enabled and disabled that were not before.
> [snip]
>
>>>>>>> Furthermore, this makes us lose the ability to permanently map
>>>>>>> input/output pages in the hypervisor. So, Wei kindly undo.
>>>>>>>
>>>>>>
>>>>>> Could you elaborate on "lose the ability to permanently map
>>>>>> input/output pages in the hypervisor"? What specifically can't be
>>>>>> done and why?
>>>>>
>>>>> Input and output are mapped at fixed GPA/SPA always to avoid hyp
>>>>> having to map/unmap every time.
>>>>
>>>> OK. But how does this patch set impede doing a fixed mapping?
>>>
>>> The output address can be varied depending on the hypercall, instead
>>> of it being fixed always at fixed address:
>>>
>>> *(void **)output = space + offset; <<<<<<
>>
>> Agreed. But since mappings from GPA to SPA are page granular, having
>> such a fixed mapping means that there's a mapping for every byte in
>> the page containing the GPA to the corresponding byte in the SPA,
>> right? So even though the offset above may vary across hypercalls,
>> the output GPA still refers to the same page (since the offset is always
>> less than 4096), and that page has a fixed mapping. I would expect the
>> hypercall code in the hypervisor to look for an existing mapping based
>> on the output page, not the output address that includes the offset.
>> But I'm haven't looked at the hypervisor code. If the Hyper-V folks say
>> that a non-zero offset thwarts finding the existing mapping, what does
>> the hypervisor end up doing? Creating a 2nd mapping wouldn't seem
>> to make sense. So I'm really curious about what's going on ....
>>
>
> Again, any further information about why we "lose the ability to
> permanently map input/output pages"? It seems doubtful to me
> that an offset within the same page would make any difference,
> but maybe Hyper-V is doing something unexpected. If so, I'd like
> to know more about what that is.
>
> Michael
you've to pass the offset/pointer ever time, and hyp has to map
that instead of just per cpu permanent mapping.
-Mukesh
Powered by blists - more mailing lists