Message-ID:
<SN6PR02MB415752B5971A2D29EBB385B1D4CC2@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Fri, 28 Feb 2025 00:50:35 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Nuno Das Neves <nunodasneves@...ux.microsoft.com>, "kys@...rosoft.com"
<kys@...rosoft.com>, "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, "decui@...rosoft.com"
<decui@...rosoft.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com"
<hpa@...or.com>, "lpieralisi@...nel.org" <lpieralisi@...nel.org>,
"kw@...ux.com" <kw@...ux.com>, "manivannan.sadhasivam@...aro.org"
<manivannan.sadhasivam@...aro.org>, "robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "arnd@...db.de" <arnd@...db.de>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-arch@...r.kernel.org"
<linux-arch@...r.kernel.org>
Subject: RE: [PATCH 3/7] x86/hyperv: Use hv_hvcall_*() to set up hypercall
arguments -- part 1
From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Thursday, February 27, 2025 1:08 PM
>
> On 2/26/2025 12:06 PM, mhkelley58@...il.com wrote:
> > From: Michael Kelley <mhklinux@...look.com>
> >
> > Update hypercall call sites to use the new hv_hvcall_*() functions
> > to set up hypercall arguments. Since these functions zero the
> > fixed portion of input memory, remove now redundant calls to memset()
> > and explicit zero'ing of input fields.
> >
> > Signed-off-by: Michael Kelley <mhklinux@...look.com>
> > ---
> > arch/x86/hyperv/hv_apic.c | 6 ++----
> > arch/x86/hyperv/hv_init.c | 5 +----
> > arch/x86/hyperv/hv_vtl.c | 8 ++------
> > arch/x86/hyperv/irqdomain.c | 10 ++++------
> > 4 files changed, 9 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> > index f022d5f64fb6..c16f81dd36fc 100644
> > --- a/arch/x86/hyperv/hv_apic.c
> > +++ b/arch/x86/hyperv/hv_apic.c
> > @@ -115,14 +115,12 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
> > return false;
> >
> > local_irq_save(flags);
> > - ipi_arg = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > -
> > + hv_hvcall_in_array(&ipi_arg, sizeof(*ipi_arg),
> > + sizeof(ipi_arg->vp_set.bank_contents[0]));
> I think the returned "batch size" should be checked to ensure it is not too small to hold the
> variable-sized part of the header.
Are you saying to check the batch_size against nr_bank (which is the
size of the array embedded in vp_set)?
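For illustration only, a check of this kind can be modeled in user space as below. This is a simplified sketch, not the real helper: the 4096-byte page size and the names model_batch_size() and model_banks_fit() are assumptions made up for the example. It assumes the batch size is the number of array entries that fit in the page after the fixed header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of the per-cpu hypercall argument page (illustrative). */
#define MODEL_PAGE_SIZE 4096u

/*
 * Hypothetical model of a "batch size": how many array entries fit in
 * the page after the fixed-size portion of the input.
 */
static size_t model_batch_size(size_t fixed_size, size_t entry_size)
{
	if (entry_size == 0 || fixed_size > MODEL_PAGE_SIZE)
		return 0;
	return (MODEL_PAGE_SIZE - fixed_size) / entry_size;
}

/*
 * True if nr_bank entries of bank_contents[] fit within batch_size.
 * A negative nr_bank is treated as an error from the caller's
 * cpumask-to-VP-set conversion.
 */
static bool model_banks_fit(int nr_bank, size_t batch_size)
{
	return nr_bank >= 0 && (size_t)nr_bank <= batch_size;
}
```

With a 16-byte fixed header and 8-byte entries, this model allows 510 banks, so the check would only fail for an unusually large VP set.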
>
> > if (unlikely(!ipi_arg))
> > goto ipi_mask_ex_done;
> >
> While here, is this check really needed? If so, maybe a check for the percpu page(s) could be
> baked into hv_hvcall_inout_array()?
Yes, I wanted to bake the check into hv_hvcall_inout_array(). But this
check really is needed. hv_send_ipi(), which can propagate down to
__send_ipi_mask_ex(), can get called early during boot before the per-cpu
hypercall argument page is allocated. The lack of the hypercall argument
page must cleanly propagate back to hv_send_ipi() so it can use the native
APIC "send IPI" function that works without a hypercall, but is slower.
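The fallback described above can be sketched in user space roughly as follows. All names here (model_arg_page, model_send_ipi(), and so on) are hypothetical stand-ins, not the kernel functions. The sketch only shows how a missing argument page is reported back to the caller, which then takes the slower native path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_ipi_path { MODEL_IPI_HYPERCALL, MODEL_IPI_NATIVE_APIC };

/* Stand-in for the per-cpu argument page; NULL until it is "allocated". */
static void *model_arg_page;
static char model_page_storage[4096];

/* Simulates the point during boot where the page becomes available. */
static void model_alloc_arg_page(void)
{
	model_arg_page = model_page_storage;
}

/* Hypercall path: fails cleanly if the argument page does not exist yet. */
static bool model_send_ipi_hypercall(int vector)
{
	void *arg = model_arg_page;

	if (!arg)
		return false;
	(void)vector;	/* a real implementation would fill in the input here */
	return true;
}

/* Caller: try the hypercall first, fall back to the native APIC path. */
static enum model_ipi_path model_send_ipi(int vector)
{
	if (model_send_ipi_hypercall(vector))
		return MODEL_IPI_HYPERCALL;
	return MODEL_IPI_NATIVE_APIC;
}
```

Before model_alloc_arg_page() runs, model_send_ipi() reports the native path. Afterwards it reports the hypercall path.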
>
> > ipi_arg->vector = vector;
> > - ipi_arg->reserved = 0;
> > - ipi_arg->vp_set.valid_bank_mask = 0;
> >
> > /*
> > * Use HV_GENERIC_SET_ALL and avoid converting cpumask to VP_SET
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index ddeb40930bc8..c5c9511cb7ed 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -400,13 +400,10 @@ static u8 __init get_vtl(void)
> > u64 ret;
> >
> > local_irq_save(flags);
> > - input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > - output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> >
> > - memset(input, 0, struct_size(input, names, 1));
> > + hv_hvcall_inout(&input, sizeof(*input), &output, sizeof(*output));
>
> This doesn't look right, this is a rep hypercall taking an array of register names
> and outputting an array of register values.
>
> hv_hvcall_inout_array() should be fully utilized (input and output arrays) here.
>
> The current code may actually work, but it will overlap the input and output!
Yep. I messed this up. Not sure why. :-( Will fix.
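To make the overlap concrete, here is a small user-space model. It assumes the output area is placed directly after the 8-byte-aligned input area in the same page. The struct layout and the model_* names are illustrative assumptions, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative input layout: fixed header followed by a flexible array. */
struct model_get_vp_regs_in {
	uint64_t partition_id;
	uint32_t vp_index;
	uint8_t input_vtl;
	uint8_t rsvd[3];
	uint64_t names[];	/* not counted by sizeof() */
};

/* Assumed placement rule: output starts after the aligned input area. */
static size_t model_output_offset(size_t fixed_size, size_t entry_size,
				  size_t nr_entries)
{
	size_t in_end = fixed_size + entry_size * nr_entries;

	return (in_end + 7) & ~(size_t)7;
}

/* True if the bytes actually written as input run into the output area. */
static bool model_regions_overlap(size_t in_used_end, size_t out_start)
{
	return out_start < in_used_end;
}
```

Sizing the input as only sizeof(*input) puts the output at offset 16, which is exactly where names[0] is written. Accounting for one array entry moves the output to offset 24 and removes the overlap.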
>
> > input->partition_id = HV_PARTITION_ID_SELF;
> > input->vp_index = HV_VP_INDEX_SELF;
> > - input->input_vtl.as_uint8 = 0;
> > input->names[0] = HV_REGISTER_VSM_VP_STATUS;
> >
> > ret = hv_do_hypercall(control, input, output);
> > diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
> > index 3f4e20d7b724..3dd27d548db6 100644
> > --- a/arch/x86/hyperv/hv_vtl.c
> > +++ b/arch/x86/hyperv/hv_vtl.c
> <snip>
> > @@ -185,13 +184,10 @@ static int hv_vtl_apicid_to_vp_id(u32 apic_id)
> >
> > local_irq_save(irq_flags);
> >
> > - input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > - memset(input, 0, sizeof(*input));
> > + hv_hvcall_inout(&input, sizeof(*input), &output, sizeof(*output));
> This has the same issue as above - it is a rep hypercall so needs hv_hvcall_inout_array()
Agreed. Will fix.
>
> > input->partition_id = HV_PARTITION_ID_SELF;
> > input->apic_ids[0] = apic_id;
> >
> > - output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> > -
> > control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_ID_FROM_APIC_ID;
> > status = hv_do_hypercall(control, input, output);
> > ret = output[0];
> > diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
> > index 64b921360b0f..803b1a945c5c 100644
> > --- a/arch/x86/hyperv/irqdomain.c
> > +++ b/arch/x86/hyperv/irqdomain.c
> > @@ -24,11 +24,11 @@ static int hv_map_interrupt(union hv_device_id device_id, bool level,
> >
> > local_irq_save(flags);
> >
> > - input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > - output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> > + hv_hvcall_inout_array(&input, sizeof(*input),
> > + sizeof(input->interrupt_descriptor.target.vp_set.bank_contents[0]),
> > + &output, sizeof(*output), 0);
> As noted before I think the batch size should be checked to ensure it is large enough.
>
> Side note - it seems in this hypercall, nr_banks + 1 is used as the varhead size, which
> counts the vp valid mask, but this is not the case in __send_ipi_mask_ex(). Do you happen
> to know why that might be?
Interesting discrepancy. Right off the bat, I don't know why. The comment
in hv_map_interrupt() is very specific and sounds like it knows what it is
talking about. hyperv_flush_tlb_others_ex() does it like __send_ipi_mask_ex(),
but hv_arch_irq_unmask() does it like hv_map_interrupt(), and with the same
comment. So two votes for each way. :-( I'll research further.
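The two conventions differ by exactly one unit of variable header, as the following sketch shows. It assumes the variable header size is counted in 8-byte units. The function names are made up for the example and do not say which convention is correct.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size in bytes of one unit of variable header. */
#define MODEL_VARHEAD_UNIT 8u

/* Convention used by __send_ipi_mask_ex(): count only the banks. */
static unsigned int model_varhead_banks_only(unsigned int nr_bank)
{
	return nr_bank;
}

/* Convention used by hv_map_interrupt(): also count valid_bank_mask. */
static unsigned int model_varhead_with_mask(unsigned int nr_bank)
{
	return nr_bank + 1;
}

/* Convert a variable header size in units to bytes. */
static size_t model_varhead_bytes(unsigned int units)
{
	return (size_t)units * MODEL_VARHEAD_UNIT;
}
```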

Thanks for the careful review and flagging the mistakes!

Michael