Message-ID:
<SN6PR02MB415710BED37BDD375B0D769AD4D52@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Fri, 7 Mar 2025 23:21:07 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Nuno Das Neves <nunodasneves@...ux.microsoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-arch@...r.kernel.org"
<linux-arch@...r.kernel.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>
CC: "kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
<haiyangz@...rosoft.com>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>, "catalin.marinas@....com"
<catalin.marinas@....com>, "will@...nel.org" <will@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>, "mingo@...hat.com"
<mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com"
<hpa@...or.com>, "daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
"joro@...tes.org" <joro@...tes.org>, "robin.murphy@....com"
<robin.murphy@....com>, "arnd@...db.de" <arnd@...db.de>,
"jinankjain@...ux.microsoft.com" <jinankjain@...ux.microsoft.com>,
"muminulrussell@...il.com" <muminulrussell@...il.com>,
"skinsburskii@...ux.microsoft.com" <skinsburskii@...ux.microsoft.com>,
"mrathor@...ux.microsoft.com" <mrathor@...ux.microsoft.com>,
"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
"apais@...ux.microsoft.com" <apais@...ux.microsoft.com>,
"Tianyu.Lan@...rosoft.com" <Tianyu.Lan@...rosoft.com>,
"stanislav.kinsburskiy@...il.com" <stanislav.kinsburskiy@...il.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"vkuznets@...hat.com" <vkuznets@...hat.com>, "prapal@...ux.microsoft.com"
<prapal@...ux.microsoft.com>, "muislam@...rosoft.com"
<muislam@...rosoft.com>, "anrayabh@...ux.microsoft.com"
<anrayabh@...ux.microsoft.com>, "rafael@...nel.org" <rafael@...nel.org>,
"lenb@...nel.org" <lenb@...nel.org>, "corbet@....net" <corbet@....net>
Subject: RE: [PATCH v5 07/10] Drivers: hv: Introduce per-cpu event ring tail

From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Friday, March 7, 2025 2:07 PM
>
> On 3/7/2025 9:02 AM, Michael Kelley wrote:
> > From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Wednesday, February 26, 2025 3:08 PM
> >>
> >> Add a pointer hv_synic_eventring_tail to track the tail pointer for the
> >> SynIC event ring buffer for each SINT.
> >>
> >> This will be used by the mshv driver, but must be tracked independently
> >> since the driver module could be removed and re-inserted.
> >>
> >> Signed-off-by: Nuno Das Neves <nunodasneves@...ux.microsoft.com>
> >> Reviewed-by: Wei Liu <wei.liu@...nel.org>
> >> ---
> >> drivers/hv/hv_common.c | 34 ++++++++++++++++++++++++++++++++--
> >> 1 file changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> >> index 252fd66ad4db..2763cb6d3678 100644
> >> --- a/drivers/hv/hv_common.c
> >> +++ b/drivers/hv/hv_common.c
> >> @@ -68,6 +68,16 @@ static void hv_kmsg_dump_unregister(void);
> >>
> >> static struct ctl_table_header *hv_ctl_table_hdr;
> >>
> >> +/*
> >> + * Per-cpu array holding the tail pointer for the SynIC event ring buffer
> >> + * for each SINT.
> >> + *
> >> + * We cannot maintain this in mshv driver because the tail pointer should
> >> + * persist even if the mshv driver is unloaded.
> >> + */
> >> +u8 __percpu **hv_synic_eventring_tail;
> >
> > I think the "__percpu" is in the wrong place here. This placement
> > is likely to cause errors from the "sparse" tool. It should be
> >
> > u8 * __percpu *hv_synic_eventring_tail;
> >
> > See the way hyperv_pcpu_input_arg, for example, is defined. And
> > see commit db3c65bc3a13 where I fixed hyperv_pcpu_input_arg.
> >
> Thanks. I'll fix it.
>
> >> +EXPORT_SYMBOL_GPL(hv_synic_eventring_tail);
> >
> > The "extern" declaration for this variable is in Patch 10 of the series
> > in drivers/hv/mshv_root.h. I guess that's OK, but I would normally
> > expect to find such a declaration in the header file associated with
> > where the variable is defined, which in this case is mshyperv.h.
> > Perhaps you are trying to restrict its usage to just mshv?
> >
> Yes, that's the idea - it should only be used by the driver.
>
> >> +
> >> /*
> >> * Hyper-V specific initialization and shutdown code that is
> >> * common across all architectures. Called from architecture
> >> @@ -90,6 +100,9 @@ void __init hv_common_free(void)
> >>
> >> free_percpu(hyperv_pcpu_input_arg);
> >> hyperv_pcpu_input_arg = NULL;
> >> +
> >> + free_percpu(hv_synic_eventring_tail);
> >> + hv_synic_eventring_tail = NULL;
> >> }
> >>
> >> /*
> >> @@ -372,6 +385,11 @@ int __init hv_common_init(void)
> >> BUG_ON(!hyperv_pcpu_output_arg);
> >> }
> >>
> >> + if (hv_root_partition()) {
> >> + hv_synic_eventring_tail = alloc_percpu(u8 *);
> >> + BUG_ON(hv_synic_eventring_tail == NULL);
> >> + }
> >> +
> >> hv_vp_index = kmalloc_array(nr_cpu_ids, sizeof(*hv_vp_index),
> >> GFP_KERNEL);
> >> if (!hv_vp_index) {
> >> @@ -460,6 +478,7 @@ void __init ms_hyperv_late_init(void)
> >> int hv_common_cpu_init(unsigned int cpu)
> >> {
> >> void **inputarg, **outputarg;
> >> + u8 **synic_eventring_tail;
> >> u64 msr_vp_index;
> >> gfp_t flags;
> >> const int pgcount = hv_output_page_exists() ? 2 : 1;
> >> @@ -472,8 +491,8 @@ int hv_common_cpu_init(unsigned int cpu)
> >> inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> >>
> >> /*
> >> - * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
> >> - * allocated if this CPU was previously online and then taken offline
> >> + * The per-cpu memory is already allocated if this CPU was previously
> >> + * online and then taken offline
> >> */
> >> if (!*inputarg) {
> >> mem = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
> >> @@ -485,6 +504,17 @@ int hv_common_cpu_init(unsigned int cpu)
> >> *outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
> >> }
> >>
> >> + if (hv_root_partition()) {
> >> + synic_eventring_tail = (u8 **)this_cpu_ptr(hv_synic_eventring_tail);
> >> + *synic_eventring_tail = kcalloc(HV_SYNIC_SINT_COUNT,
> >> + sizeof(u8), flags);
> >> +
> >> + if (unlikely(!*synic_eventring_tail)) {
> >> + kfree(mem);
> >> + return -ENOMEM;
> >> + }
> >> + }
> >> +
> >
> > Adding this code under the "if(!*inputarg)" implicitly ties the lifecycle of
> > synic_eventring_tail to the lifecycle of hyperv_pcpu_input_arg and
> > hyperv_pcpu_output_arg. Is there some logical relationship between the
> > two that warrants tying the lifecycles together (other than just both being
> > per-cpu)? hyperv_pcpu_input_arg and hyperv_pcpu_output_arg have an
> > unusual lifecycle management in that they aren't freed when a CPU goes
> > offline, as described in the comment in hv_common_cpu_die(). Does
> > synic_eventring_tail also need that same unusual lifecycle?
> >
> I thought about it, and no I don't think it shares the same exact lifecycle.
> It's only used by the mshv_root driver - it just needs to remain present
> whenever there's a chance the module could be re-inserted and expect it to
> be there.
>
> > Assuming there's no logical relationship, I'm thinking synic_eventring_tail
> > should be managed independent of the other two. If it does need the
> > unusual lifecycle, make sure to add a comment in hv_common_cpu_die()
> > explaining why. If it doesn't need the unusual lifecycle, maybe just do
> > the normal thing of allocating it in hv_common_cpu_init() and freeing
> > it in hv_common_cpu_die().
> >
> Yep, I suppose it should just be freed normally then, assuming
> hv_common_cpu_die() is only called when the hypervisor is going to reset
> (or remove) the synic pages for this partition. Is that the case here?
>
Yes, it is the case here. A particular vCPU can be taken offline
independent of other vCPUs in the VM (such as by writing "0"
to /sys/devices/system/cpu/cpu<nn>/online). When that happens
the vCPU going offline runs hv_synic_cleanup() first, and then it
runs hv_cpu_die(), which calls hv_common_cpu_die(). So by the
time hv_common_cpu_die() runs, the synic_message_page and
synic_event_page will have been unmapped and the pointers set
to NULL.

On arm64, there is no hv_cpu_init()/die(), and the "common"
versions are called directly. Perhaps at some point in the future there
will be arm64 specific things to be done, and hv_cpu_init()/die()
will need to be added. But the ordering is the same and
hv_synic_cleanup() runs first.

So, yes, since synic_eventring_tail is tied to the synic, it sounds like
the normal lifecycle could be used, like with the VP assist page that
is handled in hv_cpu_init()/die() on x86.
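
To illustrate the "normal lifecycle" shape, here is a rough userspace
model, not kernel code and not the patch itself. It only shows the
allocate-on-online, free-on-offline pairing. All names (model_cpu_init,
model_cpu_die, MODEL_NR_CPUS, MODEL_SINT_COUNT) are hypothetical, the
plain array stands in for the per-cpu pointer, and calloc()/free() stand
in for kcalloc()/kfree(). The value 16 is assumed to mirror
HV_SYNIC_SINT_COUNT.

```c
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_CPUS    4   /* hypothetical CPU count for this model */
#define MODEL_SINT_COUNT 16  /* assumed to mirror HV_SYNIC_SINT_COUNT */

/* Stand-in for the per-cpu pointer: one tail array per CPU. */
static uint8_t *model_eventring_tail[MODEL_NR_CPUS];

/* Models the CPU online path: allocate a zeroed tail array. */
int model_cpu_init(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -1;

	/* With the normal lifecycle, the pointer is NULL here. */
	if (model_eventring_tail[cpu])
		return -1;

	model_eventring_tail[cpu] = calloc(MODEL_SINT_COUNT, sizeof(uint8_t));
	return model_eventring_tail[cpu] ? 0 : -1;
}

/* Models the CPU offline path: free the array and clear the pointer. */
int model_cpu_die(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return -1;

	free(model_eventring_tail[cpu]);
	model_eventring_tail[cpu] = NULL;
	return 0;
}
```

In this model a CPU that goes offline and comes back online gets a
freshly zeroed tail array, which matches the synic pages having been
unmapped in between.
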
> Otherwise we'd want to retain it, in case mshv_root ever needs it again for
> that CPU in the lifetime of this partition.
>
> Nuno
>
> > The code as written in your patch isn't wrong and would work OK. But
> > the structure implies a relationship with hyperv_pcpu_*_arg that I
> > suspect doesn't exist.
> >
> > Michael
> >
> >> if (!ms_hyperv.paravisor_present &&
> >> (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
> >> ret = set_memory_decrypted((unsigned long)mem, pgcount);
> >> --
> >> 2.34.1