lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7de75d5a-c284-4ab8-b275-ec245608ed5c@linux.microsoft.com>
Date: Fri, 7 Mar 2025 15:31:42 -0800
From: Nuno Das Neves <nunodasneves@...ux.microsoft.com>
To: Michael Kelley <mhklinux@...look.com>,
 "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
 "x86@...nel.org" <x86@...nel.org>,
 "linux-arm-kernel@...ts.infradead.org"
 <linux-arm-kernel@...ts.infradead.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
 "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>
Cc: "kys@...rosoft.com" <kys@...rosoft.com>,
 "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
 "wei.liu@...nel.org" <wei.liu@...nel.org>,
 "decui@...rosoft.com" <decui@...rosoft.com>,
 "catalin.marinas@....com" <catalin.marinas@....com>,
 "will@...nel.org" <will@...nel.org>, "tglx@...utronix.de"
 <tglx@...utronix.de>, "mingo@...hat.com" <mingo@...hat.com>,
 "bp@...en8.de" <bp@...en8.de>,
 "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
 "hpa@...or.com" <hpa@...or.com>,
 "daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
 "joro@...tes.org" <joro@...tes.org>,
 "robin.murphy@....com" <robin.murphy@....com>, "arnd@...db.de"
 <arnd@...db.de>,
 "jinankjain@...ux.microsoft.com" <jinankjain@...ux.microsoft.com>,
 "muminulrussell@...il.com" <muminulrussell@...il.com>,
 "skinsburskii@...ux.microsoft.com" <skinsburskii@...ux.microsoft.com>,
 "mrathor@...ux.microsoft.com" <mrathor@...ux.microsoft.com>,
 "ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
 "apais@...ux.microsoft.com" <apais@...ux.microsoft.com>,
 "Tianyu.Lan@...rosoft.com" <Tianyu.Lan@...rosoft.com>,
 "stanislav.kinsburskiy@...il.com" <stanislav.kinsburskiy@...il.com>,
 "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
 "vkuznets@...hat.com" <vkuznets@...hat.com>,
 "prapal@...ux.microsoft.com" <prapal@...ux.microsoft.com>,
 "muislam@...rosoft.com" <muislam@...rosoft.com>,
 "anrayabh@...ux.microsoft.com" <anrayabh@...ux.microsoft.com>,
 "rafael@...nel.org" <rafael@...nel.org>, "lenb@...nel.org"
 <lenb@...nel.org>, "corbet@....net" <corbet@....net>
Subject: Re: [PATCH v5 07/10] Drivers: hv: Introduce per-cpu event ring tail

On 3/7/2025 3:21 PM, Michael Kelley wrote:
> From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Friday, March 7, 2025 2:07 PM
>>
>> On 3/7/2025 9:02 AM, Michael Kelley wrote:
>>> From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Wednesday, February 26, 2025 3:08 PM
>>>>
>>>> Add a pointer hv_synic_eventring_tail to track the tail pointer for the
>>>> SynIC event ring buffer for each SINT.
>>>>
>>>> This will be used by the mshv driver, but must be tracked independently
>>>> since the driver module could be removed and re-inserted.
>>>>
>>>> Signed-off-by: Nuno Das Neves <nunodasneves@...ux.microsoft.com>
>>>> Reviewed-by: Wei Liu <wei.liu@...nel.org>
>>>> ---
>>>>  drivers/hv/hv_common.c | 34 ++++++++++++++++++++++++++++++++--
>>>>  1 file changed, 32 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
>>>> index 252fd66ad4db..2763cb6d3678 100644
>>>> --- a/drivers/hv/hv_common.c
>>>> +++ b/drivers/hv/hv_common.c
>>>> @@ -68,6 +68,16 @@ static void hv_kmsg_dump_unregister(void);
>>>>
>>>>  static struct ctl_table_header *hv_ctl_table_hdr;
>>>>
>>>> +/*
>>>> + * Per-cpu array holding the tail pointer for the SynIC event ring buffer
>>>> + * for each SINT.
>>>> + *
>>>> + * We cannot maintain this in mshv driver because the tail pointer should
>>>> + * persist even if the mshv driver is unloaded.
>>>> + */
>>>> +u8 __percpu **hv_synic_eventring_tail;
>>>
>>> I think the "__percpu" is in the wrong place here. This placement
>>> is likely to cause errors from the "sparse" tool.  It should be
>>>
>>> u8 * __percpu *hv_synic_eventring_tail;
>>>
>>> See the way hyperv_pcpu_input_arg, for example, is defined.  And
>>> see commit db3c65bc3a13 where I fixed hyperv_pcpu_input_arg.
>>>
>> Thanks. I'll fix it.
>>
>>>> +EXPORT_SYMBOL_GPL(hv_synic_eventring_tail);
>>>
>>> The "extern" declaration for this variable is in Patch 10 of the series
>>> in drivers/hv/mshv_root.h. I guess that's OK, but I would normally
>>> expect to find such a declaration in the header file associated with
>>> where the variable is defined, which in this case is mshyperv.h.
>>> Perhaps you are trying to restrict its usage to just mshv?
>>>
>> Yes, that's the idea - it should only be used by the driver.
>>
>>>> +
>>>>  /*
>>>>   * Hyper-V specific initialization and shutdown code that is
>>>>   * common across all architectures.  Called from architecture
>>>> @@ -90,6 +100,9 @@ void __init hv_common_free(void)
>>>>
>>>>  	free_percpu(hyperv_pcpu_input_arg);
>>>>  	hyperv_pcpu_input_arg = NULL;
>>>> +
>>>> +	free_percpu(hv_synic_eventring_tail);
>>>> +	hv_synic_eventring_tail = NULL;
>>>>  }
>>>>
>>>>  /*
>>>> @@ -372,6 +385,11 @@ int __init hv_common_init(void)
>>>>  		BUG_ON(!hyperv_pcpu_output_arg);
>>>>  	}
>>>>
>>>> +	if (hv_root_partition()) {
>>>> +		hv_synic_eventring_tail = alloc_percpu(u8 *);
>>>> +		BUG_ON(hv_synic_eventring_tail == NULL);
>>>> +	}
>>>> +
>>>>  	hv_vp_index = kmalloc_array(nr_cpu_ids, sizeof(*hv_vp_index),
>>>>  				    GFP_KERNEL);
>>>>  	if (!hv_vp_index) {
>>>> @@ -460,6 +478,7 @@ void __init ms_hyperv_late_init(void)
>>>>  int hv_common_cpu_init(unsigned int cpu)
>>>>  {
>>>>  	void **inputarg, **outputarg;
>>>> +	u8 **synic_eventring_tail;
>>>>  	u64 msr_vp_index;
>>>>  	gfp_t flags;
>>>>  	const int pgcount = hv_output_page_exists() ? 2 : 1;
>>>> @@ -472,8 +491,8 @@ int hv_common_cpu_init(unsigned int cpu)
>>>>  	inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
>>>>
>>>>  	/*
>>>> -	 * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
>>>> -	 * allocated if this CPU was previously online and then taken offline
>>>> +	 * The per-cpu memory is already allocated if this CPU was previously
>>>> +	 * online and then taken offline
>>>>  	 */
>>>>  	if (!*inputarg) {
>>>>  		mem = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
>>>> @@ -485,6 +504,17 @@ int hv_common_cpu_init(unsigned int cpu)
>>>>  			*outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
>>>>  		}
>>>>
>>>> +		if (hv_root_partition()) {
>>>> +			synic_eventring_tail = (u8 **)this_cpu_ptr(hv_synic_eventring_tail);
>>>> +			*synic_eventring_tail = kcalloc(HV_SYNIC_SINT_COUNT,
>>>> +							sizeof(u8), flags);
>>>> +
>>>> +			if (unlikely(!*synic_eventring_tail)) {
>>>> +				kfree(mem);
>>>> +				return -ENOMEM;
>>>> +			}
>>>> +		}
>>>> +
>>>
>>> Adding this code under the "if(!*inputarg)" implicitly ties the lifecycle of
>>> synic_eventring_tail to the lifecycle of hyperv_pcpu_input_arg and
>>> hyperv_pcpu_output_arg. Is there some logical relationship between the
>>> two that warrants tying the lifecycles together (other than just both being
>>> per-cpu)?  hyperv_pcpu_input_arg and hyperv_pcpu_output_arg have an
>>> unusual lifecycle management in that they aren't freed when a CPU goes
>>> offline, as described in the comment in hv_common_cpu_die(). Does
>>> synic_eventring_tail also need that same unusual lifecycle?
>>>
>> I thought about it, and no I don't think it shares the same exact lifecycle.
>> It's only used by the mshv_root driver - it just needs to remain present
>> whenever there's a chance the module could be re-inserted and expect it to
>> be there.
>>
>>> Assuming there's no logical relationship, I'm thinking synic_eventring_tail
>>> should be managed independent of the other two. If it does need the
>>> unusual lifecycle, make sure to add a comment in hv_common_cpu_die()
>>> explaining why. If it doesn't need the unusual lifecycle, maybe just do
>>> the normal thing of allocating it in hv_common_cpu_init() and freeing
>>> it in hv_common_cpu_die().
>>>
>> Yep, I suppose it should just be freed normally then, assuming
>> hv_common_cpu_die() is only called when the hypervisor is going to reset
>> (or remove) the synic pages for this partition. Is that the case here?
>>
> 
> Yes, it is the case here. A particular vCPU can be taken offline
> independent of other vCPUs in the VM (such as by writing "0"
> to /sys/devices/system/cpu/cpu<nn>/online). When that happens
> the vCPU going offline runs hv_synic_cleanup() first, and then it
> runs hv_cpu_die(), which calls hv_common_cpu_die(). So by the
> time hv_common_cpu_die() runs, the synic_message_page and
> synic_event_page will have been unmapped and the pointers set
> to NULL.
> 
> On arm64, there is no hv_cpu_init()/die(), and the "common"
> versions are called directly. Perhaps at some point in the future there
> will be arm64 specific things to be done, and hv_cpu_init()/die()
> will need to be added. But the ordering is the same and
> hv_synic_cleanup() runs first.
> 
> So, yes, since synic_eventring_tail is tied to the synic, it sounds like
> the normal lifecycle could be used, like with the VP assist page that
> is handled in hv_cpu_init()/die() on x86.
> 
Great, thanks for the clarification! I'll fix it for v6.

Nuno

>> Otherwise we'd want to retain it, in case mshv_root ever needs it again for
>> that CPU in the lifetime of this partition.
>>
>> Nuno
>>
>>> The code as written in your patch isn't wrong and would work OK. But
>>> the structure implies a relationship with hyperv_pcpu_*_arg that I
>>> suspect doesn't exist.
>>>
>>> Michael
>>>
>>>>  		if (!ms_hyperv.paravisor_present &&
>>>>  		    (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
>>>>  			ret = set_memory_decrypted((unsigned long)mem, pgcount);
>>>> --
>>>> 2.34.1
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ