lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB4157EE41697ABC1002750297D49FA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Fri, 30 Jan 2026 01:24:34 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
CC: "kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
	<haiyangz@...rosoft.com>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
	"decui@...rosoft.com" <decui@...rosoft.com>, "longli@...rosoft.com"
	<longli@...rosoft.com>, "linux-hyperv@...r.kernel.org"
	<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 2/2] mshv: Add support for integrated scheduler

From: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com> Sent: Thursday, January 29, 2026 11:10 AM
> 
> On Thu, Jan 29, 2026 at 05:47:02PM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com> Sent: Wednesday, January 21, 2026 2:36 PM
> 
> <snip>
> 
> > >  static int __init mshv_root_partition_init(struct device *dev)
> > >  {
> > >  	int err;
> > >
> > > -	err = root_scheduler_init(dev);
> > > -	if (err)
> > > -		return err;
> > > -
> > >  	err = register_reboot_notifier(&mshv_reboot_nb);
> > >  	if (err)
> > > -		goto root_sched_deinit;
> > > +		return err;
> > >
> > >  	return 0;
> >
> > This code is now:
> >
> > 	if (err)
> > 		return err;
> > 	return 0;
> >
> > which can be simplified to just:
> >
> > 	return err;
> >
> > Or drop the local variable 'err' and simplify the entire function to:
> >
> > 	return register_reboot_notifier(&mshv_reboot_nb);
> >
> > There's a tangential question here: Why is this reboot notifier
> > needed in the first place? All it does is remove the cpuhp state
> > that allocates/frees the per-cpu root_scheduler_input and
> > root_scheduler_output pages. Removing the state will free
> > the pages, but if Linux is rebooting, why bother?
> >
> 
> This was originally done to support kexec.
> Here is the original commit message:
> 
>     mshv: perform synic cleanup during kexec
> 
>     Register a reboot notifier that performs synic cleanup when a kexec
>     is in progress.
> 
>     One notable issue this commit fixes is one where after a kexec, virtio
>     devices are not functional. Linux root partition receives MMIO doorbell
>     events in the ring buffer in the SIRB synic page. The hypervisor maintains
>     a head pointer where it writes new events into the ring buffer. The root
>     partition maintains a tail pointer to read events from the buffer.
> 
>     Upon kexec reboot, all root data structures are re-initialized and thus the
>     tail pointer gets reset to zero. The hypervisor on the other hand still
>     retains the pre-kexec head pointer which could be non-zero. This means that
>     when the hypervisor writes new events to the ring buffer, the root
>     partition looks at the wrong place and doesn't find any events. So, future
>     doorbell events never get delivered. As a result, virtqueue kicks never get
>     delivered to the host.
> 
>     When the SIRB page is disabled the hypervisor resets the head pointer.

FWIW, I don't see that commit message anywhere in a public source code
tree. The calls to register/unregister_reboot_notifier() were in the original
introduction of mshv_root_main.c in upstream commit 621191d709b14.
Evidently the code described by that commit message was not submitted
upstream. And of course, the kexec() topic is now being revisited ....

So to clarify: Do you expect that in the future the reboot notifier will be
used for something that really is required for resetting hypervisor state
in the case of a kexec reboot?

> 
> > > -root_sched_deinit:
> > > -	root_scheduler_deinit();
> > > -	return err;
> > >  }
> > >
> > > -static void mshv_init_vmm_caps(struct device *dev)
> > > +static int mshv_init_vmm_caps(struct device *dev)
> > >  {
> > > -	/*
> > > -	 * This can only fail here if HVCALL_GET_PARTITION_PROPERTY_EX or
> > > -	 * HV_PARTITION_PROPERTY_VMM_CAPABILITIES are not supported. In that
> > > -	 * case it's valid to proceed as if all vmm_caps are disabled (zero).
> > > -	 */
> > > -	if (hv_call_get_partition_property_ex(HV_PARTITION_ID_SELF,
> > > -					      HV_PARTITION_PROPERTY_VMM_CAPABILITIES,
> > > -					      0, &mshv_root.vmm_caps,
> > > -					      sizeof(mshv_root.vmm_caps)))
> > > -		dev_warn(dev, "Unable to get VMM capabilities\n");
> > > +	int ret;
> > > +
> > > +	ret = hv_call_get_partition_property_ex(HV_PARTITION_ID_SELF,
> > > +					 	HV_PARTITION_PROPERTY_VMM_CAPABILITIES,
> > > +						0, &mshv_root.vmm_caps,
> > > +						sizeof(mshv_root.vmm_caps));
> > > +	if (ret) {
> > > +		dev_err(dev, "Failed to get VMM capabilities: %d\n", ret);
> > > +		return ret;
> > > +	}
> >
> > This is a functional change that isn't mentioned in the commit message.
> > Why is it now appropriate to fail instead of treating the VMM capabilities
> > as all disabled? Presumably there are older versions of the hypervisor that
> > don't support the requirements described in the original comment, but
> > perhaps they are no longer relevant?
> >
> 
> To fail is now the only option for the L1VH partition. It must discover
> the scheduler type. Without this information, the partition cannot
> operate. The core scheduler logic will not work with an integrated
> scheduler, and vice versa.
> 
> And yes, older hypervisor versions do not support L1VH.

That makes sense. Your change in v2 of the patch handles this
nicely. For the non-L1VH case, the v2 behavior is the same as before in
that the init path won't error out on older hypervisors that don't
support the requirements described in the original comment. That's
the case I am concerned about.

Michael

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ