Message-ID: <aXz8ldAeoWwGIxdu@skinsburskii.localdomain>
Date: Fri, 30 Jan 2026 10:46:45 -0800
From: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
To: Anirudh Rayabharam <anirudh@...rudhrb.com>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
	decui@...rosoft.com, longli@...rosoft.com,
	linux-hyperv@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mshv: Make MSHV mutually exclusive with KEXEC

On Fri, Jan 30, 2026 at 05:11:12PM +0000, Anirudh Rayabharam wrote:
> On Wed, Jan 28, 2026 at 03:11:14PM -0800, Stanislav Kinsburskii wrote:
> > On Wed, Jan 28, 2026 at 04:16:31PM +0000, Anirudh Rayabharam wrote:
> > > On Mon, Jan 26, 2026 at 12:46:44PM -0800, Stanislav Kinsburskii wrote:
> > > > On Tue, Jan 27, 2026 at 12:19:24AM +0530, Anirudh Rayabharam wrote:
> > > > > On Fri, Jan 23, 2026 at 10:20:53PM +0000, Stanislav Kinsburskii wrote:
> > > > > > The MSHV driver deposits kernel-allocated pages to the hypervisor during
> > > > > > runtime and never withdraws them. This creates a fundamental incompatibility
> > > > > > with KEXEC, as these deposited pages remain unavailable to the new kernel
> > > > > > loaded via KEXEC, potentially crashing the system when the new kernel
> > > > > > accesses hypervisor-deposited pages.
> > > > > > 
> > > > > > Make MSHV mutually exclusive with KEXEC until proper page lifecycle
> > > > > > management is implemented.
> > > > > 
> > > > > Someone might want to stop all guest VMs and do a kexec. Which is valid
> > > > > and would work without any issue for L1VH.
> > > > > 
> > > > 
> > > > No, it won't work: hypervisor-deposited pages won't be withdrawn.
> > > 
> > > All pages that were deposited in the context of a guest partition (i.e.
> > > with the guest partition ID), would be withdrawn when you kill the VMs,
> > > right? What other deposited pages would be left?
> > > 
> > 
> > The driver deposits two types of pages: one for the guests (withdrawn
> > upon guest shutdown) and the other for the host itself (never
> > withdrawn).
> > See hv_call_create_partition, for example: it deposits pages for the
> > host partition.
> 
> Hmm.. I see. Is it not possible to reclaim this memory in module_exit?
> Also, can't we forcefully kill all running partitions in module_exit and
> then reclaim memory? Would this help with kernel consistency
> irrespective of userspace behavior?
> 

It would, but this is sloppy and cannot be a long-term solution.

It is also not reliable. We have no hook to prevent kexec. So if we fail
to kill the guest or reclaim the memory for any reason, the new kernel
may still crash.

There are two long-term solutions:
 1. Add a way to prevent kexec when there is shared state between the hypervisor and the kernel.
 2. Hand the shared kernel state over to the new kernel.

I sent a series for the first one. The second one is not ready yet.
Anything else is neither robust nor reliable, so I don’t think it makes
sense to pursue it.
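
To illustrate the idea behind option 1, here is a minimal userspace sketch in
C. It is not the code from the series and not a kernel API: every name in it
(host_deposited_pages, deposit_host_pages, withdraw_host_pages,
kexec_allowed) is hypothetical. It only shows the shape of the check: kexec is
refused for as long as any host-partition pages remain deposited.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical counter of pages deposited to the hypervisor on behalf
 * of the host partition. A real driver would need locking; this sketch
 * is single-threaded.
 */
static unsigned long host_deposited_pages;

/* Record that n pages were handed to the hypervisor. */
void deposit_host_pages(unsigned long n)
{
	host_deposited_pages += n;
}

/* Record that n pages were given back by the hypervisor. */
void withdraw_host_pages(unsigned long n)
{
	assert(n <= host_deposited_pages);
	host_deposited_pages -= n;
}

/*
 * Hypothetical pre-kexec check: returns 0 when no state is shared with
 * the hypervisor, -EBUSY while deposited pages are still outstanding.
 */
int kexec_allowed(void)
{
	return host_deposited_pages ? -EBUSY : 0;
}
```

With this shape, a failed withdrawal leaves the counter non-zero and kexec
stays blocked, instead of the new kernel crashing later.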

Thanks,
Stanislav


> Thanks,
> Anirudh.
> 
> > 
> > Thanks,
> > Stanislav
> > 
> > > Thanks,
> > > Anirudh.
> > > 
> > > > Also, kernel consistency must not depend on userspace behavior.
> > > > 
> > > > > Also, I don't think it is reasonable at all that someone needs to
> > > > > disable basic kernel functionality such as kexec in order to use our
> > > > > driver.
> > > > > 
> > > > 
> > > > It's a temporary measure until proper page lifecycle management is
> > > > supported in the driver.
> > > > Mutual exclusion of the driver and kexec is a given and thus should be
> > > > explicitly stated in the Kconfig.
> > > > 
> > > > Thanks,
> > > > Stanislav
> > > > 
> > > > > Thanks,
> > > > > Anirudh.
> > > > > 
> > > > > > 
> > > > > > Signed-off-by: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
> > > > > > ---
> > > > > >  drivers/hv/Kconfig |    1 +
> > > > > >  1 file changed, 1 insertion(+)
> > > > > > 
> > > > > > diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
> > > > > > index 7937ac0cbd0f..cfd4501db0fa 100644
> > > > > > --- a/drivers/hv/Kconfig
> > > > > > +++ b/drivers/hv/Kconfig
> > > > > > @@ -74,6 +74,7 @@ config MSHV_ROOT
> > > > > >  	# e.g. When withdrawing memory, the hypervisor gives back 4k pages in
> > > > > >  	# no particular order, making it impossible to reassemble larger pages
> > > > > >  	depends on PAGE_SIZE_4KB
> > > > > > +	depends on !KEXEC
> > > > > >  	select EVENTFD
> > > > > >  	select VIRT_XFER_TO_GUEST_WORK
> > > > > >  	select HMM_MIRROR
> > > > > > 
> > > > > > 
