Message-ID: <2efb7fc8-994f-7cbf-6b7c-a1e645bdf638@linux.microsoft.com>
Date: Fri, 30 Jan 2026 11:47:48 -0800
From: Mukesh R <mrathor@...ux.microsoft.com>
To: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>,
 Anirudh Rayabharam <anirudh@...rudhrb.com>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
 decui@...rosoft.com, longli@...rosoft.com, linux-hyperv@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mshv: Make MSHV mutually exclusive with KEXEC

On 1/30/26 10:41, Stanislav Kinsburskii wrote:
> On Fri, Jan 30, 2026 at 05:17:52PM +0000, Anirudh Rayabharam wrote:
>> On Thu, Jan 29, 2026 at 06:59:31PM -0800, Mukesh R wrote:
>>> On 1/28/26 15:08, Stanislav Kinsburskii wrote:
>>>> On Tue, Jan 27, 2026 at 11:56:02AM -0800, Mukesh R wrote:
>>>>> On 1/27/26 09:47, Stanislav Kinsburskii wrote:
>>>>>> On Mon, Jan 26, 2026 at 05:39:49PM -0800, Mukesh R wrote:
>>>>>>> On 1/26/26 16:21, Stanislav Kinsburskii wrote:
>>>>>>>> On Mon, Jan 26, 2026 at 03:07:18PM -0800, Mukesh R wrote:
>>>>>>>>> On 1/26/26 12:43, Stanislav Kinsburskii wrote:
>>>>>>>>>> On Mon, Jan 26, 2026 at 12:20:09PM -0800, Mukesh R wrote:
>>>>>>>>>>> On 1/25/26 14:39, Stanislav Kinsburskii wrote:
>>>>>>>>>>>> On Fri, Jan 23, 2026 at 04:16:33PM -0800, Mukesh R wrote:
>>>>>>>>>>>>> On 1/23/26 14:20, Stanislav Kinsburskii wrote:
>>>>>>>>>>>>>> The MSHV driver deposits kernel-allocated pages to the hypervisor during
>>>>>>>>>>>>>> runtime and never withdraws them. This creates a fundamental incompatibility
>>>>>>>>>>>>>> with KEXEC, as these deposited pages remain unavailable to the new kernel
>>>>>>>>>>>>>> loaded via KEXEC, leading to potential system crashes upon kernel accessing
>>>>>>>>>>>>>> hypervisor deposited pages.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Make MSHV mutually exclusive with KEXEC until proper page lifecycle
>>>>>>>>>>>>>> management is implemented.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>         drivers/hv/Kconfig |    1 +
>>>>>>>>>>>>>>         1 file changed, 1 insertion(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
>>>>>>>>>>>>>> index 7937ac0cbd0f..cfd4501db0fa 100644
>>>>>>>>>>>>>> --- a/drivers/hv/Kconfig
>>>>>>>>>>>>>> +++ b/drivers/hv/Kconfig
>>>>>>>>>>>>>> @@ -74,6 +74,7 @@ config MSHV_ROOT
>>>>>>>>>>>>>>         	# e.g. When withdrawing memory, the hypervisor gives back 4k pages in
>>>>>>>>>>>>>>         	# no particular order, making it impossible to reassemble larger pages
>>>>>>>>>>>>>>         	depends on PAGE_SIZE_4KB
>>>>>>>>>>>>>> +	depends on !KEXEC
>>>>>>>>>>>>>>         	select EVENTFD
>>>>>>>>>>>>>>         	select VIRT_XFER_TO_GUEST_WORK
>>>>>>>>>>>>>>         	select HMM_MIRROR
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Will this affect CRASH kexec? I see a few CONFIG_CRASH_DUMP references in kexec.c
>>>>>>>>>>>>> implying that crash dump might be involved. Or did you test kdump
>>>>>>>>>>>>> and it was fine?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, it will. Crash kexec depends on normal kexec functionality, so it
>>>>>>>>>>>> will be affected as well.
>>>>>>>>>>>
>>>>>>>>>>> So not sure I understand the reason for this patch. We can just block
>>>>>>>>>>> kexec if there are any VMs running, right? Doing this would mean any
>>>>>>>>>>> further development would be without a very important and major feature,
>>>>>>>>>>> right?
>>>>>>>>>>
>>>>>>>>>> This is an option. But until it's implemented and merged, a user of the
>>>>>>>>>> mshv driver gets into a situation where kexec is broken in a non-obvious way.
>>>>>>>>>> The system may crash at any time after kexec, depending on whether the
>>>>>>>>>> new kernel touches the pages deposited to hypervisor or not. This is a
>>>>>>>>>> bad user experience.
>>>>>>>>>
>>>>>>>>> I understand that. But with this we cannot collect core and debug any
>>>>>>>>> crashes. I was thinking there would be a quick way to prohibit kexec
>>>>>>>>> for update via notifier or some other quick hack. Did you already
>>>>>>>>> explore that and didn't find anything, hence this?
>>>>>>>>>
>>>>>>>>
>>>>>>>> This quick hack you mention isn't quick in the upstream kernel as there
>>>>>>>> is no hook to interrupt the kexec process except the live update one.
>>>>>>>
>>>>>>> That's the one we want to interrupt and block right? crash kexec
>>>>>>> is ok and should be allowed. We can document we don't support kexec
>>>>>>> for update for now.
>>>>>>>
>>>>>>>> I sent an RFC for that one but given today's conversation details it
>>>>>>>> won't be accepted as is.
>>>>>>>
>>>>>>> Are you talking about this?
>>>>>>>
>>>>>>>            "mshv: Add kexec safety for deposited pages"
>>>>>>>
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>>> Making mshv mutually exclusive with kexec is the only viable option for
>>>>>>>> now given time constraints.
>>>>>>>> It is intended to be replaced with proper page lifecycle management in
>>>>>>>> the future.
>>>>>>>
>>>>>>> Yeah, that could take a long time and imo we cannot just disable KEXEC
>>>>>>> completely. What we want is to just block kexec for updates from some
>>>>>>> mshv file for now, we can print during boot that kexec for updates is
>>>>>>> not supported on mshv. Hope that makes sense.
>>>>>>>
>>>>>>
>>>>>> The trade-off here is between disabling kexec support and having the
>>>>>> kernel crash after kexec in a non-obvious way. This affects both regular
>>>>>> kexec and crash kexec.
>>>>>
>>>>> crash kexec on baremetal is not affected, hence disabling that
>>>>> doesn't make sense as we can't debug crashes then on bm.
>>>>>
>>>>
>>>> Bare metal support is not currently relevant, as it is not available.
>>>> This is the upstream kernel, and this driver will be accessible to
>>>> third-party customers beginning with kernel 6.19 for running their
>>>> kernels in Azure L1VH, so consistency is required.
>>>
>>> Well, without crashdump support, customers will not be running anything
>>> anywhere.
>>
>> This is my concern too. I don't think customers will be particularly
>> happy that kexec doesn't work with our driver.
>>
> 
> I wasn't clear earlier, so let me restate it. Today, kexec is not
> supported in L1VH. This is a bug we have not fixed yet. Disabling kexec
> is not a long-term solution. But it is better to disable it explicitly
> than to have kernel crashes after kexec.

I don't think there is disagreement on this. The undesired part is turning
off KEXEC config completely.

Thanks,
-Mukesh


> This does not mean the bug should not be fixed. But the upstream kernel
> has its own policies and merge windows. For kernel 6.19, it is better to
> have a clear kexec error than random crashes after kexec.
> 
> Thanks,
> Stanislav
> 
>> Thanks,
>> Anirudh
>>
>>>
>>> Thanks,
>>> -Mukesh
>>>
>>>> Thanks,
>>>> Stanislav
>>>>
>>>>> Let me think and explore a bit, and if I come up with something, I'll
>>>>> send a patch here. If nothing, then we can do this as last resort.
>>>>>
>>>>> Thanks,
>>>>> -Mukesh
>>>>>
>>>>>
>>>>>> It's a pity we can't apply a quick hack to disable only regular kexec.
>>>>>> However, since crash kexec would hit the same issues, until we have a
>>>>>> proper state transition for deposited pages, the best workaround for now
>>>>>> is to reset the hypervisor state on every kexec, which needs design,
>>>>>> work, and testing.
>>>>>>
>>>>>> Disabling kexec is the only consistent way to handle this in the
>>>>>> upstream kernel at the moment.
>>>>>>
>>>>>> Thanks, Stanislav
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> -Mukesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Stanislav
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -Mukesh
>>>>>>>>>
>>>>>>>>>> Therefore it should be explicitly forbidden as it's essentially not
>>>>>>>>>> supported yet.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Stanislav
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Stanislav
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -Mukesh
>>>

