Message-ID: <89cef849-4309-478c-8250-3e668943fa15@amd.com>
Date: Wed, 4 Sep 2024 14:44:33 -0500
From: "Kalra, Ashish" <ashish.kalra@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: pbonzini@...hat.com, dave.hansen@...ux.intel.com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, x86@...nel.org, hpa@...or.com,
peterz@...radead.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
thomas.lendacky@....com, michael.roth@....com, kexec@...ts.infradead.org,
linux-coco@...ts.linux.dev
Subject: Re: [PATCH v2] x86/sev: Fix host kdump support for SNP
Hello Sean,
>>> e_free_context:
>>> @@ -2884,9 +2890,126 @@ static int snp_decommission_context(struct kvm *kvm)
>>> snp_free_firmware_page(sev->snp_context);
>>> sev->snp_context = NULL;
>>>
>>> + if (snp_asid_to_gctx_pages_map)
>>> + snp_asid_to_gctx_pages_map[sev_get_asid(kvm)] = NULL;
>>> +
>>> return 0;
>>> }
>>>
>>> +static void __snp_decommission_all(void)
>>> +{
>>> + struct sev_data_snp_addr data = {};
>>> + int ret, asid;
>>> +
>>> + if (!snp_asid_to_gctx_pages_map)
>>> + return;
>>> +
>>> + for (asid = 1; asid < min_sev_asid; asid++) {
>>> + if (snp_asid_to_gctx_pages_map[asid]) {
>>> + data.address = __sme_pa(snp_asid_to_gctx_pages_map[asid]);
>> NULL pointer deref if this races with snp_decommission_context() from task
>> context.
Looking at this again, this is exactly why we need all CPUs to synchronize in NMI context before one CPU takes control and issues SNP_DECOMMISSION for all SNP VMs.

If there are sev_vm_destroy() -> snp_decommission_context() calls executing in task context, then by the time those CPUs start handling the NMI they will either have already issued SNP_DECOMMISSION for the VM and/or reclaimed the SNP guest context page (which transitions to FW state after SNP_DECOMMISSION). In both cases, when we then issue SNP_DECOMMISSION for that VM in __snp_decommission_all(), the command will fail with an INVALID_GUEST/INVALID_ADDRESS error. We can simply ignore that error, treat the VM as already decommissioned, and continue decommissioning the remaining VMs.
I have actually tested some of these scenarios and they behave as described above.
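To make the intended error handling concrete, here is a minimal, self-contained sketch of the __snp_decommission_all() loop described above. It is not the kernel code: the status-code values, the MIN_SEV_ASID bound, and fake_snp_decommission() are all stand-ins for the real firmware interface, chosen only to illustrate "issue DECOMMISSION for every mapped ASID and ignore failures from VMs that task context already decommissioned".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical firmware status codes standing in for the SEV-SNP
 * spec values; the names and numbers here are illustrative only. */
#define SEV_RET_SUCCESS          0
#define SEV_RET_INVALID_GUEST    0x0D

#define MIN_SEV_ASID 4	/* illustrative bound on SNP ASIDs */

/* Simulated per-ASID guest context map: NULL means no SNP VM is
 * bound to that ASID. */
static void *gctx_pages_map[MIN_SEV_ASID];

/* Tracks which VMs the (simulated) firmware has already
 * decommissioned, e.g. via sev_vm_destroy() in task context. */
static int already_decommissioned[MIN_SEV_ASID];

/* Stand-in for sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, ...): fails
 * with INVALID_GUEST if the VM was already decommissioned. */
static int fake_snp_decommission(int asid)
{
	if (already_decommissioned[asid])
		return SEV_RET_INVALID_GUEST;
	already_decommissioned[asid] = 1;
	return SEV_RET_SUCCESS;
}

/* Mirrors the __snp_decommission_all() loop: issue DECOMMISSION for
 * every mapped ASID, ignoring errors on the assumption that a
 * failure means task context already decommissioned that VM.
 * Returns the number of commands issued, for illustration. */
static int decommission_all(void)
{
	int asid, issued = 0;

	for (asid = 1; asid < MIN_SEV_ASID; asid++) {
		if (!gctx_pages_map[asid])
			continue;
		(void)fake_snp_decommission(asid); /* error ignored */
		issued++;
	}
	return issued;
}
```

The point of the sketch is the `(void)` cast on the command's return value: a failure for an already-decommissioned VM is expected and must not abort the loop, since the remaining VMs still need to be decommissioned before crashkernel boot.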
>>> + ret = sev_do_cmd(SEV_CMD_SNP_DECOMMISSION, &data, NULL);
>>> + if (!ret) {
>> And what happens if SEV_CMD_SNP_DECOMMISSION fails?
As mentioned above, we can ignore the failure here, since the VM may already have been decommissioned.
In the case where SNP_DECOMMISSION fails for a VM that has not already been decommissioned, crashkernel boot will fail.
Thanks, Ashish