Message-ID: <49ef7e43-6a5d-452a-936b-87a573225d1e@amd.com>
Date: Wed, 16 Jul 2025 17:12:52 -0500
From: "Kalra, Ashish" <ashish.kalra@....com>
To: Vasant Hegde <vasant.hegde@....com>, joro@...tes.org,
suravee.suthikulpanit@....com, thomas.lendacky@....com,
Sairaj.ArunKodilkar@....com, herbert@...dor.apana.org.au
Cc: seanjc@...gle.com, pbonzini@...hat.com, will@...nel.org,
robin.murphy@....com, john.allen@....com, davem@...emloft.net, bp@...en8.de,
michael.roth@....com, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
linux-crypto@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v3 4/4] iommu/amd: Fix host kdump support for SNP
Hello Vasant,
On 7/16/2025 4:46 AM, Vasant Hegde wrote:
>
>
> On 7/16/2025 12:57 AM, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@....com>
>>
>> When a crash is triggered the kernel attempts to shut down SEV-SNP
>> using the SNP_SHUTDOWN_EX command. If active SNP VMs are present,
>> SNP_SHUTDOWN_EX fails as firmware checks all encryption-capable ASIDs
>> to ensure none are in use and that a DF_FLUSH is not required. If a
>> DF_FLUSH is required, the firmware returns DFFLUSH_REQUIRED, causing
>> SNP_SHUTDOWN_EX to fail.
>>
>> This causes the kdump kernel to boot with IOMMU SNP enforcement still
>> enabled, while the IOMMU completion wait buffers (CWBs), command buffers,
>> device tables and event buffer registers remain locked and exclusive
>> to the previous kernel. Attempts to allocate and use new buffers in
>> the kdump kernel fail, as the hardware ignores writes to the locked
>> MMIO registers (per AMD IOMMU spec Section 2.12.2.1).
>>
>> As a result, the kdump kernel cannot initialize the IOMMU or enable IRQ
>> remapping which is required for proper operation.
>>
>> This results in repeated "Completion-Wait loop timed out" errors and a
>> second kernel panic: "Kernel panic - not syncing: timer doesn't work
>> through Interrupt-remapped IO-APIC"
>>
>> The following MMIO registers are locked and ignore writes after failed
>> SNP shutdown:
>> Device Table Base Address Register
>> Command Buffer Base Address Register
>> Event Buffer Base Address Register
>> Completion Store Base Register/Exclusion Base Register
>> Completion Store Limit Register/Exclusion Range Limit Register
>>
>
> Maybe you can rephrase the description, as the first patch already covered
> some of these details.

We do need to include the complete description here, as this is the final
patch of the series, which fixes the kdump boot.

Note that the description in the first patch only mentions reuse of the
IOMMU buffers (command, CWB and event buffers), whereas this commit log
covers all of the reuse and remapping required: IOMMU buffers, device table,
etc.
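
To make the locked-register behaviour from the commit log concrete, here is a
minimal user-space sketch in C. It is only a model, not the driver code:
struct mmio_reg, mmio_write(), mmio_read() and setup_buffer() are hypothetical
names, and the "locked" flag stands in for the hardware ignoring writes while
SNP enforcement is still enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one IOMMU base-address MMIO register. */
struct mmio_reg {
	uint64_t value;   /* address programmed by the previous kernel */
	bool locked;      /* true while SNP enforcement is still active */
};

/* Model of the hardware: a write to a locked register is silently ignored. */
void mmio_write(struct mmio_reg *reg, uint64_t val)
{
	if (!reg->locked)
		reg->value = val;
}

uint64_t mmio_read(const struct mmio_reg *reg)
{
	return reg->value;
}

/*
 * Return the buffer address the kernel should use.
 * In a kdump boot, reuse whatever the previous kernel programmed instead
 * of writing a new address that a locked register would ignore.
 * Otherwise, program the newly allocated buffer and read it back.
 */
uint64_t setup_buffer(struct mmio_reg *reg, uint64_t new_buf, bool is_kdump)
{
	if (is_kdump)
		return mmio_read(reg);

	mmio_write(reg, new_buf);
	return mmio_read(reg);
}
```

In this model, calling setup_buffer() with is_kdump set returns the old
address, so the software and the (modelled) hardware agree on the buffer.
Writing a fresh address to a locked register leaves the old value in place,
which is the mismatch that the commit log describes.
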
>> Instead of allocating new buffers, re-use the previous kernel’s pages
>> for completion wait buffers, command buffers, event buffers and device
>> tables and operate with the already enabled SNP configuration and
>> existing data structures.
>>
>> This approach is now used for kdump boot regardless of whether SNP is
>> enabled during kdump.
>>
>> The fix enables successful crashkernel/kdump operation on SNP hosts
>> even when SNP_SHUTDOWN_EX fails.
>>
>> Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")
>
> I am not sure why you have marked only this patch as Fixes? Also it won't fix
> the kdump if someone just backports only this patch right?
>
As mentioned in the cover letter, this is the final patch of the series, which
actually fixes the SNP kdump boot, so I kept the Fixes: tag on this patch.
I am not sure whether I can add the Fixes: tag to all four patches in this series?
Thanks,
Ashish