[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b15c4495-3765-4af4-8cf1-38c64e8fa69e@amd.com>
Date: Wed, 26 Nov 2025 13:51:27 +0530
From: Vasant Hegde <vasant.hegde@....com>
To: Jason Gunthorpe <jgg@...dia.com>, Sairaj Kodilkar <sarunkod@....com>
Cc: iommu@...ts.linux.dev, linux-kernel@...r.kernel.org, joro@...tes.org,
suravee.suthikulpanit@....com, robin.murphy@....com, ashish.kalra@....com,
will@...nel.org
Subject: Re: [PATCH v2 1/2] amd/iommu: Preserve domain ids inside the kdump
kernel
On 11/24/2025 8:14 PM, Jason Gunthorpe wrote:
> On Fri, Nov 21, 2025 at 02:41:15PM +0530, Sairaj Kodilkar wrote:
>> Currently AMD IOMMU driver does not reserve domain ids programmed in the
>> DTE while reusing the device table inside kdump kernel. This can cause
>> reallocation of these domain ids for newer domains that are created by
>> the kdump kernel, which can lead to potential IO_PAGE_FAULTs
>>
>> Hence reserve these ids inside pdom_ids.
>>
>> Fixes: 38e5f33ee359 ("iommu/amd: Reuse device table for kdump")
>> Signed-off-by: Sairaj Kodilkar <sarunkod@....com>
>> Reported-by: Jason Gunthorpe <jgg@...dia.com>
>> Reviewed-by: Vasant Hegde <vasant.hegde@....com>
>> ---
>> drivers/iommu/amd/init.c | 23 ++++++++++++++++++++++-
>> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> This seems OK
>
> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
>
> But the a point of this work was to remove this code:
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 48bca4dc8eb61f..1cd799913cbcd6 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2024,7 +2024,6 @@ static void set_dte_entry(struct amd_iommu *iommu,
> phys_addr_t top_paddr, unsigned int top_level)
> {
> u16 domid;
> - u32 old_domid;
> struct dev_table_entry *initial_dte;
> struct dev_table_entry new = {};
> struct protection_domain *domain = dev_data->domain;
> @@ -2080,7 +2079,6 @@ static void set_dte_entry(struct amd_iommu *iommu,
> if (dev_data->ats_enabled)
> new.data[1] |= DTE_FLAG_IOTLB;
>
> - old_domid = READ_ONCE(dte->data[1]) & DEV_DOMID_MASK;
> new.data[1] |= domid;
>
> /*
> @@ -2096,15 +2094,6 @@ static void set_dte_entry(struct amd_iommu *iommu,
> set_dte_gcr3_table(iommu, dev_data, &new);
>
> update_dte256(iommu, dev_data, &new);
> -
> - /*
> - * A kdump kernel might be replacing a domain ID that was copied from
> - * the previous kernel--if so, it needs to flush the translation cache
> - * entries for the old domain ID that is being overwritten
> - */
> - if (old_domid) {
> - amd_iommu_flush_tlb_domid(iommu, old_domid);
> - }
> }
>
> /*
>
> Under the reasoning that:
> - domids in use by the prior kernel are reserved in the IDA and are
> never used by this kernel
It looks like on non-SNP system in kdump path, if it fails to reserve old DTE
table, it tries to allocate new one! Looking into code again, it may be OK to
fail in that path as well.. so that we will have common flow.
That needs to be fixed before removing old_domid check.
-Vasant
> - domids in the IDA must be clean
> - There is no reason to flush a domid until it is returned to the IDA
> - detach_device() calls amd_iommu_domain_flush_all() before the
> domain can be freed and the domid returned the IDA which clears the
> IOTLB
>
> Please add a patch?
>
> Jason
Powered by blists - more mailing lists