Message-ID: <de939eb6-4305-4dd9-92f5-346997df9357@amd.com>
Date: Thu, 15 Jan 2026 16:21:47 +0700
From: "Suthikulpanit, Suravee" <suravee.suthikulpanit@....com>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: jgg@...dia.com, linux-kernel@...r.kernel.org, robin.murphy@....com,
will@...nel.org, joro@...tes.org, kevin.tian@...el.com, jsnitsel@...hat.com,
vasant.hegde@....com, iommu@...ts.linux.dev, santosh.shukla@....com,
sairaj.arunkodilkar@....com, jon.grimm@....com, prashanthpra@...gle.com,
wvw@...gle.com, wnliu@...gle.com, gptran@...gle.com, kpsingh@...gle.com,
joao.m.martins@...cle.com, alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v5 11/14] iommu/amd: Introduce gDomID-to-hDomID Mapping
and handle parent domain invalidation
On 11/14/2025 3:36 AM, Nicolin Chen wrote:
>> @@ -92,7 +94,60 @@ amd_iommu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
>> .....
>> + /*
>> + * Normally, when a guest has multiple pass-through devices,
>> + * the IOMMU driver sets up DTEs with the same stage-2 table and
>> + * the same host domain ID (hDomID). With nested translation,
>> + * if the guest sets up different stage-1 tables with the same
>> + * PASID, the IOMMU would use the same TLB tag, resulting in
>> + * TLB aliasing.
>> + *
>> + * The guest assigns gDomIDs based on its own algorithm for managing
>> + * cache tags of (DomID, PASID). Within a single viommu, the nest parent
>> + * domain (w/ the S2 table) is used by all DTEs, but each gDomID must be
>> + * consistently mapped to a single hDomID. This is done using an xarray
>> + * in the vIOMMU to keep track of the gDomID mapping. When the S2 is
>> + * changed, the INVALIDATE_IOMMU_PAGES command must be issued for each
>> + * hDomID in the xarray.
>> + */
>> + curr = xa_cmpxchg(&aviommu->gdomid_array,
>> + ndom->gdom_id, NULL, gdom_info, GFP_ATOMIC);
>> + if (curr) {
>> + if (xa_err(curr)) {
>> + ret = -EINVAL;
>> + goto out_err_gdom_info;
>> + } else {
>> + /* The gDomID already exists */
>> + pr_debug("%s: Found gdom_id=%#x, hdom_id=%#x\n",
>> + __func__, ndom->gdom_id, curr->hdom_id);
>> + refcount_inc(&curr->users);
>> + ndom->gdom_info = curr;
> This looks racy..
>
> When a gDomID is shared between two nested domains, a concurrent
> nested_domain_free() could enter before refcount_inc(), and call
> refcount_dec_and_test() or even free the curr and ndom.
>
> Then, this refcount_inc() will blow up, or curr/ndom will UAF.
>
> Actually, I don't see where amd_iommu_alloc_domain_nested() gets
> used in this series.. I assume AMD will use the iommufd's vIOMMU
> infrastructure directly, which doesn't hold a mutex across nested
> domain allocation/free calls.
>
> So, the entire thing here should hold xa_lock(), use xas_load()
> for the existing curr and use xas_store() to store gdom_info if
> !curr, and xa_unlock() after gdom_info is fully initialized.
>
>> + kfree(gdom_info);
>> + return &ndom->domain;
>> + }
>> + }
>> +
>> + /* The gDomID does not exist. Allocate a new hdom_id */
>> + gdom_info->hdom_id = amd_iommu_pdom_id_alloc();
>> + if (gdom_info->hdom_id <= 0) {
>> + xa_cmpxchg(&aviommu->gdomid_array,
>> + ndom->gdom_id, gdom_info, NULL, GFP_ATOMIC);
>> + ret = -ENOSPC;
>> + goto out_err_gdom_info;
>> + }
>> +
>> + refcount_set(&gdom_info->users, 1);
> Similar risk here. gdom_info is stored to the xarray before this
> line. A concurrent amd_iommu_alloc_domain_nested() could get the
> stored gdom_info and blow up at refcount_inc().
>
> Make sure the entire thing is locked and safe.
>
> Nicolin
Thanks for pointing this out. I am fixing this in V6 with Jason's
suggestion to use xa_lock()/xa_unlock() with __xa_cmpxchg().
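
Roughly, a minimal sketch of that locking pattern (names and details
below are illustrative only, not the actual V6 code): do the lookup,
the refcount update, and the store under a single xa_lock(), and only
publish gdom_info once it is fully initialized.

	/* Allocate the hDomID before taking the xa_lock */
	hdom_id = amd_iommu_pdom_id_alloc();
	if (hdom_id <= 0) {
		ret = -ENOSPC;
		goto out_err_gdom_info;
	}

	xa_lock(&aviommu->gdomid_array);

	curr = xa_load(&aviommu->gdomid_array, ndom->gdom_id);
	if (curr) {
		/* gDomID already mapped; take a reference while still locked */
		refcount_inc(&curr->users);
		ndom->gdom_info = curr;
		xa_unlock(&aviommu->gdomid_array);
		/* Release the pre-allocated hDomID (assumed free helper) */
		amd_iommu_pdom_id_free(hdom_id);
		kfree(gdom_info);
		return &ndom->domain;
	}

	/* Fully initialize gdom_info before it becomes visible */
	gdom_info->hdom_id = hdom_id;
	refcount_set(&gdom_info->users, 1);

	curr = __xa_cmpxchg(&aviommu->gdomid_array, ndom->gdom_id,
			    NULL, gdom_info, GFP_ATOMIC);
	xa_unlock(&aviommu->gdomid_array);

	if (xa_is_err(curr)) {
		ret = xa_err(curr);
		goto out_err_gdom_info;
	}
	ndom->gdom_info = gdom_info;

With that, a concurrent free or alloc on the same gDomID can no longer
observe a half-initialized entry or drop the last reference between the
lookup and the refcount_inc().
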
Thanks,
Suravee