Message-ID: <20251119000217.GG120075@nvidia.com>
Date: Tue, 18 Nov 2025 20:02:17 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
linux-kernel@...r.kernel.org, robin.murphy@....com, will@...nel.org,
joro@...tes.org, kevin.tian@...el.com, jsnitsel@...hat.com,
vasant.hegde@....com, iommu@...ts.linux.dev, santosh.shukla@....com,
sairaj.arunkodilkar@....com, jon.grimm@....com,
prashanthpra@...gle.com, wvw@...gle.com, wnliu@...gle.com,
gptran@...gle.com, kpsingh@...gle.com, joao.m.martins@...cle.com,
alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v5 11/14] iommu/amd: Introduce gDomID-to-hDomID Mapping
and handle parent domain invalidation
On Thu, Nov 13, 2025 at 12:36:07PM -0800, Nicolin Chen wrote:
> > +	curr = xa_cmpxchg(&aviommu->gdomid_array,
> > +			  ndom->gdom_id, NULL, gdom_info, GFP_ATOMIC);
> > +	if (curr) {
> > +		if (xa_err(curr)) {
> > +			ret = -EINVAL;
> > +			goto out_err_gdom_info;
> > +		} else {
> > +			/* The gDomID already exist */
> > +			pr_debug("%s: Found gdom_id=%#x, hdom_id=%#x\n",
> > +				 __func__, ndom->gdom_id, curr->hdom_id);
> > +			refcount_inc(&curr->users);
> > +			ndom->gdom_info = curr;
>
> This looks racy..
Yes
> When a gDomID is shared between two nested domains, a concurrent
> nested_domain_free() could enter before refcount_inc(), and call
> refcount_dec_and_test() or even free the curr and ndom.
>
> Then, this refcount_inc() will blow up, or curr/ndom will UAF.
>
> Actually, I don't see where amd_iommu_alloc_domain_nested() gets
> used in this series.. I assume AMD will use the iommufd's vIOMMU
> infrastructure directly which doesn't mutex across nested domain
> allocation/free calls.
>
> So, the entire thing here should hold xa_lock(), use xas_load()
> for the existing curr and use xas_store() to store gdom_info if
> !curr, and xa_unlock() after gdom_info is fully initialized.
No need for xas functions.. You can use the __ functions..
A helper function like this will do the job:
static void *xa_load_or_alloc_locked(struct xarray *xa, unsigned long index, size_t sz)
{
	void *elm, *res;

	elm = xa_load(xa, index);
	if (elm)
		return elm;

	/* Not present; drop the lock to allocate, then recheck under it */
	xa_unlock(xa);
	elm = kzalloc(sz, GFP_KERNEL);
	xa_lock(xa);
	if (!elm)
		return ERR_PTR(-ENOMEM);

	res = __xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL);
	if (xa_is_err(res))
		res = ERR_PTR(xa_err(res));
	if (res) {
		/* Raced with another insertion (or got an error), drop ours */
		kfree(elm);
		return res;
	}
	return elm;
}
Call it like:

xa_lock(&aviommu->gdomid_array);
elm = xa_load_or_alloc_locked(..);
if (IS_ERR(elm))
	..
elm->refcount++;
xa_unlock(&aviommu->gdomid_array);
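
And the destroy side would take the same lock, so the last put and the
erase can't race with a concurrent lookup. Just a sketch, keeping the
plain counter and the names from the quoted patch:

xa_lock(&aviommu->gdomid_array);
if (--elm->refcount == 0) {
	/* Last user: remove it while still holding the lock (sketch,
	 * assuming the destroy path owns the final kfree) */
	__xa_erase(&aviommu->gdomid_array, ndom->gdom_id);
} else {
	elm = NULL;
}
xa_unlock(&aviommu->gdomid_array);
kfree(elm);	/* kfree(NULL) is a no-op */

Since both the lookup/inc and the dec/erase run under the xa_lock, a
concurrent nested_domain_free() either finishes before the lookup (so
the entry is already gone from the xarray) or runs after the refcount
bump (so it can't hit zero), which is the race Nicolin pointed at.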
Needs more bits if you want to use refcount_t.
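
E.g., a rough sketch of those bits, assuming the element comes out of
the helper's kzalloc() with the count at zero and that 'users' is the
refcount_t member from the patch:

xa_lock(&aviommu->gdomid_array);
elm = xa_load_or_alloc_locked(&aviommu->gdomid_array, ndom->gdom_id,
			      sizeof(*elm));
if (IS_ERR(elm)) {
	xa_unlock(&aviommu->gdomid_array);
	return PTR_ERR(elm);
}
if (refcount_read(&elm->users))
	refcount_inc(&elm->users);	/* existing entry */
else
	refcount_set(&elm->users, 1);	/* freshly kzalloc'd, count was 0 */
ndom->gdom_info = elm;
xa_unlock(&aviommu->gdomid_array);

The free path then does refcount_dec_and_test() under the same lock
before __xa_erase(), mirroring the plain-counter sketch above.
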
Jason