Message-ID: <aGLmq8D88mN5lkmN@Asurada-Nvidia>
Date: Mon, 30 Jun 2025 12:34:03 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>
CC: <jgg@...dia.com>, <jgg@...pe.ca>, <kevin.tian@...el.com>,
<will@...nel.org>, <aneesh.kumar@...nel.org>, <iommu@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>, <joro@...tes.org>, <robin.murphy@....com>,
<shuah@...nel.org>, <aik@....com>, <dan.j.williams@...el.com>,
<baolu.lu@...ux.intel.com>, <yilun.xu@...el.com>
Subject: Re: [PATCH v3 2/5] iommufd: Destroy vdevice on idevice destroy
On Fri, Jun 27, 2025 at 11:38:06AM +0800, Xu Yilun wrote:
> +static void iommufd_device_remove_vdev(struct iommufd_device *idev)
> +{
> + struct iommufd_vdevice *vdev;
> +
> + mutex_lock(&idev->igroup->lock);
> + /* vdev has been completely destroyed by userspace */
> + if (!idev->vdev)
> + goto out_unlock;
> +
> + vdev = iommufd_get_vdevice(idev->ictx, idev->vdev->obj.id);
> + if (IS_ERR(vdev)) {
> + /*
> + * vdev is removed from xarray by userspace, but is not
> + * destroyed/freed. Since iommufd_vdevice_abort() is reentrant,
> + * safe to destroy vdev here.
> + */
> + iommufd_vdevice_abort(&idev->vdev->obj);
> + goto out_unlock;
Is this case #3 from the commit log, i.e. a racing vdev destroy?
I think it is worth clarifying that this is a concurrent destroy:
/*
* An ongoing vdev destroy ioctl has removed the vdev from the
* object xarray but has not finished iommufd_vdevice_destroy()
* yet, as it is holding the same mutex. Destroy the vdev here,
* i.e. the iommufd_vdevice_destroy() will be a NOP once it is
* unlocked.
*/
> @@ -147,10 +183,12 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
> if (rc)
> goto out_abort;
> iommufd_object_finalize(ucmd->ictx, &vdev->obj);
> - goto out_put_idev;
> + goto out_unlock_igroup;
>
> out_abort:
> iommufd_object_abort_and_destroy(ucmd->ictx, &vdev->obj);
> +out_unlock_igroup:
> + mutex_unlock(&idev->igroup->lock);
Looks like we will have to partially revert the _ucmd allocator in
this function:
https://lore.kernel.org/all/107b24a3b791091bb09c92ffb0081c56c413b26d.1749882255.git.nicolinc@nvidia.com/
Please try fixing the conflicts on top of Jason's for-next tree.
Thanks
Nicolin