Message-ID: <BN9PR11MB5276A6F54C0391F72F3CFD7D8C7BA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Wed, 25 Jun 2025 02:11:40 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Xu Yilun <yilun.xu@...ux.intel.com>, "will@...nel.org" <will@...nel.org>,
"aneesh.kumar@...nel.org" <aneesh.kumar@...nel.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "joro@...tes.org" <joro@...tes.org>,
"robin.murphy@....com" <robin.murphy@....com>, "shuah@...nel.org"
<shuah@...nel.org>, "nicolinc@...dia.com" <nicolinc@...dia.com>,
"aik@....com" <aik@....com>, "Williams, Dan J" <dan.j.williams@...el.com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>, "Xu, Yilun"
<yilun.xu@...el.com>
Subject: RE: [PATCH v2 3/4] iommufd: Destroy vdevice on idevice destroy
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Wednesday, June 25, 2025 9:36 AM
>
> On Tue, Jun 24, 2025 at 11:57:31PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@...dia.com>
> > > Sent: Tuesday, June 24, 2025 10:54 PM
> > >
> > > On Mon, Jun 23, 2025 at 05:49:45PM +0800, Xu Yilun wrote:
> > > > +static void iommufd_device_remove_vdev(struct iommufd_device *idev)
> > > > +{
> > > > +	bool vdev_removing = false;
> > > > +
> > > > +	mutex_lock(&idev->igroup->lock);
> > > > +	if (idev->vdev) {
> > > > +		struct iommufd_vdevice *vdev;
> > > > +
> > > > +		vdev = iommufd_get_vdevice(idev->ictx, idev->vdev->obj.id);
> > > > +		if (IS_ERR(vdev)) {
> > >
> > > This incrs obj.users which will cause a concurrent
> > > iommufd_object_remove() to fail with -EBUSY, which we are trying to
> > > avoid.
> >
> > concurrent remove means a user-initiated IOMMU_DESTROY, for which
> > failing with -EBUSY is expected as it doesn't wait for shortterm?
>
> Yes a user IOMMU_DESTROY of the vdevice should not have a transient
> EBUSY failure. Avoiding that is the purpose of the shorterm_users
> mechanism.

hmm my understanding is the opposite.

currently iommufd_destroy() doesn't set REMOVE_WAIT_SHORTTERM:

	static int iommufd_destroy(struct iommufd_ucmd *ucmd)
	{
		struct iommu_destroy *cmd = ucmd->cmd;

		return iommufd_object_remove(ucmd->ictx, NULL, cmd->id, 0);
	}

so it's natural for IOMMU_DESTROY to hit a transient EBUSY when a parallel
ioctl is being executed on the object being destroyed:

	if (!refcount_dec_if_one(&obj->users)) {
		ret = -EBUSY;
		goto err_xa;
	}

idevice unbind is just a similar (but indirect) transient race to
IOMMU_DESTROY.

waiting on shortterm_users is more for kernel destroy.
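
To illustrate the transient EBUSY, here is a minimal userspace sketch of the "decrement only if the count is exactly one" check. It is not kernel code: struct model_obj and the model_*() helpers are hypothetical names, and the C11 compare-and-exchange only approximates what refcount_dec_if_one() does for the obj->users count.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel object; only "users" is modeled. */
struct model_obj {
	atomic_int users;
};

/* Approximation of refcount_dec_if_one(): succeeds only for 1 -> 0. */
static bool model_dec_if_one(atomic_int *r)
{
	int expected = 1;

	return atomic_compare_exchange_strong(r, &expected, 0);
}

/* Models a destroy with flags == 0: no waiting, fail if anyone else holds it. */
static int model_destroy_nowait(struct model_obj *obj)
{
	if (!model_dec_if_one(&obj->users))
		return -EBUSY;
	return 0;
}

/* Model a parallel ioctl taking and then dropping a reference. */
static void model_get(struct model_obj *obj)
{
	atomic_fetch_add(&obj->users, 1);
}

static void model_put(struct model_obj *obj)
{
	atomic_fetch_sub(&obj->users, 1);
}

/* Walks through the race: destroy fails while the extra reference is held. */
void model_demo(void)
{
	struct model_obj obj;

	atomic_init(&obj.users, 1);

	model_get(&obj);				/* parallel ioctl in flight */
	assert(model_destroy_nowait(&obj) == -EBUSY);	/* transient failure */

	model_put(&obj);				/* ioctl finished */
	assert(model_destroy_nowait(&obj) == 0);	/* retry succeeds */
}
```

In this model the failure is transient in the sense that a retry after the other holder drops its reference succeeds; nothing in the no-wait path blocks.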
>
> > > Also you can hit a race where the tombstone has NULL'd the entry but
> > > the racing destroy will then load the NULL with xas_load() and hit this:
> > >
> > > if (WARN_ON(obj != to_destroy)) {
> >
> > IOMMU_DESTROY doesn't provide to_destroy.
>
> Right, but IOMMU_DESTROY thread could have already gone past the
> xa_store(NULL) and then the kernel destroy thread could reach the
> above WARN as it does use to_destroy.
>
If IOMMU_DESTROY has already gone past xa_store(NULL), there are
two scenarios:

1) vdevice has been completely destroyed, with idev->vdev set to NULL.
   In that case iommufd_device_remove_vdev() is a nop.

2) vdevice destroy has not completed, with idev->vdev still valid.
   In that case iommufd_get_vdevice() fails with vdev_removing set.
   Then iommufd_device_remove_vdev() will wait for idev->vdev to
   become NULL instead of calling iommufd_object_tombstone_user().

so the said race won't happen. 😊
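
The case analysis above can be written down as a small decision function. This is only a userspace sketch of the reasoning, not the patch itself: struct model_idev, struct model_vdev, the id_in_xarray flag (standing in for "iommufd_get_vdevice() would succeed") and the enum names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the kernel-side removal would do in each case (hypothetical names). */
enum remove_action {
	REMOVE_NOP,		/* vdevice already fully destroyed */
	REMOVE_WAIT_FOR_CLEAR,	/* user destroy in flight: wait for idev->vdev == NULL */
	REMOVE_TOMBSTONE,	/* no racing destroy: kernel tombstones the object */
};

struct model_vdev {
	bool id_in_xarray;	/* false once the racing destroy stored NULL */
};

struct model_idev {
	struct model_vdev *vdev;
};

/* Picks the action; assumes the caller holds the lock protecting idev->vdev. */
enum remove_action model_remove_vdev(const struct model_idev *idev)
{
	if (!idev->vdev)
		return REMOVE_NOP;			/* scenario 1 */
	if (!idev->vdev->id_in_xarray)
		return REMOVE_WAIT_FOR_CLEAR;		/* scenario 2 */
	return REMOVE_TOMBSTONE;			/* no race at all */
}

/* Exercises the three cases once each. */
void model_remove_demo(void)
{
	struct model_vdev gone = { .id_in_xarray = false };
	struct model_vdev live = { .id_in_xarray = true };
	struct model_idev idev = { .vdev = NULL };

	assert(model_remove_vdev(&idev) == REMOVE_NOP);
	idev.vdev = &gone;
	assert(model_remove_vdev(&idev) == REMOVE_WAIT_FOR_CLEAR);
	idev.vdev = &live;
	assert(model_remove_vdev(&idev) == REMOVE_TOMBSTONE);
}
```

In this model the tombstone path is reached only when the lookup would still succeed, which is why a destroy that has already passed xa_store(NULL) never leads to it.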