Message-ID: <aGOoCa3BkV7KrwVz@yilunxu-OptiPlex-7050>
Date: Tue, 1 Jul 2025 17:19:05 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: "Tian, Kevin" <kevin.tian@...el.com>,
"will@...nel.org" <will@...nel.org>,
"aneesh.kumar@...nel.org" <aneesh.kumar@...nel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"joro@...tes.org" <joro@...tes.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"shuah@...nel.org" <shuah@...nel.org>,
"nicolinc@...dia.com" <nicolinc@...dia.com>,
"aik@....com" <aik@....com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"Xu, Yilun" <yilun.xu@...el.com>
Subject: Re: [PATCH v3 2/5] iommufd: Destroy vdevice on idevice destroy
On Mon, Jun 30, 2025 at 11:50:51AM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 30, 2025 at 06:18:50PM +0800, Xu Yilun wrote:
>
> > I need to reconsider this, seems we need a dedicated vdev lock to
> > synchronize concurrent vdev abort/destroy.
>
> It is not possible to be concurrent
>
> destroy is only called once after it is no longer possible to call
> abort.

I'm about to drop the "abort twice" idea [1].

[1]: https://lore.kernel.org/linux-iommu/20250625123832.GF167785@nvidia.com/

Consider the two flows below:

T1. iommufd_device_unbind(idev)
      iommufd_device_destroy(obj)
        mutex_lock(&idev->igroup->lock)
        iommufd_vdevice_abort(idev->vdev.obj)
        mutex_unlock(&idev->igroup->lock)
      kfree(obj)

T2. iommufd_destroy(vdev_id)
      iommufd_vdevice_destroy(obj)
        mutex_lock(&vdev->idev->igroup->lock)
        iommufd_vdevice_abort(obj)
        mutex_unlock(&vdev->idev->igroup->lock)
      kfree(obj)

iommufd_vdevice_destroy() accesses vdev->idev->igroup->lock, but the
idev may already have been freed by that time:

T1 (idev unbind)                        T2 (vdev destroy)

                                        iommufd_destroy(vdev_id)
                                        iommufd_vdevice_destroy(obj)
iommufd_device_unbind(idev)
iommufd_device_destroy(obj)
mutex_lock(&idev->igroup->lock)
                                        mutex_lock(&vdev->idev->igroup->lock) (wait)
iommufd_vdevice_abort(idev->vdev.obj)
mutex_unlock(&idev->igroup->lock)
kfree(obj)
                                        mutex_lock(&vdev->idev->igroup->lock) (PANIC)
                                        iommufd_vdevice_abort(obj)
                                        ...

We also can't replace the idev-side lock with a vdev-side lock, because
the vdev could likewise be freed right after iommufd_vdevice_destroy().

I think the only simple way is to make idev destruction wait for vdev
destruction, which goes back to the wait_event idea ...
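
To illustrate the ordering I have in mind, here is a rough userspace
analog using pthreads. It is only a sketch under assumptions, not a
proposed patch: struct idev, struct vdev, has_vdev, vdev_gone and all
function names are made up for the illustration, and the event counters
exist only to make the ordering observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real iommufd structures. */
struct idev {
	pthread_mutex_t lock;     /* plays the role of idev->igroup->lock */
	pthread_cond_t vdev_gone; /* plays the role of a wait queue */
	bool has_vdev;            /* true while a vdev still points at us */
};

struct vdev {
	struct idev *idev;
};

/* Illustrative trace only; both are written under idev->lock. */
static int event_seq;
static int vdev_detached_at, idev_freed_at;

struct idev *idev_create(void)
{
	struct idev *idev = calloc(1, sizeof(*idev));

	if (!idev)
		return NULL;
	pthread_mutex_init(&idev->lock, NULL);
	pthread_cond_init(&idev->vdev_gone, NULL);
	return idev;
}

struct vdev *vdev_create(struct idev *idev)
{
	struct vdev *vdev = calloc(1, sizeof(*vdev));

	if (!vdev)
		return NULL;
	vdev->idev = idev;
	pthread_mutex_lock(&idev->lock);
	idev->has_vdev = true;
	pthread_mutex_unlock(&idev->lock);
	return vdev;
}

/* T2 side: safe to take idev->lock because idev_destroy() waits for us. */
void vdev_destroy(struct vdev *vdev)
{
	struct idev *idev = vdev->idev;

	pthread_mutex_lock(&idev->lock);
	assert(idev->has_vdev);
	/* the abort work would happen here, under the lock */
	idev->has_vdev = false;
	vdev_detached_at = ++event_seq;
	pthread_cond_broadcast(&idev->vdev_gone);
	pthread_mutex_unlock(&idev->lock);
	/* idev must not be touched past this point; it may be freed now */
	free(vdev);
}

/* T1 side: block until the vdev is gone, only then free the idev. */
void idev_destroy(struct idev *idev)
{
	pthread_mutex_lock(&idev->lock);
	while (idev->has_vdev)
		pthread_cond_wait(&idev->vdev_gone, &idev->lock);
	idev_freed_at = ++event_seq;
	pthread_mutex_unlock(&idev->lock);

	pthread_cond_destroy(&idev->vdev_gone);
	pthread_mutex_destroy(&idev->lock);
	free(idev);
}

/* Thread entry wrapper so vdev_destroy() can run concurrently. */
void *vdev_destroy_thread(void *arg)
{
	vdev_destroy(arg);
	return NULL;
}
```

With this ordering the vdev side never sees a freed idev, whichever
thread runs first. In the kernel the condition variable would roughly
correspond to a wait queue with wait_event() on the idev side and
wake_up() on the vdev side, but how that fits the existing iommufd
object refcounting is not shown here.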
Thanks,
Yilun