Message-ID: <Zr5bENKAYJTvwEBJ@Asurada-Nvidia>
Date: Thu, 15 Aug 2024 12:46:24 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: <kevin.tian@...el.com>, <will@...nel.org>, <joro@...tes.org>,
<suravee.suthikulpanit@....com>, <robin.murphy@....com>,
<dwmw2@...radead.org>, <baolu.lu@...ux.intel.com>, <shuah@...nel.org>,
<linux-kernel@...r.kernel.org>, <iommu@...ts.linux.dev>,
<linux-arm-kernel@...ts.infradead.org>, <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH v1 05/16] iommufd/viommu: Add
IOMMU_VIOMMU_SET/UNSET_VDEV_ID ioctl
On Thu, Aug 15, 2024 at 04:08:48PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 07, 2024 at 01:10:46PM -0700, Nicolin Chen wrote:
>
> > +int iommufd_viommu_set_vdev_id(struct iommufd_ucmd *ucmd)
> > +{
> > + struct iommu_viommu_set_vdev_id *cmd = ucmd->cmd;
> > + struct iommufd_hwpt_nested *hwpt_nested;
> > + struct iommufd_vdev_id *vdev_id, *curr;
> > + struct iommufd_hw_pagetable *hwpt;
> > + struct iommufd_viommu *viommu;
> > + struct iommufd_device *idev;
> > + int rc = 0;
> > +
> > + if (cmd->vdev_id > ULONG_MAX)
> > + return -EINVAL;
> > +
> > + idev = iommufd_get_device(ucmd, cmd->dev_id);
> > + if (IS_ERR(idev))
> > + return PTR_ERR(idev);
> > + hwpt = idev->igroup->hwpt;
> > +
> > + if (hwpt == NULL || hwpt->obj.type != IOMMUFD_OBJ_HWPT_NESTED) {
> > + rc = -EINVAL;
> > + goto out_put_idev;
> > + }
> > + hwpt_nested = container_of(hwpt, struct iommufd_hwpt_nested, common);
>
> This doesn't seem like a necessary check; the attached hwpt can change
> after this is established, so this can't be an invariant we enforce.
>
> If you want to do 1:1, then somehow directly check whether the idev is
> already linked to a viommu.
But an idev can't link to a viommu without a proxy hwpt_nested, can it?
Even a stage-2-only configuration should have an identity hwpt_nested,
right?
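
If we do drop the hwpt check, would something like this be closer to
what you have in mind? Untested sketch, and "idev->viommu" is a made-up
field that isn't in this series; it's only there to illustrate a direct
1:1 idev<->viommu check:

	/* Hypothetical: record the link on the idev itself, so a second
	 * SET_VDEV_ID against the same device fails outright.
	 */
	if (idev->viommu) {
		rc = -EEXIST;
		goto out_put_idev;
	}
	idev->viommu = viommu;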
> > +static struct device *
> > +iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id)
> > +{
> > + struct iommufd_vdev_id *vdev_id;
> > +
> > + xa_lock(&viommu->vdev_ids);
> > + vdev_id = xa_load(&viommu->vdev_ids, (unsigned long)id);
> > + xa_unlock(&viommu->vdev_ids);
>
> This lock doesn't do anything
>
> > + if (!vdev_id || vdev_id->vdev_id != id)
> > + return NULL;
>
> And this is unlocked
>
> > + return vdev_id->dev;
> > +}
>
> This isn't good. We can't return the struct device pointer here, as
> there is no locking for it anymore; we can't even know it is still
> probed to VFIO.
>
> It has to work by having the iommu driver directly access the xarray:
> entirely under the spinlock, the iommu driver can translate the vSID
> to the pSID, then let go of the lock and push the invalidation to HW.
> No races.
Maybe the iommufd_viommu_invalidate ioctl handler should hold that
xa_lock around the viommu->ops->cache_invalidate call, and then we
could add a lockdep assert in iommufd_viommu_find_device?
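
Roughly like this. Untested sketch; the cache_invalidate arguments are
placeholders since its exact signature isn't in this hunk:

	/* Hold the xa_lock across the driver callback so vSID->pSID
	 * lookups stay stable for the entire invalidation.
	 */
	xa_lock(&viommu->vdev_ids);
	rc = viommu->ops->cache_invalidate(viommu, array);
	xa_unlock(&viommu->vdev_ids);

and then in the helper:

	static struct device *
	iommufd_viommu_find_device(struct iommufd_viommu *viommu, u64 id)
	{
		struct iommufd_vdev_id *vdev_id;

		lockdep_assert_held(&viommu->vdev_ids.xa_lock);

		vdev_id = xa_load(&viommu->vdev_ids, (unsigned long)id);
		if (!vdev_id || vdev_id->vdev_id != id)
			return NULL;
		return vdev_id->dev;
	}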
> > +int iommufd_viommu_unset_vdev_id(struct iommufd_ucmd *ucmd)
> > +{
> > + struct iommu_viommu_unset_vdev_id *cmd = ucmd->cmd;
> > + struct iommufd_vdev_id *vdev_id;
> > + struct iommufd_viommu *viommu;
> > + struct iommufd_device *idev;
> > + int rc = 0;
> > +
> > + idev = iommufd_get_device(ucmd, cmd->dev_id);
> > + if (IS_ERR(idev))
> > + return PTR_ERR(idev);
> > +
> > + viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
> > + if (IS_ERR(viommu)) {
> > + rc = PTR_ERR(viommu);
> > + goto out_put_idev;
> > + }
> > +
> > + if (idev->dev != iommufd_viommu_find_device(viommu, cmd->vdev_id)) {
>
> Swap the order around != to be more kernely
Ack.
> > + rc = -EINVAL;
> > + goto out_put_viommu;
> > + }
> > +
> > + vdev_id = xa_erase(&viommu->vdev_ids, cmd->vdev_id);
>
> And this whole thing needs to be done under the xa_lock too.
>
> xa_lock(&viommu->vdev_ids);
> vdev_id = xa_load(&viommu->vdev_ids, cmd->vdev_id);
> if (!vdev_id || vdev_id->vdev_id != cmd->vdev_id (????) || vdev_id->dev != idev->dev)
> err
> __xa_erase(&viommu->vdev_ids, cmd->vdev_id);
> xa_unlock(&viommu->vdev_ids);
I've changed it to xa_cmpxchg() in my local tree. Would that be simpler?
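
Roughly (paraphrasing what I have locally; the error codes here are
placeholders):

	old = xa_load(&viommu->vdev_ids, cmd->vdev_id);
	if (!old || old->dev != idev->dev) {
		rc = -EINVAL;
		goto out_put_viommu;
	}
	/* Erase only if the entry is still "old"; xa_cmpxchg() takes
	 * the xa_lock internally, so a concurrent unset loses the race
	 * cleanly instead of double-freeing.
	 */
	if (xa_cmpxchg(&viommu->vdev_ids, cmd->vdev_id, old, NULL,
		       GFP_KERNEL) != old) {
		rc = -EBUSY;
		goto out_put_viommu;
	}
	kfree(old);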
Thanks
Nicolin