Open Source and information security mailing list archives
 
Message-ID: <Zr6bpbc0HZ8xLVZw@Asurada-Nvidia>
Date: Thu, 15 Aug 2024 17:21:57 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: <kevin.tian@...el.com>, <will@...nel.org>, <joro@...tes.org>,
	<suravee.suthikulpanit@....com>, <robin.murphy@....com>,
	<dwmw2@...radead.org>, <baolu.lu@...ux.intel.com>, <shuah@...nel.org>,
	<linux-kernel@...r.kernel.org>, <iommu@...ts.linux.dev>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH v1 05/16] iommufd/viommu: Add
 IOMMU_VIOMMU_SET/UNSET_VDEV_ID ioctl

On Thu, Aug 15, 2024 at 08:41:19PM -0300, Jason Gunthorpe wrote:
> On Thu, Aug 15, 2024 at 12:46:24PM -0700, Nicolin Chen wrote:
> > On Thu, Aug 15, 2024 at 04:08:48PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 07, 2024 at 01:10:46PM -0700, Nicolin Chen wrote:
> > > 
> > > > +int iommufd_viommu_set_vdev_id(struct iommufd_ucmd *ucmd)
> > > > +{
> > > > +	struct iommu_viommu_set_vdev_id *cmd = ucmd->cmd;
> > > > +	struct iommufd_hwpt_nested *hwpt_nested;
> > > > +	struct iommufd_vdev_id *vdev_id, *curr;
> > > > +	struct iommufd_hw_pagetable *hwpt;
> > > > +	struct iommufd_viommu *viommu;
> > > > +	struct iommufd_device *idev;
> > > > +	int rc = 0;
> > > > +
> > > > +	if (cmd->vdev_id > ULONG_MAX)
> > > > +		return -EINVAL;
> > > > +
> > > > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > > > +	if (IS_ERR(idev))
> > > > +		return PTR_ERR(idev);
> > > > +	hwpt = idev->igroup->hwpt;
> > > > +
> > > > +	if (hwpt == NULL || hwpt->obj.type != IOMMUFD_OBJ_HWPT_NESTED) {
> > > > +		rc = -EINVAL;
> > > > +		goto out_put_idev;
> > > > +	}
> > > > +	hwpt_nested = container_of(hwpt, struct iommufd_hwpt_nested, common);
> > > 
> > > This doesn't seem like a necessary check, the attached hwpt can change
> > > after this is established, so this can't be an invariant we enforce.
> > > 
> > > If you want to do 1:1 then somehow directly check if the idev is
> > > already linked to a viommu.
> > 
> > But idev can't link to a viommu without a proxy hwpt_nested?
> 
> Why not? The idev becomes linked to the viommu when the dev id is set.
> 
> Unless we are also going to enforce the idev is always attached to a
> nested then I don't think we need to check it here.
> 
> Things will definitely not entirely work as expected if the vdev is
> directly attached to the s2 or a blocking, but it won't harm anything.

My view is that the moment there is a VIOMMU object, we must be
in a nested IOMMU case, so there must be a nested hwpt. A
blocking domain would be a hwpt_nested too (vSTE=Abort), as we
previously concluded.

Then, in a nested case, it feels odd for an idev to be attached
directly to an S2 hwpt.

That being said, I think we can still support that with validations:
 If idev->hwpt is nested, compare the input viommu vs. idev->hwpt->viommu.
 If idev->hwpt is paging, compare the input viommu->hwpt vs. idev->hwpt.
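Those two checks could be sketched as below. This is a hypothetical,
simplified userspace model, not the real iommufd code: the struct
layouts and validate_vdev_viommu() are invented stand-ins that only
capture the comparison logic:

```c
#include <stddef.h>

enum hwpt_type { HWPT_PAGING, HWPT_NESTED };

struct viommu;

struct hwpt {
	enum hwpt_type type;
	struct viommu *viommu;		/* valid when type == HWPT_NESTED */
};

struct viommu {
	struct hwpt *s2_hwpt;		/* the paging hwpt backing this viommu */
};

/* 0 if idev's currently attached hwpt is consistent with @viommu */
static int validate_vdev_viommu(struct hwpt *idev_hwpt, struct viommu *viommu)
{
	if (!idev_hwpt)
		return -1;
	if (idev_hwpt->type == HWPT_NESTED)
		return idev_hwpt->viommu == viommu ? 0 : -1;
	/* paging case: idev must sit on this viommu's own S2 */
	return viommu->s2_hwpt == idev_hwpt ? 0 : -1;
}
```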

> > the stage-2 only configuration should have an identity hwpt_nested
> > right?
> 
> Yes, that is the right way to use the API
> 
> > > It has to work by having the iommu driver directly access the xarray
> > > and the entirely under the spinlock the iommu driver can translate the
> > > vSID to the pSID and the let go and push the invalidation to HW. No
> > > races.
> > 
> > Maybe the iommufd_viommu_invalidate ioctl handler should hold that
> > xa_lock around the viommu->ops->cache_invalidate, and then add lock
> > assert in iommufd_viommu_find_device?
> 
> That doesn't seem like a great idea, you can't do copy_from_user under
> a spinlock.
> 
> > > xa_lock(&viommu->vdev_ids);
> > > vdev_id = xa_load(&viommu->vdev_ids, cmd->vdev_id);
> > > if (!vdev_id || vdev_id->vdev_id != cmd->vdev_id (????) || vdev_id->dev != idev->dev)
> > >     err
> > > __xa_erase(&viommu->vdev_ids, cmd->vdev_id);
> > > xa_unlock((&viommu->vdev_ids);
> > 
> > I've changed to xa_cmpxchg() in my local tree. Would it be simpler?
> 
> No, that is still not right, you can't take the vdev_id outside the
> lock at all. Even for cmpxchng because the vdev_id could have been
> freed and reallocated by another thread.
> 
> You must combine the validation of the vdev_id with the erase under a
> single critical region.

Yea, we need a wider lock to keep the vdev_id list and the data
it points to unchanged. I'll try the rw semaphore that you
suggested in the other mail.
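For the unset path, the single-critical-section rule might look like
this rough userspace model (a pthread mutex standing in for xa_lock,
a fixed array standing in for the xarray; unset_vdev_id() and the
struct here are made-up names, not the real code):

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_VDEV_IDS 8

struct vdev_id {
	unsigned long id;
	void *dev;
};

static pthread_mutex_t vdev_ids_lock = PTHREAD_MUTEX_INITIALIZER;
static struct vdev_id *vdev_ids[MAX_VDEV_IDS];	/* stand-in for the xarray */

/*
 * Lookup, validation against the caller's device, and erase all happen
 * inside one critical section, so another thread cannot free and
 * reallocate the entry between the check and the erase -- the race that
 * makes a lookup-then-cmpxchg scheme unsafe.
 *
 * Returns the erased entry for the caller to free, or NULL on mismatch.
 */
static struct vdev_id *unset_vdev_id(unsigned long id, void *dev)
{
	struct vdev_id *curr = NULL;

	if (id >= MAX_VDEV_IDS)
		return NULL;
	pthread_mutex_lock(&vdev_ids_lock);
	if (vdev_ids[id] && vdev_ids[id]->dev == dev) {
		curr = vdev_ids[id];
		vdev_ids[id] = NULL;	/* __xa_erase() equivalent */
	}
	pthread_mutex_unlock(&vdev_ids_lock);
	return curr;	/* freed by the caller, outside the lock */
}
```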

This complicates things overall, especially with the VIRQ work
that polls vdev_id from interrupt context, where a semaphore or
mutex won't fit very well. Perhaps it would need a driver-level
bottom-half routine to call those helpers with the locks held. I
am glad that you noticed the problem early.
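A bottom-half split could be modeled roughly like this (a purely
illustrative userspace sketch; the queue, names, and locking are all
assumptions): the "top half" only records the raw vSID under a
non-sleeping lock, and a schedulable "bottom half" later does the
vSID lookup under the sleeping lock protecting the vdev_id list:

```c
#include <pthread.h>

#define QLEN 16

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER; /* spinlock stand-in */
static unsigned long q[QLEN];
static int q_head, q_tail;

/* "interrupt context": never takes the sleeping vdev_id lock */
static int virq_top_half(unsigned long vsid)
{
	int queued = 0;

	pthread_mutex_lock(&q_lock);
	if ((q_head + 1) % QLEN != q_tail) {
		q[q_head] = vsid;
		q_head = (q_head + 1) % QLEN;
		queued = 1;
	}
	pthread_mutex_unlock(&q_lock);
	return queued;
}

/* "bottom half": runs in a schedulable context, so after dequeuing it
 * could take the rwsem protecting the vdev_id list to resolve the vSID */
static int virq_bottom_half(unsigned long *vsid)
{
	int got = 0;

	pthread_mutex_lock(&q_lock);
	if (q_tail != q_head) {
		*vsid = q[q_tail];
		q_tail = (q_tail + 1) % QLEN;
		got = 1;
	}
	pthread_mutex_unlock(&q_lock);
	/* ...the sleeping-lock vdev_id lookup would go here... */
	return got;
}
```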

Thanks!
Nicolin
