lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aCYvjeZW+7NmUtoE@Asurada-Nvidia>
Date: Thu, 15 May 2025 11:16:45 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: <kevin.tian@...el.com>, <corbet@....net>, <will@...nel.org>,
	<bagasdotme@...il.com>, <robin.murphy@....com>, <joro@...tes.org>,
	<thierry.reding@...il.com>, <vdumpa@...dia.com>, <jonathanh@...dia.com>,
	<shuah@...nel.org>, <jsnitsel@...hat.com>, <nathan@...nel.org>,
	<peterz@...radead.org>, <yi.l.liu@...el.com>, <mshavit@...gle.com>,
	<praan@...gle.com>, <zhangzekun11@...wei.com>, <iommu@...ts.linux.dev>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-tegra@...r.kernel.org>,
	<linux-kselftest@...r.kernel.org>, <patches@...ts.linux.dev>,
	<mochs@...dia.com>, <alok.a.tiwari@...cle.com>, <vasant.hegde@....com>
Subject: Re: [PATCH v4 11/23] iommufd/viommu: Add IOMMUFD_CMD_HW_QUEUE_ALLOC
 ioctl

On Thu, May 15, 2025 at 01:06:20PM -0300, Jason Gunthorpe wrote:
> On Thu, May 08, 2025 at 08:02:32PM -0700, Nicolin Chen wrote:
> > +/**
> > + * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
> > + * @size: sizeof(struct iommu_hw_queue_alloc)
> > + * @flags: Must be 0
> > + * @viommu_id: Virtual IOMMU ID to associate the HW queue with
> > + * @type: One of enum iommu_hw_queue_type
> > + * @index: The logical index to the HW queue per virtual IOMMU for a multi-queue
> > + *         model
> > + * @out_hw_queue_id: The ID of the new HW queue
> > + * @base_addr: Base address of the queue memory in guest physical address space
> > + * @length: Length of the queue memory in the guest physical address space
> > + *
> > + * Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, which
> > + * allows HW to access a guest queue memory described by @base_addr and @length.
> > + * Upon success, the underlying physical pages of the guest queue memory will be
> > + * pinned to prevent VMM from unmapping them in the IOAS until the HW queue gets
> > + * destroyed.
> 
> Do we have way to make the pinning optional?
> 
> As I understand AMD's system the iommu HW itself translates the
> base_addr through the S2 page table automatically, so it doesn't need
> pinned memory and physical addresses but just the IOVA.

Right. That's why we invented a flag, and it should be probably
extended to cover the pin step as well.

> Perhaps for this reason the pinning should be done with a function
> call from the driver?

But the whole point of doing in the core was to avoid the entire
iopt related structure/function from being exposed to the driver,
which would otherwise hugely increase the size of the driver.o?

> > +struct iommu_hw_queue_alloc {
> > +	__u32 size;
> > +	__u32 flags;
> > +	__u32 viommu_id;
> > +	__u32 type;
> > +	__u32 index;
> > +	__u32 out_hw_queue_id;
> > +	__aligned_u64 base_addr;
> 
> base addr should probably be called nesting_parent_iova  to match how
> we described the viommu hwpt:
> 
>  * @hwpt_id: ID of a nesting parent HWPT to associate to

Ack.

> > +	/*
> > +	 * The underlying physical pages must be pinned to prevent them from
> > +	 * being unmapped (via IOMMUFD_CMD_IOAS_UNMAP) during the life cycle
> > +	 * of the HW QUEUE object.
> > +	 */
> > +	rc = iopt_pin_pages(&hwpt->ioas->iopt, cmd->base_addr, cmd->length,
> > +			    pages, 0);
> 
> I don't think this actually works like this without an unmap
> callback. unmap will break:
> 
> 			iommufd_access_notify_unmap(iopt, area_first, length);
> 			/* Something is not responding to unmap requests. */
> 			tries++;
> 			if (WARN_ON(tries > 100))
> 				return -EDEADLOCK;
> 
> If it can't shoot down the pinning.

Hmm, I thought we want the unmap to fail until VMM releases the HW
QUEUE first? In what case, does VMM wants to unmap while holding
the queue pages?

> Why did this need to change away from just a normal access? That ops
> and unmap callback are not optional things.
> 
> What vcmdq would do in the unmap callback is question I'm not sure
> of..
> 
> This is more reason to put the pin/access in the driver so it can
> provide an unmap callback that can fix it up.

As there are two types of "access" here.. you mean iommufd_access,
i.e. vcmdq driver should hold an iommufd_access like an emulated
vfio device driver?

> I think this should have
> been done just by using the normal access mechanism, maybe with a
> simplifying wrapper for in-driver use. ie no need for patch #9

Still, the driver.o file will be very large, unlike VFIO that just
depends on CONFIG_IOMMUFD?

Thanks
Nicolin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ