Message-ID: <2127b181-2c3a-4470-9b79-b508a18275c9@linux.ibm.com>
Date: Tue, 3 Feb 2026 21:22:13 +0530
From: Shivaprasad G Bhat <sbhat@...ux.ibm.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
kvm@...r.kernel.org, iommu@...ts.linux.dev, chleroy@...nel.org,
mpe@...erman.id.au, maddy@...ux.ibm.com, npiggin@...il.com,
alex@...zbot.org, joerg.roedel@....com, kevin.tian@...el.com,
gbatra@...ux.ibm.com, clg@...d.org, vaibhav@...ux.ibm.com,
brking@...ux.vnet.ibm.com, nnmlinux@...ux.ibm.com,
amachhiw@...ux.ibm.com, tpearson@...torengineering.com
Subject: Re: [RFC PATCH] powerpc: iommu: Initial IOMMUFD support for PPC64
Hi Jason,
Thanks for reviewing this patch.
On 1/28/26 12:46 AM, Jason Gunthorpe wrote:
> On Tue, Jan 27, 2026 at 06:35:56PM +0000, Shivaprasad G Bhat wrote:
>> The RFC attempts to implement the IOMMUFD support on PPC64 by
>> adding new iommu_ops for paging domain. The existing platform
>> domain continues to be the default domain for in-kernel use.
> It would be nice to see the platform domain go away and ppc use the
> normal dma-iommu.c stuff, but I don't think it is critical to making
> it work with iommufd.
I agree. I have started on this and will send the incremental changes
as a follow-up to this patch.
>> On PPC64, IOVA ranges are based on the type of the DMA window
>> and their properties. Currently, there is no way to expose the
>> attributes of the non-default 64-bit DMA window, which the platform
>> supports. The platform allows the operating system to select the
>> starting offset(at 4GiB or 512PiB default offset), pagesize and
>> window size for the non-default 64-bit DMA window. For example,
>> with VFIO, this is handled via VFIO_IOMMU_SPAPR_TCE_GET_INFO
>> and VFIO_IOMMU_SPAPR_TCE_CREATE|REMOVE ioctls. While I am exploring
>> the ways to expose and configure these DMA window attributes as
>> per user input, any suggestions in this regard will be very helpful.
> You can pass in driver specific information during HWPT creation, so
> any properties you need can be specified there.
Sure. I think IOMMU_GET_HW_INFO would be useful for getting the
platform-supported configuration in this case.
> Then you'd want to introduce a new domain op to get the apertures
> instead of the single range hard coded into the domain struct. The new
> op would be able to return a list. We can use this op to return
> apertures for sign extension page tables too.
>
> Update iommufd to calculate the reserved regions by evaluating the
> whole list.
>
> I think you'll find this pretty straight forward, I'd do it as a
> followup patch to this one.
Thanks. I will wait for that patch.
>
>> Currently existing vfio type1 specific vfio-compat driver even
>> with this patch will not work for PPC64. I believe we need to have
>> a separate "vfio-spapr-compat" driver to make it work.
> Yes, vfio-compat doesn't support the special spapr ioctls.
>
> I don't think you need a new driver, just implement whatever they do
> with the existing interfaces, probably in its own .c file though.
There are ioctl number conflicts between the iommufd cdev ioctls and the
SPAPR ones, for example:
# grep -n "VFIO_BASE + 1[89]" include/uapi/linux/vfio.h | grep define
940:#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 18)
976:#define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 19)
1833:#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
1856:#define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
# grep -n "VFIO_BASE + 20" include/uapi/linux/vfio.h | grep define
999:#define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20)
1870:#define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
> However, I have no idea what is required to implement those ops, or if
> it is even possible.. It may be easier to just leave the old vfio
> stuff around instead of trying to compat it. The purpose of compat was
> to be able to build kernels without type1 at all. It isn't necessary
> to start using iommufd in new apps with the new interfaces.
>
> Given you are mainly looking at a VMM that already will have iommufd
> support it may not be worthwhile.
You are right. We do have some use cases beyond the VMM; I will consider a
compat driver only if it is helpful there.
>> @@ -1201,7 +1201,15 @@ spapr_tce_blocked_iommu_attach_dev(struct iommu_domain *platform_domain,
>> * also sets the dma_api ops
>> */
>> table_group = iommu_group_get_iommudata(grp);
>> +
>> + if (old && old->type == IOMMU_DOMAIN_DMA) {
> I'm trying to delete IOMMU_DOMAIN_DMA please don't use it in
> drivers.
Sure.
>> static const struct iommu_ops spapr_tce_iommu_ops = {
>> .default_domain = &spapr_tce_platform_domain,
>> .blocked_domain = &spapr_tce_blocked_domain,
>> @@ -1267,6 +1436,14 @@ static const struct iommu_ops spapr_tce_iommu_ops = {
>> .probe_device = spapr_tce_iommu_probe_device,
>> .release_device = spapr_tce_iommu_release_device,
>> .device_group = spapr_tce_iommu_device_group,
>> + .domain_alloc_paging = spapr_tce_domain_alloc_paging,
>> + .default_domain_ops = &(const struct iommu_domain_ops) {
>> + .attach_dev = spapr_tce_iommu_attach_device,
>> + .map_pages = spapr_tce_iommu_map_pages,
>> + .unmap_pages = spapr_tce_iommu_unmap_pages,
>> + .iova_to_phys = spapr_tce_iommu_iova_to_phys,
>> + .free = spapr_tce_domain_free,
>> + }
> Please don't use default_domain_ops in a driver that is supporting
> multiple domain types and platform, it becomes confusing to guess
> which domain type those ops are linked to.
Sure.
> You should also implement the BLOCKING domain type to make VFIO work
> better
I am not sure how this would make VFIO work better. Maybe I am not seeing
the advantages while the current platform domain approach is in place.
Could you please elaborate on this?
> I wouldn't try to guess if this is right or not, but it looks pretty
> reasonable as a first start.
Thanks, I will keep iterating on this as an RFC until it gets into reasonable shape.
Regards,
Shivaprasad
> Jason