[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aN7bDNTAadHI/+qn@Asurada-Nvidia>
Date: Thu, 2 Oct 2025 13:05:32 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
CC: <jgg@...dia.com>, <linux-kernel@...r.kernel.org>, <robin.murphy@....com>,
<will@...nel.org>, <joro@...tes.org>, <kevin.tian@...el.com>,
<jsnitsel@...hat.com>, <vasant.hegde@....com>, <iommu@...ts.linux.dev>,
<santosh.shukla@....com>, <sairaj.arunkodilkar@....com>, <jon.grimm@....com>,
<prashanthpra@...gle.com>, <wvw@...gle.com>, <wnliu@...gle.com>,
<gptran@...gle.com>, <kpsingh@...gle.com>, <joao.m.martins@...cle.com>,
<alejandro.j.jimenez@...cle.com>
Subject: Re: [PATCH v2 12/12] iommu/amd: Introduce IOMMUFD vIOMMU support for
AMD
On Wed, Oct 01, 2025 at 06:09:54AM +0000, Suravee Suthikulpanit wrote:
> Introduce struct amd_iommu_vminfo to store AMD HW-vIOMMU per-IOMMU data,
> which is initialized when calling struct iommu_ops.viommu_init().
>
> Currently, the struct amd_iommu_vminfo and amd_iommu_viommu_init() contain
> base code to support nested domain allocation for vIOMMU using the struct
> iommufd_viommu_ops.alloc_domain_nested.
>
> Additional initialization will be added in subsequent patches.
This is the last patch in the series. You mean some future patch?
And pls briefly elaborate what is the additional initialization for.
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index c1abb06126c1..e3503091cd65 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3063,6 +3063,61 @@ static const struct iommu_dirty_ops amd_dirty_ops = {
> .read_and_clear_dirty = amd_iommu_read_and_clear_dirty,
> };
>
> +static size_t amd_iommu_get_viommu_size(struct device *dev, enum iommu_viommu_type viommu_type)
> +{
> + if (viommu_type != IOMMU_VIOMMU_TYPE_AMD)
> + return 0;
> +
> + return VIOMMU_STRUCT_SIZE(struct amd_iommu_vminfo, core);
> +}
> +
> +/*
> + * This is called from the drivers/iommu/iommufd/viommu.c: iommufd_viommu_alloc_ioctl
> + */
> +static int amd_iommu_viommu_init(struct iommufd_viommu *viommu,
> +static int amd_iommu_viommu_init(struct iommufd_viommu *viommu,
> + struct iommu_domain *parent,
> + const struct iommu_user_data *user_data)
> +{
> +#if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
These two should be put in the iommufd.c file, and the header
should define the followings (next to amd_iommufd_hw_info):
#if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
void *amd_iommufd_hw_info(struct device *dev, u32 *length, u32 *type);
+size_t amd_iommu_get_viommu_size(struct device *dev,
+ enum iommu_viommu_type viommu_type);
+int amd_iommu_viommu_init(struct iommufd_viommu *viommu,
+ struct iommu_domain *parent,
+ const struct iommu_user_data *user_data);
else
#define amd_iommufd_hw_info NULL
+#define amd_iommu_get_viommu_size NULL
+#define amd_iommu_viommu_init NULL
#endif
The core would return -EOPNOTSUPP if either of them is NULL.
> + if (ret)
> + return ret;
> +
> + iommu = get_amd_iommu_from_devid(data.iommu_devid);
> + if (!iommu)
> + return -ENODEV;
> +
> + vminfo->iommu_devid = data.iommu_devid;
If you want the struct amd_iommu pointer, do this instead:
iommu = container_of(viommu->iommu_dev, struct amd_iommu, iommu);
Then iommu_devid and get_amd_iommu_from_devid() aren't used anywhere
else. So both could be dropped.
> +/*
> + * See include/linux/iommufd.h
> + * struct iommufd_viommu_ops - vIOMMU specific operations
> + */
> +const struct iommufd_viommu_ops amd_viommu_ops = {
> + .alloc_domain_nested = amd_iommu_alloc_domain_nested,
> +};
Unfortunately, a viommu_ops with alloc_domain_nested is incomplete,
IMHO. If this series gets merged alone, it declares that the kernel
now supports AMD IOMMU's virtualization, which actually won't work
without a cache invalidation op (SW) or hw_queue (HW-acceleration).
> diff --git a/drivers/iommu/amd/iommufd.h b/drivers/iommu/amd/iommufd.h
> index f880be80a30d..0d59ef160780 100644
> --- a/drivers/iommu/amd/iommufd.h
> +++ b/drivers/iommu/amd/iommufd.h
> @@ -7,6 +7,8 @@
> #define AMD_IOMMUFD_H
>
> #if IS_ENABLED(CONFIG_AMD_IOMMU_IOMMUFD)
> +extern const struct iommufd_viommu_ops amd_viommu_ops;
You don't need this if defining the amd_iommu_get_viommu_size and
amd_iommu_viommu_init here.
> +/**
> + * struct iommu_viommu_amd - AMD vIOMMU Interface (IOMMU_VIOMMU_TYPE_AMD)
> + * @iommu_devid: Host IOMMU PCI device ID
Though I don't think you need this, just curious, how does user
space know the host PCI device ID?
> + * @viommu_devid: Guest vIOMMU PCI device ID
This isn't used in the patch, so I am not sure. But it sounds like
a vDEVICE virt_id to me. Though it probably works for the AMD case
because AMD IOMMU is per PCI device (IIRC), I think it'd be better
to define a vDEVICE forwarding this ID. I don't know, but for CC,
AMD might end up with a vDEVICE object anyway?
> + * @trans_devid: GPA->GVA translation device ID (host)
This is unclear to me either, and it's not being used, so I cannot
tell what it is for. But it feels like a part of vDEVICE object..
> + * @out_gid: (out) Guest ID
Again, not being used, needs some elaboration.
> + * @out_vfmmio_mmap_offset: (out) mmap offset for vIOMMU VF-MMIO
"(out)" is redudant.
@out_vfmmio_mmap_offset: mmap offset argument for vIOMMU VF-MMIO
And you could have an @out_vfmmio_mmap_length too, even if it is
just for validation in the iommufd core.
> + */
> +struct iommu_viommu_amd {
> + __u32 iommu_devid;
> + __u32 viommu_devid;
> + __u32 trans_devid;
> + __u32 out_gid;
> + __aligned_u64 out_vfmmio_mmap_offset;
> + __u32 reserved; /* must be last */
"reserved" is useless here.
Usually we add reserved field for paddings, and we usually call
it "__reserved".
struct iommu_viommu_amd is perfectly 64-bit aligned without the
last "reserved" field. I would drop it.
Nicolin
Powered by blists - more mailing lists