Message-ID: <20260119171307.GJ1134360@nvidia.com>
Date: Mon, 19 Jan 2026 13:13:07 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
Cc: nicolinc@...dia.com, linux-kernel@...r.kernel.org, robin.murphy@....com,
will@...nel.org, joro@...tes.org, kevin.tian@...el.com,
jsnitsel@...hat.com, vasant.hegde@....com, iommu@...ts.linux.dev,
santosh.shukla@....com, sairaj.arunkodilkar@....com,
jon.grimm@....com, prashanthpra@...gle.com, wvw@...gle.com,
wnliu@...gle.com, gptran@...gle.com, kpsingh@...gle.com,
joao.m.martins@...cle.com, alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v6 10/13] iommu/amd: Introduce gDomID-to-hDomID Mapping
and handle parent domain invalidation
On Thu, Jan 15, 2026 at 06:08:11AM +0000, Suravee Suthikulpanit wrote:
> +static int iommu_flush_pages_v1_hdom_ids(struct protection_domain *pdom, u64 address, size_t size)
> +{
> + int ret = 0;
> + struct amd_iommu_viommu *aviommu;
> +
> + list_for_each_entry(aviommu, &pdom->viommu_list, pdom_list) {
> + unsigned long i;
You should have a lockdep assertion here for this list iteration.
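Something like the below at the top of the function, assuming pdom->lock
is what protects viommu_list in this series (adjust if a different lock
actually covers the list):

	/* viommu_list must not change while we walk it */
	lockdep_assert_held(&pdom->lock);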
> +static void *gdom_info_load_or_alloc_locked(struct xarray *xa, unsigned long index)
> +{
> + struct guest_domain_mapping_info *elm, *res;
> +
> + elm = xa_load(xa, index);
> + if (elm)
> + return elm;
> +
> + xa_unlock(xa);
> + elm = kzalloc(sizeof(struct guest_domain_mapping_info), GFP_KERNEL);
> + xa_lock(xa);
> + if (!elm)
> + return ERR_PTR(-ENOMEM);
> +
> + res = __xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL);
> + if (xa_is_err(res))
> + res = ERR_PTR(xa_err(res));
> +
> + if (res) {
> + kfree(elm);
> + return res;
> + }
> +
> + refcount_set(&elm->users, 0);
> + return elm;
> +}
> +
> /*
> * This function is assigned to struct iommufd_viommu_ops.alloc_domain_nested()
> * during the call to struct iommu_ops.viommu_init().
> @@ -68,6 +96,7 @@ amd_iommu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
> {
> int ret;
> struct nested_domain *ndom;
> + struct guest_domain_mapping_info *gdom_info;
> struct amd_iommu_viommu *aviommu = container_of(viommu, struct amd_iommu_viommu, core);
>
> if (user_data->type != IOMMU_HWPT_DATA_AMD_GUEST)
> @@ -92,7 +121,63 @@ amd_iommu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
> ndom->domain.type = IOMMU_DOMAIN_NESTED;
> ndom->viommu = aviommu;
>
> + /*
> + * Normally, when a guest has multiple pass-through devices,
> + * the IOMMU driver sets up DTEs with the same stage-2 table and
> + * uses the same host domain ID (hDomID). With nested translation,
> + * if the guest sets up different stage-1 tables with the same
> + * PASID, the IOMMU would use the same TLB tag, which results in
> + * TLB aliasing.
> + *
> + * The guest is assigning gDomIDs based on its own algorithm for managing
> + * cache tags of (DomID, PASID). Within a single viommu, the nest parent domain
> + * (w/ S2 table) is used by all DTEs. But we need to consistently map the gDomID
> + * to a single hDomID. This is done using an xarray in the vIOMMU to
> + * keep track of the gDomID mapping. When the S2 is changed, the INVALIDATE_IOMMU_PAGES
> + * command must be issued for each hDomID in the xarray.
> + */
> + xa_lock(&aviommu->gdomid_array);
> +
> + gdom_info = gdom_info_load_or_alloc_locked(&aviommu->gdomid_array, ndom->gdom_id);
> + if (IS_ERR(gdom_info)) {
> + xa_unlock(&aviommu->gdomid_array);
> + ret = PTR_ERR(gdom_info);
> + goto out_err;
> + }
> +
> + /* Check if the gDomID already exists */
> + if (refcount_inc_not_zero(&gdom_info->users)) {
> + ndom->gdom_info = gdom_info;
> + xa_unlock(&aviommu->gdomid_array);
This is pretty tortured. The alloc flow inside
gdom_info_load_or_alloc_locked() should do the
amd_iommu_pdom_id_alloc() and set the refcount to 1 before installing
the entry in the xarray; then you don't need any of this here.
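Something along these lines (untested sketch; I'm assuming
amd_iommu_pdom_id_free() exists as the unwind counterpart, and that
teardown erases the entry from the xarray under the xa_lock before the
last reference is put, so a plain refcount_inc() on a looked up entry
is OK):

	static struct guest_domain_mapping_info *
	gdom_info_load_or_alloc_locked(struct xarray *xa, unsigned long index)
	{
		struct guest_domain_mapping_info *elm, *res;

		elm = xa_load(xa, index);
		if (elm) {
			refcount_inc(&elm->users);
			return elm;
		}

		xa_unlock(xa);
		elm = kzalloc(sizeof(*elm), GFP_KERNEL);
		if (elm) {
			/*
			 * Outside the xa_lock now, so this does not need an
			 * atomic allocation. Failure handling for
			 * amd_iommu_pdom_id_alloc() is elided, follow
			 * whatever convention it uses in this series.
			 */
			elm->hdom_id = amd_iommu_pdom_id_alloc();
			refcount_set(&elm->users, 1);
		}
		xa_lock(xa);
		if (!elm)
			return ERR_PTR(-ENOMEM);

		res = __xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL);
		if (res) {
			/* Lost a race or hit an xarray error, undo our allocation */
			amd_iommu_pdom_id_free(elm->hdom_id);
			kfree(elm);
			if (xa_is_err(res))
				return ERR_PTR(xa_err(res));
			refcount_inc(&res->users);
			return res;
		}
		return elm;
	}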
> + /* The gDomID does not exist; allocate a new hdom_id */
> + gdom_info->hdom_id = amd_iommu_pdom_id_alloc();
Then this allocation wouldn't have to be ATOMIC.
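With the refcount and hdom_id handled inside the helper, the caller
side collapses to something like:

	xa_lock(&aviommu->gdomid_array);
	gdom_info = gdom_info_load_or_alloc_locked(&aviommu->gdomid_array,
						   ndom->gdom_id);
	xa_unlock(&aviommu->gdomid_array);
	if (IS_ERR(gdom_info)) {
		ret = PTR_ERR(gdom_info);
		goto out_err;
	}
	ndom->gdom_info = gdom_info;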
But it looks to be working the way it is, so no rush.
Jason