Message-ID: <0a0a576e-dd4d-402a-a0ea-43eede4e7cd8@arm.com>
Date: Mon, 13 Jan 2025 17:25:18 +0000
From: Robin Murphy <robin.murphy@....com>
To: Charan Teja Kalla <quic_charante@...cinc.com>,
 Will Deacon <will@...nel.org>
Cc: joro@...tes.org, jgg@...pe.ca, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] iommu: fix wrong DMA ops for iommu device

On 10/01/2025 1:29 pm, Charan Teja Kalla wrote:
[...]
>>> We need to fix 6.13 before we fix 6.6 unless you can show that 6.13 is
>>> unaffected (in which case, we can backport the fix(es)).
>>
>> Certainly the reasoning can't apply to mainline as given, since
>> arch_setup_dma_ops() stopped touching IOMMU stuff at all back in 6.10
>> (and indeed iommu_dma_ops itself no longer exists since 6.12).
>>
> 
> @Robin
> 
> Agreed that iommu_dma_ops no longer exists, but I believe the same race
> still does. Although dma_ops is not used, the decision to call into
> the IOMMU APIs is determined by the 'dev->dma_iommu' flag, which,
> again, is set after a domain is allocated for the device.
> 
> In the same race mentioned above,
> 1) S: dev->iommu_group is set, but the domain is not yet allocated.
> 2) C: Just returns as dev->iommu_group is filled.
> 3) C: Continues probing and succeeds.
> 4) C: Calls dma_alloc/map/...., but these won't enter the iommu_ APIs
> because 'dev->dma_iommu' is still 'false'.
> 5) S: Allocates the domain and sets 'dev->dma_iommu' to 'true'.
> 
> Step 4) above is the problematic one. The issue exists, but the window
> to trigger it seems very narrow to me. Please correct me if I'm wrong.

Hmm, yes, I guess there is a fundamental race where async client driver
probe can observe a partially initialised group while the IOMMU driver
itself is still running iommu_device_register()->bus_iommu_probe()...
And in that case, this patch is wrong for two reasons: firstly, bodging
the iommu_probe_device() call (which really should not exist at all)
does not cover all cases; and secondly, forcibly creating the default
domain in that path before we know bus_iommu_probe() has seen all other
existing devices means iommu_get_default_domain_type() may miss their
requirements and thus defeat the whole point of deferred allocation in
the first place.
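
For illustration only, the ordering in steps 1) to 5) quoted above can be
modelled in plain userspace C. This is a rough sketch, not kernel code:
the model_* structs and functions are hypothetical stand-ins invented for
this example, and the steps are run sequentially rather than on two
threads so that the problematic interleaving is deterministic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real kernel structures. */
struct model_group {
	void *default_domain;		/* NULL until the group is finalised */
};

struct model_dev {
	struct model_group *group;	/* models dev->iommu_group */
	bool dma_iommu;			/* models dev->dma_iommu */
};

/* Step 1 (S): the device gets its group, domain allocation is deferred. */
void model_supplier_add_to_group(struct model_dev *dev,
				 struct model_group *group)
{
	dev->group = group;
}

/* Step 2 (C): the client-side probe returns early once a group is set. */
bool model_client_probe_returns_early(const struct model_dev *dev)
{
	return dev->group != NULL;
}

/* Step 4 (C): DMA calls only take the IOMMU path if the flag is set. */
bool model_dma_uses_iommu(const struct model_dev *dev)
{
	return dev->dma_iommu;
}

/* Step 5 (S): the default domain is allocated and the flag is set. */
void model_supplier_finalise_group(struct model_dev *dev, void *domain)
{
	dev->group->default_domain = domain;
	dev->dma_iommu = true;
}
```

Calling these in the order 1, 2, 4, 5 shows the window: between
model_supplier_add_to_group() and model_supplier_finalise_group(), the
client sees a group but model_dma_uses_iommu() still returns false.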

Having looked a bit closer, I think a more robust solution for now is
probably as below.

Thanks,
Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d1af0547f553..8d90d196e38d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3120,6 +3120,11 @@ int iommu_device_use_default_domain(struct device *dev)
  		return 0;
  
  	mutex_lock(&group->mutex);
+	/* We may race against bus_iommu_probe() finalising groups here */
+	if (!group->default_domain) {
+		ret = -EPROBE_DEFER;
+		goto unlock_out;
+	}
  	if (group->owner_cnt) {
  		if (group->domain != group->default_domain || group->owner ||
  		    !xa_empty(&group->pasid_array)) {
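
The effect of the added check can be sketched in userspace C as follows.
This is a simplified model under stated assumptions, not the kernel
implementation: the sketch_* names are hypothetical, the locking done via
group->mutex in the real function is omitted, and the existing ownership
checks are reduced to a bare counter. Only the value 517 is taken from
the kernel's definition of EPROBE_DEFER.

```c
#include <stddef.h>

/* Same numeric value as the kernel's EPROBE_DEFER. */
#define SKETCH_EPROBE_DEFER 517

/* Hypothetical stand-in for the relevant parts of struct iommu_group. */
struct sketch_group {
	void *default_domain;	/* NULL while bus_iommu_probe() is in flight */
	unsigned int owner_cnt;
};

/*
 * Models the patched check: a group with no default domain yet makes the
 * caller defer instead of proceeding with a half-initialised group.
 * The real function does this with group->mutex held.
 */
int sketch_use_default_domain(struct sketch_group *group)
{
	if (!group)
		return 0;	/* no IOMMU group: nothing to claim */

	if (!group->default_domain)
		return -SKETCH_EPROBE_DEFER;

	group->owner_cnt++;	/* ownership checks elided in this model */
	return 0;
}
```

In this model, a client that probes too early gets -517 and would be
retried later by the driver core, by which point the group is expected to
have its default domain.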
