Message-ID: <20260108113846.56179-1-smostafa@google.com>
Date: Thu, 8 Jan 2026 11:38:46 +0000
From: Mostafa Saleh <smostafa@...gle.com>
To: iommu@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Cc: robin.murphy@....com, will@...nel.org, joro@...tes.org,
Mostafa Saleh <smostafa@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>
Subject: [PATCH] iommu/io-pgtable-arm: Drop DMA API usage for CMOs

As part of the KVM SMMUv3 series [1], we are trying to factor out
the kernel-specific code from io-pgtable-arm so that it can also be
compiled for the hypervisor.

Jason pointed out that the DMA-API calls are not really needed [2].

Looking more into this: initially, the io-pgtable API let drivers do
the CMOs themselves via tlb::flush_pgtable(), and drivers used the
DMA API (map/unmap_single) only to perform those CMOs, since the
low-level cache maintenance functions are not available to modules.
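
For context, the pattern being replaced looked roughly like the sketch
below (illustrative only, with a made-up callback name; the mapping was
created and torn down purely for its cache-maintenance side effect):

  /* Illustrative sketch only: roughly how drivers' flush_pgtable()
   * callbacks used the DMA API before [3]. The mapping itself was
   * never used; only the clean-to-PoC side effect mattered.
   */
  static void example_flush_pgtable(void *addr, size_t size, void *cookie)
  {
          struct device *dev = cookie;  /* hypothetical: driver's walker device */
          dma_addr_t dma;

          dma = dma_map_single(dev, addr, size, DMA_TO_DEVICE);
          if (dma_mapping_error(dev, dma))
                  return;
          dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
  }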

This was later moved into the core code [3], with the possibility of
converting it to full-blown DMA-API usage if a use case appeared.
However, no such use case has appeared, and we can simplify the code
slightly by doing the CMOs directly instead of going through the
DMA-API functions just for that.
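
This works because, for the direct (1:1) mapping the walker relies on,
the DMA-API sync path boils down to the same arch helper anyway; a
simplified sketch of the dma-direct path (SWIOTLB handling omitted,
helper name made up) is:

  /* Simplified sketch of what the dma-direct sync path does for a
   * non-bounced buffer (not the real code); with an identity dma-to-phys
   * translation this is just arch_sync_dma_for_device() on the PA.
   */
  static void sync_pgtable_for_device_sketch(struct device *dev, dma_addr_t addr,
                                             size_t size, enum dma_data_direction dir)
  {
          phys_addr_t paddr = dma_to_phys(dev, addr);

          if (!dev_is_dma_coherent(dev))
                  arch_sync_dma_for_device(paddr, size, dir);
  }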

Although HTTU is not used at the moment (so the SMMU never writes to
the page tables), keep the arch_sync_dma_for_cpu() call in
__arm_lpae_free_pages(), as this is not a hot path.

Removing the DMA API also removes some extra checks ensuring that the
IOMMU can deal with physical addresses directly. We could add such
checks at page-table creation time (something like
use_dma_iommu()/is_swiotlb_active()), but that seemed like overkill,
as Linux doesn't support such topologies anyway.
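
If such a check were ever wanted, a rough sketch (hypothetical helper,
not part of this patch) could be:

  /* Hypothetical sketch only: refuse a non-coherent walk if DMA for the
   * IOMMU's own device would be translated or bounced, since the walker
   * now consumes raw physical addresses.
   */
  static int arm_lpae_check_direct_walk(struct io_pgtable_cfg *cfg)
  {
          struct device *dev = cfg->iommu_dev;

          if (cfg->coherent_walk)
                  return 0;
          if (use_dma_iommu(dev) || is_swiotlb_active(dev))
                  return -EINVAL;
          return 0;
  }
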
[1] https://lore.kernel.org/linux-iommu/20251117184815.1027271-1-smostafa@google.com/
[2] https://lore.kernel.org/linux-iommu/20251216005834.GC31492@ziepe.ca/
[3] https://lore.kernel.org/linux-iommu/7c5584d3efa61ee6b0b87efb72f24f32852aafb7.1438195011.git.robin.murphy@arm.com/
Suggested-by: Jason Gunthorpe <jgg@...pe.ca>
Signed-off-by: Mostafa Saleh <smostafa@...gle.com>
---
drivers/iommu/io-pgtable-arm.c | 39 +++++++---------------------------
1 file changed, 8 insertions(+), 31 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index e6626004b323..0da9195155ec 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -15,7 +15,7 @@
#include <linux/sizes.h>
#include <linux/slab.h>
#include <linux/types.h>
-#include <linux/dma-mapping.h>
+#include <linux/dma-map-ops.h>
#include <asm/barrier.h>
@@ -254,7 +254,6 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
{
struct device *dev = cfg->iommu_dev;
size_t alloc_size;
- dma_addr_t dma;
void *pages;
/*
@@ -271,32 +270,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
if (!pages)
return NULL;
- if (!cfg->coherent_walk) {
- dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
- if (dma_mapping_error(dev, dma))
- goto out_free;
- /*
- * We depend on the IOMMU being able to work with any physical
- * address directly, so if the DMA layer suggests otherwise by
- * translating or truncating them, that bodes very badly...
- */
- if (dma != virt_to_phys(pages))
- goto out_unmap;
- }
-
+ if (!cfg->coherent_walk)
+ arch_sync_dma_for_device(__arm_lpae_dma_addr(pages), size,
+ DMA_TO_DEVICE);
return pages;
-
-out_unmap:
- dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
- dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
-
-out_free:
- if (cfg->free)
- cfg->free(cookie, pages, size);
- else
- iommu_free_pages(pages);
-
- return NULL;
}
static void __arm_lpae_free_pages(void *pages, size_t size,
@@ -304,8 +281,8 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
void *cookie)
{
if (!cfg->coherent_walk)
- dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
- size, DMA_TO_DEVICE);
+ arch_sync_dma_for_cpu(__arm_lpae_dma_addr(pages),
+ size, DMA_TO_DEVICE);
if (cfg->free)
cfg->free(cookie, pages, size);
@@ -316,8 +293,8 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries,
struct io_pgtable_cfg *cfg)
{
- dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
- sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
+ arch_sync_dma_for_device(__arm_lpae_dma_addr(ptep),
+ sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
}
static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg, int num_entries)
--
2.52.0.351.gbe84eed79e-goog