Open Source and information security mailing list archives
 
Message-ID: <56c39e62-0417-495d-a40e-99180a107777@arm.com>
Date: Thu, 8 Jan 2026 12:52:13 +0000
From: Robin Murphy <robin.murphy@....com>
To: Mostafa Saleh <smostafa@...gle.com>, iommu@...ts.linux.dev,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc: will@...nel.org, joro@...tes.org, Jason Gunthorpe <jgg@...pe.ca>
Subject: Re: [PATCH] iommu/io-pgtable-arm: Drop DMA API usage for CMOs

On 2026-01-08 11:38 am, Mostafa Saleh wrote:
> As part of the KVM SMMUv3 series[1], we are trying to factor out
> the kernel specific code from io-pgtable-arm so it can also compile
> for the hypervisor.
> 
> Jason pointed out that the DMA-API calls are not really needed [2].
> 
> Looking more into this: initially, the io-pgtable API had drivers
> do the CMOs themselves via tlb::flush_pgtable(), where they used the
> DMA API (dma_map/unmap_single()) purely to perform CMOs, as the
> low-level cache maintenance functions aren't available to modules.
> 
> This was later moved to the core code [3], with possibility to
> convert it to full blown DMA-API code if there was a use case.

I don't understand that sentence at all.

> However, no such use case appeared, and we can simplify the code
> slightly by doing CMO directly instead of going through the DMA-API
> functions just for CMOs.
> 
> Although HTTU is not used at the moment, leave the
> arch_sync_dma_for_cpu() in __arm_lpae_free_pages() as this is not a
> hot path.
> 
> Removing the DMA API also removes some extra checks ensuring that
> the IOMMU can deal with physical addresses directly. We could add such
> checks at page table creation time (something like use_dma_iommu()/
> is_swiotlb_active()), but that seemed like overkill as Linux doesn't
> support such topologies anyway.

NAK - io-pgtable is regular driver code doing what is absolutely a DMA 
mapping operation. How would Panfrost calling arch_sync_* signal to hyp 
that the page is now being used for DMA so needs mapping/pinning at 
stage 2? We can't even say that that doesn't apply to "real" IOMMU 
drivers, since there do unfortunately exist cases where an entire "stage 
1" SMMUv3 is upstream of a "stage 2" SMMUv3 such that it's only workable 
if the former is assigned to EL1 as just another black-box device, while 
only EL2 knows about the latter.

If "pagetable creation time" means alloc_io_pgtable_ops() as I've taken 
it to, how do you specifically propose to check that pages which haven't 
been allocated yet will definitely be valid for DMA in future? Also note 
that the only reason for the virt_to_phys() check (which *has* often 
caught genuine driver/platform configuration issues to a degree that I 
would be very uncomfortable removing it) is basically laziness, in that 
nobody's got round to doing the work to be able to keep track of kernel 
addresses separately from the DMA addresses in the tables themselves, 
and thus allow visibly-remapped pagetables - every now and then there 
arises a situation where that would be really useful, so it is still on 
my wishlist.

Frankly if KVM is going to have increasingly divergent requirements and 
assumptions about pagetables, then it should have its own pagetable code 
like it does for the CPU.

Thanks,
Robin.

> [1] https://lore.kernel.org/linux-iommu/20251117184815.1027271-1-smostafa@google.com/
> [2] https://lore.kernel.org/linux-iommu/20251216005834.GC31492@ziepe.ca/
> [3] https://lore.kernel.org/linux-iommu/7c5584d3efa61ee6b0b87efb72f24f32852aafb7.1438195011.git.robin.murphy@arm.com/
> 
> Suggested-by: Jason Gunthorpe <jgg@...pe.ca>
> Signed-off-by: Mostafa Saleh <smostafa@...gle.com>
> ---
>   drivers/iommu/io-pgtable-arm.c | 39 +++++++---------------------------
>   1 file changed, 8 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index e6626004b323..0da9195155ec 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -15,7 +15,7 @@
>   #include <linux/sizes.h>
>   #include <linux/slab.h>
>   #include <linux/types.h>
> -#include <linux/dma-mapping.h>
> +#include <linux/dma-map-ops.h>
>   
>   #include <asm/barrier.h>
>   
> @@ -254,7 +254,6 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>   {
>   	struct device *dev = cfg->iommu_dev;
>   	size_t alloc_size;
> -	dma_addr_t dma;
>   	void *pages;
>   
>   	/*
> @@ -271,32 +270,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>   	if (!pages)
>   		return NULL;
>   
> -	if (!cfg->coherent_walk) {
> -		dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
> -		if (dma_mapping_error(dev, dma))
> -			goto out_free;
> -		/*
> -		 * We depend on the IOMMU being able to work with any physical
> -		 * address directly, so if the DMA layer suggests otherwise by
> -		 * translating or truncating them, that bodes very badly...
> -		 */
> -		if (dma != virt_to_phys(pages))
> -			goto out_unmap;
> -	}
> -
> +	if (!cfg->coherent_walk)
> +		arch_sync_dma_for_device(__arm_lpae_dma_addr(pages), size,
> +					 DMA_TO_DEVICE);
>   	return pages;
> -
> -out_unmap:
> -	dev_err(dev, "Cannot accommodate DMA translation for IOMMU page tables\n");
> -	dma_unmap_single(dev, dma, size, DMA_TO_DEVICE);
> -
> -out_free:
> -	if (cfg->free)
> -		cfg->free(cookie, pages, size);
> -	else
> -		iommu_free_pages(pages);
> -
> -	return NULL;
>   }
>   
>   static void __arm_lpae_free_pages(void *pages, size_t size,
> @@ -304,8 +281,8 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
>   				  void *cookie)
>   {
>   	if (!cfg->coherent_walk)
> -		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
> -				 size, DMA_TO_DEVICE);
> +		arch_sync_dma_for_cpu(__arm_lpae_dma_addr(pages),
> +				      size, DMA_TO_DEVICE);
>   
>   	if (cfg->free)
>   		cfg->free(cookie, pages, size);
> @@ -316,8 +293,8 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
>   static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries,
>   				struct io_pgtable_cfg *cfg)
>   {
> -	dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
> -				   sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
> +	arch_sync_dma_for_device(__arm_lpae_dma_addr(ptep),
> +				 sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
>   }
>   
>   static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg *cfg, int num_entries)

