Message-ID: <aA1iRtCsPkuprI-X@bombadil.infradead.org>
Date: Sat, 26 Apr 2025 15:46:30 -0700
From: Luis Chamberlain <mcgrof@...nel.org>
To: Leon Romanovsky <leon@...nel.org>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Keith Busch <kbusch@...nel.org>,
Leon Romanovsky <leonro@...dia.com>, Jake Edge <jake@....net>,
Jonathan Corbet <corbet@....net>, Jason Gunthorpe <jgg@...pe.ca>,
Zhu Yanjun <zyjzyj2000@...il.com>,
Robin Murphy <robin.murphy@....com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Sagi Grimberg <sagi@...mberg.me>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Logan Gunthorpe <logang@...tatee.com>,
Yishai Hadas <yishaih@...dia.com>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Kevin Tian <kevin.tian@...el.com>,
Alex Williamson <alex.williamson@...hat.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, linux-rdma@...r.kernel.org,
iommu@...ts.linux.dev, linux-nvme@...ts.infradead.org,
linux-pci@...r.kernel.org, kvm@...r.kernel.org, linux-mm@...ck.org,
Niklas Schnelle <schnelle@...ux.ibm.com>,
Chuck Lever <chuck.lever@...cle.com>,
Matthew Wilcox <willy@...radead.org>,
Dan Williams <dan.j.williams@...el.com>,
Kanchan Joshi <joshi.k@...sung.com>,
Chaitanya Kulkarni <kch@...dia.com>
Subject: Re: [PATCH v9 07/24] dma-mapping: Implement link/unlink ranges API
On Wed, Apr 23, 2025 at 11:12:58AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@...dia.com>
>
> Introduce new DMA APIs to perform DMA linkage of buffers
> in layers higher than DMA.
>
> In the proposed API, the callers will perform the following steps.
> In the map path:
>	if (dma_can_use_iova(...))
>	    dma_iova_alloc()
>	    for (page in range)
>	       dma_iova_link_next(...)
>	    dma_iova_sync(...)
>	else
>	    /* Fallback to legacy map pages */
>	    for (all pages)
>	       dma_map_page(...)
>
> In the unmap path:
>	if (dma_can_use_iova(...))
>	    dma_iova_destroy()
>	else
>	    for (all pages)
>	       dma_unmap_page(...)
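
To make the calling convention concrete, here is a rough, untested
sketch of the map path as I read the pseudocode above. dma_iova_link()
uses the signature from the patch below; the dma_can_use_iova(),
dma_iova_alloc() and dma_iova_sync() signatures are my guesses from the
pseudocode, and error unwinding on a failed link is elided:

/*
 * Sketch only, not the patch's code: per-page PAGE_SIZE chunks, the
 * guessed signatures, and the missing error unwinding are assumptions
 * for illustration.
 */
static int example_map_pages(struct device *dev,
			     struct dma_iova_state *state,
			     struct page **pages, int nr_pages,
			     enum dma_data_direction dir)
{
	size_t offset = 0;
	int i, ret;

	if (!dma_can_use_iova(state)) {
		/* Fallback to legacy per-page mapping (error checks elided) */
		for (i = 0; i < nr_pages; i++)
			dma_map_page(dev, pages[i], 0, PAGE_SIZE, dir);
		return 0;
	}

	ret = dma_iova_alloc(dev, state, (size_t)nr_pages * PAGE_SIZE);
	if (ret)
		return ret;

	/* Link all pages into contiguous IOVA space without IOTLB syncs */
	for (i = 0; i < nr_pages; i++) {
		ret = dma_iova_link(dev, state, page_to_phys(pages[i]),
				    offset, PAGE_SIZE, dir, 0);
		if (ret)
			return ret;
		offset += PAGE_SIZE;
	}

	/* One IOTLB sync for the whole linked range, not per page */
	return dma_iova_sync(dev, state, 0, offset);
}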
>
> Reviewed-by: Christoph Hellwig <hch@....de>
> Tested-by: Jens Axboe <axboe@...nel.dk>
> Signed-off-by: Leon Romanovsky <leonro@...dia.com>
> ---
>  drivers/iommu/dma-iommu.c   | 261 ++++++++++++++++++++++++++++++++++++
>  include/linux/dma-mapping.h |  32 +++++
>  2 files changed, 293 insertions(+)
>
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index d2c298083e0a..2e014db5a244 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1818,6 +1818,267 @@ void dma_iova_free(struct device *dev, struct dma_iova_state *state)
> }
> EXPORT_SYMBOL_GPL(dma_iova_free);
>
> +static int __dma_iova_link(struct device *dev, dma_addr_t addr,
> +		phys_addr_t phys, size_t size, enum dma_data_direction dir,
> +		unsigned long attrs)
> +{
> +	bool coherent = dev_is_dma_coherent(dev);
> +
> +	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> +		arch_sync_dma_for_device(phys, size, dir);

arch_sync_dma_for_device() is a no-op on some architectures, notably
x86. Since you're doing this work anyway, and given the above pattern
is also common in the non-IOVA case, we could save ourselves two branch
checks on x86 in __dma_iova_link() and get the same savings in the
non-IOVA paths as well. In the non-IOVA case there are two users: one
that honors attrs on the initial mapping, and one without attrs for the
subsequent sync ops. In the IOVA case the attrs check is always used.
So we could just have something like this:

#ifdef CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE
static inline void arch_sync_dma_device(struct device *dev,
					phys_addr_t paddr, size_t size,
					enum dma_data_direction dir)
{
	if (!dev_is_dma_coherent(dev))
		arch_sync_dma_for_device(paddr, size, dir);
}

static inline void arch_sync_dma_device_attrs(struct device *dev,
					      phys_addr_t paddr, size_t size,
					      enum dma_data_direction dir,
					      unsigned long attrs)
{
	if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		arch_sync_dma_for_device(paddr, size, dir);
}
#else
static inline void arch_sync_dma_device(struct device *dev,
					phys_addr_t paddr, size_t size,
					enum dma_data_direction dir)
{
}

static inline void arch_sync_dma_device_attrs(struct device *dev,
					      phys_addr_t paddr, size_t size,
					      enum dma_data_direction dir,
					      unsigned long attrs)
{
}
#endif
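
With those helpers in place, __dma_iova_link() above would collapse to
something like this (untested sketch, same names as above):

static int __dma_iova_link(struct device *dev, dma_addr_t addr,
		phys_addr_t phys, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	/* Compiles away when !CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE, e.g. x86 */
	arch_sync_dma_device_attrs(dev, phys, size, dir, attrs);
	...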
> +/**
> + * dma_iova_link - Link a range of IOVA space
> + * @dev: DMA device
> + * @state: IOVA state
> + * @phys: physical address to link
> + * @offset: offset into the IOVA state to map into
> + * @size: size of the buffer
> + * @dir: DMA direction
> + * @attrs: attributes of mapping properties
> + *
> + * Link a range of IOVA space for the given IOVA state without IOTLB sync.
> + * This function is used to link multiple physical addresses in contiguous
> + * IOVA space without performing costly IOTLB sync.
> + *
> + * The caller is responsible for calling dma_iova_sync() to sync the IOTLB
> + * at the end of linkage.
> + */
> +int dma_iova_link(struct device *dev, struct dma_iova_state *state,
> +		phys_addr_t phys, size_t offset, size_t size,
> +		enum dma_data_direction dir, unsigned long attrs)
> +{
> +	struct iommu_domain *domain = iommu_get_dma_domain(dev);
> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +	struct iova_domain *iovad = &cookie->iovad;
> +	size_t iova_start_pad = iova_offset(iovad, phys);
> +
> +	if (WARN_ON_ONCE(iova_start_pad && offset > 0))
> +		return -EIO;
> +
> +	if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))

There is already a similar check for the non-IOVA case in
iommu_dma_map_page(), along with a nice comment explaining why it is
checked; this seems to be just screaming for a helper:

/*
 * Checks if a physical buffer has unaligned boundaries with respect to
 * the IOMMU granule. Returns non-zero if either the start or end
 * address is not aligned to the granule boundary.
 */
static inline size_t iova_unaligned(struct iova_domain *iovad,
				    phys_addr_t phys, size_t size)
{
	return iova_offset(iovad, phys | size);
}
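
The check above would then read (untested):

	if (dev_use_swiotlb(dev, size, dir) &&
	    iova_unaligned(iovad, phys, size))
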
Other than that, looks good.
Reviewed-by: Luis Chamberlain <mcgrof@...nel.org>
Luis