Message-ID: <20250115062628.GA29782@lst.de>
Date: Wed, 15 Jan 2025 07:26:28 +0100
From: Christoph Hellwig <hch@....de>
To: Robin Murphy <robin.murphy@....com>
Cc: Leon Romanovsky <leon@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Jason Gunthorpe <jgg@...pe.ca>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
Leon Romanovsky <leonro@...dia.com>,
Keith Busch <kbusch@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Logan Gunthorpe <logang@...tatee.com>,
Yishai Hadas <yishaih@...dia.com>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Kevin Tian <kevin.tian@...el.com>,
Alex Williamson <alex.williamson@...hat.com>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
linux-rdma@...r.kernel.org, iommu@...ts.linux.dev,
linux-nvme@...ts.infradead.org, linux-pci@...r.kernel.org,
kvm@...r.kernel.org, linux-mm@...ck.org,
Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH v5 07/17] dma-mapping: Implement link/unlink ranges API
On Tue, Jan 14, 2025 at 08:50:35PM +0000, Robin Murphy wrote:
>> EXPORT_SYMBOL_GPL(dma_iova_free);
>> +static int __dma_iova_link(struct device *dev, dma_addr_t addr,
>> + phys_addr_t phys, size_t size, enum dma_data_direction dir,
>> + unsigned long attrs)
>> +{
>> + bool coherent = dev_is_dma_coherent(dev);
>> +
>> + if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> + arch_sync_dma_for_device(phys, size, dir);
>
> Again, if we're going to pretend to support non-coherent devices, where are
> the dma_sync_for_{device,cpu} calls that work for a dma_iova_state? It
> can't be the existing dma_sync_single ops since that would require the user
> to keep track of every mapping to sync them individually, and the whole
> premise is to avoid doing that (not to mention dma-debug wouldn't like it).
> Same for anything coherent but SWIOTLB-bounced.
That assumes you actually need to sync them. Many if not most DMA
mappings are one-shots - map and unmap, no sync - and those will work
fine here.
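
To make that concrete, the one-shot pattern I have in mind looks
roughly like this - untested sketch, and I'm going from memory on the
helper names and signatures (dma_iova_try_alloc/link/sync/destroy and
state.addr), so don't hold me to the exact v5 spelling:

	struct dma_iova_state state = {};
	int ret;

	if (!dma_iova_try_alloc(dev, &state, phys, size))
		return -EIO;	/* fall back to dma_map_page() instead */

	ret = dma_iova_link(dev, &state, phys, 0, size, DMA_TO_DEVICE, 0);
	if (ret) {
		dma_iova_free(dev, &state);
		return ret;
	}
	ret = dma_iova_sync(dev, &state, 0, size);
	if (ret) {
		dma_iova_destroy(dev, &state, size, DMA_TO_DEVICE, 0);
		return ret;
	}

	/* ... program state.addr into the hardware and wait for the I/O ... */

	/* one-shot: tear everything down, no separate sync calls needed */
	dma_iova_destroy(dev, &state, size, DMA_TO_DEVICE, 0);
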
But I guess the documentation needs to spell that out. While I don't
have a good non-coherent system to test on, swiotlb was actually tested
with nvme when I implemented this part.
>> +{
>> + struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> + struct iova_domain *iovad = &cookie->iovad;
>> + size_t iova_start_pad = iova_offset(iovad, phys);
>> + size_t iova_end_pad = iova_offset(iovad, phys + size);
>
> "end_pad" implies a length of padding from the unaligned end address to
> reach the *next* granule boundary, but it seems this is actually the
> unaligned tail length of the data itself. That's what confused me last
> time, since in the map path that post-data padding region does matter in
> its own right.
Yeah. Do you have a suggestion for a better name?
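
To make sure we're talking about the same two quantities, with made-up
numbers and assuming a 4k granule:

	/*
	 * mapping ends at phys + size == 0x...1200, granule == 0x1000:
	 *
	 *	iova_offset(iovad, phys + size)  == 0x200
	 *		-> length of the data tail inside the last granule,
	 *		   which is what gets stored in iova_end_pad
	 *	iovad->granule - 0x200           == 0xe00
	 *		-> padding from the end of the data up to the next
	 *		   granule boundary, which is what the name suggests
	 */
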
>> + phys_addr_t phys, size_t offset, size_t size,
>> + enum dma_data_direction dir, unsigned long attrs)
>> +{
>> + struct iommu_domain *domain = iommu_get_dma_domain(dev);
>> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> + struct iova_domain *iovad = &cookie->iovad;
>> + size_t iova_start_pad = iova_offset(iovad, phys);
>> +
>> + if (WARN_ON_ONCE(iova_start_pad && offset > 0))
>
> "iova_start_pad == 0" still doesn't guarantee that "phys" and "offset" are
> appropriately aligned to each other.
>> + if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))
>
> Again, why are we supporting non-granule-aligned mappings in the middle of
> a range when the documentation explicitly says not to?
It's not trying to support that, but checking that a segment like this
really is the last one is harder than just handling it this way. If you
have a suggestion for a better check that would be very welcome.
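
FWIW, what the check above boils down to - just restating the existing
code, not defending its readability:

	/*
	 * iova_offset(iovad, x) is x & (iovad->granule - 1), so
	 *
	 *	iova_offset(iovad, phys | size) != 0
	 *
	 * catches any segment where the start address and/or the length
	 * is not granule aligned, i.e. anything that can only validly be
	 * the unaligned head or tail of the overall range, and sends it
	 * down the swiotlb bounce path instead of mapping it directly.
	 */
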
>> + if (!dev_is_dma_coherent(dev) &&
>> + !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
>> + arch_sync_dma_for_cpu(phys, len, dir);
>
> Hmm, how do attrs even work for a bulk unlink/destroy when the individual
> mappings could have been linked with different values?
They shouldn't. Just like randomly mixing flags doesn't work for the
existing APIs.
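
In other words all links that feed one dma_iova_state need to agree on
dir and attrs, and the same values have to be passed again at teardown.
Rough sketch with error handling omitted - helper names from memory,
and phys1/phys2/len1/len2 are just placeholders:

	dma_iova_link(dev, &state, phys1, 0, len1, dir, attrs);
	dma_iova_link(dev, &state, phys2, len1, len2, dir, attrs);
	...
	dma_iova_destroy(dev, &state, len1 + len2, dir, attrs);
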
> (So no, irrespective of how conceptually horrid it is, clearly it's not
> even functionally viable to open-code abuse of DMA_ATTR_SKIP_CPU_SYNC in
> callers to attempt to work around P2P mappings...)
What do you mean by "work around"? I guess Leon added it to the hmm
code based on previous feedback, but I still don't think any of our P2P
infrastructure works reliably with non-coherent devices, as
iommu_dma_map_sg gets this wrong. So despite the earlier comments I
suspect this should stick to the current state of the art even if that
is broken.