Message-ID: <20240704171602.GE95824@unreal>
Date: Thu, 4 Jul 2024 20:16:02 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Robin Murphy <robin.murphy@....com>
Cc: Jens Axboe <axboe@...nel.dk>, Jason Gunthorpe <jgg@...pe.ca>,
Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>,
"Zeng, Oak" <oak.zeng@...el.com>,
Chaitanya Kulkarni <kch@...dia.com>,
Sagi Grimberg <sagi@...mberg.me>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Logan Gunthorpe <logang@...tatee.com>,
Yishai Hadas <yishaih@...dia.com>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Kevin Tian <kevin.tian@...el.com>,
Alex Williamson <alex.williamson@...hat.com>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org, iommu@...ts.linux.dev,
linux-nvme@...ts.infradead.org, linux-pci@...r.kernel.org,
kvm@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH v1 18/18] nvme-pci: use new dma API
On Thu, Jul 04, 2024 at 04:23:47PM +0100, Robin Murphy wrote:
> On 02/07/2024 10:09 am, Leon Romanovsky wrote:
> [...]
> > +static inline dma_addr_t nvme_dma_link_page(struct page *page,
> > + unsigned int poffset,
> > + unsigned int len,
> > + struct nvme_iod *iod)
> > {
> > - int i;
> > - struct scatterlist *sg;
> > + struct dma_iova_attrs *iova = &iod->dma_map->iova;
> > + struct dma_iova_state *state = &iod->dma_map->state;
> > + dma_addr_t dma_addr;
> > + int ret;
> > +
> > + if (iod->dma_map->use_iova) {
> > + phys_addr_t phys = page_to_phys(page) + poffset;
>
> Yeah, there's no way this can possibly work. You can't do the
> dev_use_swiotlb() check up-front based on some overall DMA operation size,
> but then build that operation out of arbitrarily small fragments of
> different physical pages that *could* individually need bouncing to not
> break coherency.
This is exactly how dma_map_sg() works. It checks the whole SG list in
advance and falls back to the bounce buffer if needed. In our case, all
the checks in dev_use_sg_swiotlb() will return "false". In v0, Christoph
said that NVMe guarantees alignment, which is the only "dynamic" check in
that function.
600 static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
601 int nents, enum dma_data_direction dir)
602 {
603 struct scatterlist *s;
604 int i;
605
606 if (!IS_ENABLED(CONFIG_SWIOTLB))
607 return false;
608
609 if (dev_is_untrusted(dev))
610 return true;
611
612 /*
613 * If kmalloc() buffers are not DMA-safe for this device and
614 * direction, check the individual lengths in the sg list. If any
615 * element is deemed unsafe, use the swiotlb for bouncing.
616 */
617 if (!dma_kmalloc_safe(dev, dir)) {
618 for_each_sg(sg, s, nents, i)
619 if (!dma_kmalloc_size_aligned(s->length))
620 return true;
621 }
622
623 return false;
624 }
...
1338 static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
1339 int nents, enum dma_data_direction dir, unsigned long attrs)
...
1360 if (dev_use_sg_swiotlb(dev, sg, nents, dir))
1361 return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
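
To make the point concrete, here is a minimal sketch (not part of the
series; the helper name and the min_seg_len parameter are assumptions for
illustration only) of how the same decision could be made once up front
per NVMe request, mirroring the per-element checks above:

/*
 * Illustrative sketch only: an up-front bounce decision per request,
 * mirroring dev_use_sg_swiotlb(). The helper name and the min_seg_len
 * parameter are made up for this example.
 */
#include <linux/dma-map-ops.h>
#include <linux/pci.h>

static bool nvme_req_use_swiotlb(struct device *dev, size_t min_seg_len,
				 enum dma_data_direction dir)
{
	if (!IS_ENABLED(CONFIG_SWIOTLB))
		return false;

	/* Untrusted (external) PCI devices are always bounced. */
	if (dev_is_pci(dev) && to_pci_dev(dev)->untrusted)
		return true;

	/*
	 * NVMe data segments are at least a logical block in size, so the
	 * kmalloc-alignment check that dev_use_sg_swiotlb() applies per SG
	 * element cannot trigger for them; one check per request is enough.
	 */
	if (!dma_kmalloc_safe(dev, dir) &&
	    !dma_kmalloc_size_aligned(min_seg_len))
		return true;

	return false;
}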
Thanks
>
> Thanks,
> Robin.
>
> > +
> > + dma_addr = state->iova->addr + state->range_size;
> > + ret = dma_link_range(&iod->dma_map->state, phys, len);
> > + if (ret)
> > + return DMA_MAPPING_ERROR;
> > + } else {
> > + dma_addr = dma_map_page_attrs(iova->dev, page, poffset, len,
> > + iova->dir, iova->attrs);
> > + }
> > + return dma_addr;
> > +}
>