Message-ID: <20170419171451.GA10020@obsidianresearch.com>
Date: Wed, 19 Apr 2017 11:14:51 -0600
From: Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To: Logan Gunthorpe <logang@...tatee.com>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Dan Williams <dan.j.williams@...el.com>,
Bjorn Helgaas <helgaas@...nel.org>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
"James E.J. Bottomley" <jejb@...ux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Jens Axboe <axboe@...nel.dk>,
Steve Wise <swise@...ngridcomputing.com>,
Stephen Bates <sbates@...thlin.com>,
Max Gurtovoy <maxg@...lanox.com>,
Keith Busch <keith.busch@...el.com>, linux-pci@...r.kernel.org,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-nvme@...ts.infradead.org, linux-rdma@...r.kernel.org,
linux-nvdimm <linux-nvdimm@...1.01.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jerome Glisse <jglisse@...hat.com>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
On Wed, Apr 19, 2017 at 10:48:51AM -0600, Logan Gunthorpe wrote:
> The pci_enable_p2p_bar function would then just need to call
> devm_memremap_pages with the dma_map callback set to a function that
> does the segment check and the offset calculation.
I don't see a use for the dma_map function pointer at this point.

It doesn't make a lot of sense for the completer of the DMA to provide
a mapping op; the mapping process is *path* specific, not specific to
a completer/initiator.
So, I would suggest something more like this:
static inline struct device *get_p2p_src(struct page *page)
{
        struct device *res;
        struct dev_pagemap *pgmap;

        if (!is_zone_device_page(page))
                return NULL;

        pgmap = get_dev_pagemap(page_to_pfn(page), NULL);
        if (!pgmap || pgmap->type != MEMORY_DEVICE_P2P) {
                /* For now ZONE_DEVICE memory that is not P2P is
                   assumed to be configured for DMA the same as CPU
                   memory. */
                if (pgmap)
                        put_dev_pagemap(pgmap);
                return ERR_PTR(-EINVAL);
        }

        res = pgmap->dev;
        get_device(res);
        put_dev_pagemap(pgmap);
        return res;
}
dma_addr_t pci_p2p_same_segment(struct device *initiator,
                                struct device *completer,
                                struct page *page)
{
        if (!(initiator and completer are both PCI devices))
                return ERROR;
        if (!(initiator and completer are in the same PCI segment))
                return ERROR;

        // Translate the page directly to the value programmed into the BAR
        return (completer's PCI BAR base address) + (offset of page within BAR);
}
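
(To make the same-segment case concrete, here is a rough sketch using
today's PCI helpers. The p2p_page_bar()/p2p_page_bar_offset() helpers
are made up; that bookkeeping would have to be recorded when the BAR
pages are created. The domain comparison is also only a placeholder,
a real check would have to verify the path between the two devices
stays below a common switch.)

#define P2P_ERROR ((dma_addr_t)-1)

static dma_addr_t pci_p2p_same_segment(struct device *initiator,
                                       struct device *completer,
                                       struct page *page)
{
        struct pci_dev *init_pdev, *comp_pdev;

        if (!dev_is_pci(initiator) || !dev_is_pci(completer))
                return P2P_ERROR;

        init_pdev = to_pci_dev(initiator);
        comp_pdev = to_pci_dev(completer);

        /* Placeholder: same PCI domain is necessary but not sufficient */
        if (pci_domain_nr(init_pdev->bus) != pci_domain_nr(comp_pdev->bus))
                return P2P_ERROR;

        /* Hypothetical helpers: which BAR backs this page and the
           byte offset of the page within that BAR. */
        return pci_resource_start(comp_pdev, p2p_page_bar(page)) +
               p2p_page_bar_offset(page);
}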
// dma_sg_map
for (each sgl entry s) {
        struct page *page = sg_page(s);
        struct device *p2p_src = get_p2p_src(page);

        if (IS_ERR(p2p_src))
                // fail dma_sg

        if (p2p_src) {
                bool needs_iommu = false;

                pa = pci_p2p_same_segment(dev, p2p_src, page);
                if (pa == ERROR)
                        pa = arch_p2p_cross_segment(dev, p2p_src, page,
                                                    &needs_iommu);

                put_device(p2p_src);

                if (pa == ERROR)
                        // fail

                if (!needs_iommu) {
                        // Insert PA directly into the result SGL
                        sg++;
                        continue;
                }
        } else {
                // CPU memory
                pa = page_to_phys(page);
        }

        // ... fall through to the usual iommu mapping path with pa
}
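
(arch_p2p_cross_segment() above is assumed to be a new arch hook. A
default could simply refuse, so only the same-segment fast path works
unless an arch opts in. Using the P2P_ERROR sentinel from the sketch
above, something like:)

dma_addr_t __weak arch_p2p_cross_segment(struct device *initiator,
                                         struct device *completer,
                                         struct page *page,
                                         bool *needs_iommu)
{
        /* No generic way to route P2P across segments; fail the map */
        return P2P_ERROR;
}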
To me it looks like the code duplication across the iommu stuff comes
from just duplicating the basic iommu algorithm in every driver.

To clean that up I think someone would need to hoist the overall sgl
loop and use more ops callbacks, eg allocate_iommu_range,
assign_page_to_range, dealloc_range, etc. This is a problem p2p makes
worse, but isn't directly causing :\
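
(To sketch the shape of that, with made-up names: each iommu driver
would fill in a small ops table and the sgl walk itself would live in
common code:)

struct iommu_sgl_ops {
        int  (*allocate_iommu_range)(struct device *dev, size_t len,
                                     dma_addr_t *iova);
        int  (*assign_page_to_range)(struct device *dev, dma_addr_t iova,
                                     phys_addr_t pa, size_t len);
        void (*dealloc_range)(struct device *dev, dma_addr_t iova,
                              size_t len);
};

The common loop would compute pa as above and then call the ops,
instead of open-coding the iommu details per driver.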
Jason