Message-ID: <CAPcyv4jUeKzKDARp6Z35kdPLKnP-M6aF8X5KpOx55CLyjnj4dA@mail.gmail.com>
Date: Sat, 15 Apr 2017 15:09:08 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Logan Gunthorpe <logang@...tatee.com>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
"James E.J. Bottomley" <jejb@...ux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Jens Axboe <axboe@...nel.dk>,
Steve Wise <swise@...ngridcomputing.com>,
Stephen Bates <sbates@...thlin.com>,
Max Gurtovoy <maxg@...lanox.com>,
Keith Busch <keith.busch@...el.com>, linux-pci@...r.kernel.org,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-nvme@...ts.infradead.org, linux-rdma@...r.kernel.org,
linux-nvdimm <linux-nvdimm@...1.01.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jerome Glisse <jglisse@...hat.com>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
On Sat, Apr 15, 2017 at 10:41 AM, Logan Gunthorpe <logang@...tatee.com> wrote:
> Thanks, Benjamin, for the summary of some of the issues.
>
> On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote:
>> So I assume the p2p code provides a way to address that too via special
>> dma_ops ? Or wrappers ?
>
> Not at this time. We will probably need a way to ensure the iommus do
> not attempt to remap these addresses. Though if they do, I'd expect
> everything would still work; you just wouldn't get the performance or
> traffic flow you're looking for. We've been testing with the software
> iommu, which doesn't have this problem.
>
>> The problem is that the latter while seemingly easier, is also slower
>> and not supported by all platforms and architectures (for example,
>> POWER currently won't allow it, or rather only allows a store-only
>> subset of it under special circumstances).
>
> Yes, I think situations where we have to cross host bridges will remain
> unsupported by this work for a long time. There are too many cases where
> it just doesn't work or performs too poorly to be useful.
>
>> I don't fully understand how p2pmem "solves" that by creating struct
>> pages. The offset problem is one issue. But there's the iommu issue as
>> well, the driver cannot just use the normal dma_map ops.
>
> We are not using a proper iommu, and we are dealing with systems that
> have zero offset, so this case is easily supported. I expect fixing
> the iommus to not map these addresses would also be reasonably achievable.
I'm wondering, since this is limited to devices behind a single
switch, whether you could have a software IOMMU hanging off that
switch device object that knows how to catch and translate the
non-zero-offset bus address case. We have something like this with
the VMD driver, and I toyed with a soft PCI bridge when trying to
support AHCI+NVME BAR remapping. When the DMA API looks up the IOMMU
for its device, it hits this soft IOMMU, and that driver checks
whether the page is host memory or device memory to do the DMA
translation. You wouldn't need a bit in struct page, just a lookup of
the hosting struct dev_pagemap in the is_zone_device_page() case, and
that can point you to the p2p details.
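
A minimal sketch of what that map path could look like. The
softiommu_* names and the per-window bus_offset bookkeeping are
hypothetical, invented here for illustration; the only real kernel
APIs assumed are is_zone_device_page(), page->pgmap, and the
dma_map_ops ->map_page hook:

/*
 * Illustrative sketch only: a per-switch "soft IOMMU" whose map_page
 * hook checks whether a page is ZONE_DEVICE (p2p) memory and, if so,
 * applies the bus offset of the window it lives in.
 */
#include <linux/mm.h>
#include <linux/memremap.h>
#include <linux/io.h>
#include <linux/dma-mapping.h>

/* One p2p window behind the switch, registered by the p2p memory
 * provider (illustrative; a real driver would keep a table of these). */
struct softiommu_window {
	struct dev_pagemap *pgmap;	/* pagemap hosting the BAR pages */
	u64 bus_offset;			/* bus addr = CPU phys + offset  */
};

static struct softiommu_window softiommu_window;

static dma_addr_t softiommu_map_page(struct device *dev, struct page *page,
				     unsigned long offset, size_t size,
				     enum dma_data_direction dir,
				     unsigned long attrs)
{
	phys_addr_t phys = page_to_phys(page) + offset;

	/*
	 * Device memory: no bit in struct page needed, the pgmap
	 * backing the ZONE_DEVICE page identifies the p2p window.
	 */
	if (is_zone_device_page(page) &&
	    page->pgmap == softiommu_window.pgmap)
		return phys + softiommu_window.bus_offset;

	/* Host memory: fall through to a normal direct mapping. */
	return phys;
}

static const struct dma_map_ops softiommu_dma_ops = {
	.map_page	= softiommu_map_page,
	/* .unmap_page, .map_sg, etc. elided */
};

With something like this installed as the dma_map_ops for devices
below the switch, the translation happens transparently in
dma_map_page()/dma_map_sg() without consuming a page flag.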