Message-ID: <CAPcyv4haUUs1Eew1PZTZkoGU4YFiHOuU93G+kG+CqfKzjz1gpw@mail.gmail.com>
Date: Tue, 18 Apr 2017 15:28:17 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Logan Gunthorpe <logang@...tatee.com>
Cc: Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
"James E.J. Bottomley" <jejb@...ux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Jens Axboe <axboe@...nel.dk>,
Steve Wise <swise@...ngridcomputing.com>,
Stephen Bates <sbates@...thlin.com>,
Max Gurtovoy <maxg@...lanox.com>,
Keith Busch <keith.busch@...el.com>, linux-pci@...r.kernel.org,
linux-scsi <linux-scsi@...r.kernel.org>,
linux-nvme@...ts.infradead.org, linux-rdma@...r.kernel.org,
linux-nvdimm <linux-nvdimm@...1.01.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jerome Glisse <jglisse@...hat.com>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
On Tue, Apr 18, 2017 at 3:15 PM, Logan Gunthorpe <logang@...tatee.com> wrote:
>
>
> On 18/04/17 03:36 PM, Dan Williams wrote:
>> On Tue, Apr 18, 2017 at 2:22 PM, Jason Gunthorpe
>> <jgunthorpe@...idianresearch.com> wrote:
>>> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
>>>>> I think this opens an even bigger can of worms..
>>>>
>>>> No, I don't think it does. You'd only shim when the target page is
>>>> backed by a device, not host memory, and you can figure this out by a
>>>> is_zone_device_page()-style lookup.
>>>
>>> The bigger can of worms is how do you meaningfully stack dma_ops.
>>
>> This goes back to my original comment to make this capability a
>> function of the pci bridge itself. The kernel has an implementation of
>> a dynamically created bridge device that injects its own dma_ops for
>> the devices behind the bridge. See vmd_setup_dma_ops() in
>> drivers/pci/host/vmd.c.
>
> Well, the issue I think Jason is pointing out is that the ops don't
> stack. The map_* functions in the injected dma_ops need to be able to
> call the original map_* for any page that is not p2p memory. This is
> especially annoying in the map_sg function, which may need to call a
> different op based on the contents of the sgl. (And please correct me if
> I'm missing how this can be done in the vmd example.)
Unlike the pci bus address offset case, which I think is fundamental to
support since shipping archs do this today, I think it is ok to say
p2p is restricted to a single sgl that talks to either host memory or
a single device. That said, what's wrong with a p2p-aware map_sg
implementation calling up to the host memory map_sg implementation on
a per-sgl basis?
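
Hand-wavy sketch of what I mean, calling up element by element.
is_zone_device_page() and the dma_map_ops hooks are real; struct
p2p_shim, dev_to_p2p_shim() and p2p_page_to_bus_addr() are made up
for illustration:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/mm.h>

struct p2p_shim {
	const struct dma_map_ops *orig_ops;	/* host ops saved at shim setup */
};

/* made up: lookups a real implementation would need to provide */
struct p2p_shim *dev_to_p2p_shim(struct device *dev);
dma_addr_t p2p_page_to_bus_addr(struct p2p_shim *shim, struct page *page);

static int p2p_map_sg(struct device *dev, struct scatterlist *sgl,
		      int nents, enum dma_data_direction dir,
		      unsigned long attrs)
{
	struct p2p_shim *shim = dev_to_p2p_shim(dev);
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		struct page *page = sg_page(sg);

		if (is_zone_device_page(page)) {
			/* peer memory: translate to a pci bus address */
			sg->dma_address =
				p2p_page_to_bus_addr(shim, page) + sg->offset;
		} else {
			/* host memory: call up to the original ops */
			sg->dma_address = shim->orig_ops->map_page(dev, page,
					sg->offset, sg->length, dir, attrs);
			/* assumes ->mapping_error was inherited from orig_ops */
			if (dma_mapping_error(dev, sg->dma_address))
				return 0;	/* error unwind elided */
		}
		sg_dma_len(sg) = sg->length;
	}
	return nents;
}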
> Also, what happens if p2p pages end up getting passed to a device that
> doesn't have the injected dma_ops?
This goes back to limiting p2p to a single pci host bridge. If the p2p
capability is coordinated with the bridge rather than between the
individual devices, then we have a central point to catch this case.
...of course this is all hand wavy until someone writes the code and
proves otherwise.
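
For example, the central catch point might be nothing more than this.
pci_find_host_bridge() is real, though currently internal to
drivers/pci; p2p_page_to_pci_dev() is made up:

#include <linux/pci.h>

/* made up: map a ZONE_DEVICE p2p page back to the pci device providing it */
struct pci_dev *p2p_page_to_pci_dev(struct page *page);

/*
 * One central check: only accept a p2p mapping when the client and
 * the provider sit under the same host bridge.
 */
static bool p2p_page_reachable(struct device *dev, struct page *page)
{
	struct pci_dev *client = to_pci_dev(dev);
	struct pci_dev *provider = p2p_page_to_pci_dev(page);

	return pci_find_host_bridge(client->bus) ==
	       pci_find_host_bridge(provider->bus);
}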
> However, the concept of replacing the dma_ops for all devices behind a
> supporting bridge is interesting and may be a good piece of the final
> solution.
It's at least a proof point for injecting special behavior for devices
behind a (virtual) pci bridge without needing to touch a bunch of
drivers. The mechanism boils down to something like the sketch below.
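
The p2p_* names here are mine; get_dma_ops() is real, and struct
dma_domain / add_dma_domain() are the (x86-only) routing that vmd.c
actually uses to attach its ops to every device in its synthetic
domain:

#include <linux/dma-mapping.h>
#include <asm/device.h>		/* x86: struct dma_domain, add_dma_domain() */

static struct dma_map_ops p2p_dma_ops;
static struct dma_domain p2p_dma_domain = {
	.dma_ops = &p2p_dma_ops,
};

/*
 * Copy the parent's ops, override the entries we care about, and let
 * the arch route every device in the bridge's pci domain to the copy.
 * No individual driver needs to know.
 */
static void p2p_setup_dma_ops(struct device *bridge, int domain_nr)
{
	const struct dma_map_ops *src = get_dma_ops(bridge->parent);

	p2p_dma_ops = *src;			/* inherit host behavior */
	p2p_dma_ops.map_sg = p2p_map_sg;	/* the shim from the earlier sketch */

	p2p_dma_domain.domain_nr = domain_nr;
	add_dma_domain(&p2p_dma_domain);
}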