Message-ID: <0733efde-1a6e-5237-0868-e9fb31f237c7@amd.com>
Date:   Fri, 13 Jan 2017 14:04:30 +0100
From:   Christian König <christian.koenig@....com>
To:     Jerome Glisse <jglisse@...hat.com>,
        Stephen Bates <sbates@...thlin.com>
CC:     Logan Gunthorpe <logang@...tatee.com>,
        Jason Gunthorpe <jgunthorpe@...idianresearch.com>,
        <david1.zhou@....com>, <qiang.yu@....com>,
        "'linux-rdma@...r.kernel.org'" <linux-rdma@...r.kernel.org>,
        "'linux-nvdimm@...ts.01.org'" <linux-nvdimm@...1.01.org>,
        "Kuehling, Felix" <felix.kuehling@....com>,
        Serguei Sagalovitch <serguei.sagalovitch@....com>,
        "'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
        "'dri-devel@...ts.freedesktop.org'" <dri-devel@...ts.freedesktop.org>,
        <hch@...radead.org>,
        "Deucher, Alexander" <alexander.deucher@....com>,
        "Sander, Ben" <ben.sander@....com>,
        "Suthikulpanit, Suravee" <suravee.suthikulpanit@....com>,
        "'linux-pci@...r.kernel.org'" <linux-pci@...r.kernel.org>,
        Jerome Glisse <j.glisse@...il.com>,
        "Blinzer, Paul" <paul.blinzer@....com>,
        "'Linux-media@...r.kernel.org'" <linux-media@...r.kernel.org>
Subject: Re: Enabling peer to peer device transactions for PCIe devices

On 12.01.2017 at 16:11, Jerome Glisse wrote:
> On Wed, Jan 11, 2017 at 10:54:39PM -0600, Stephen Bates wrote:
>> On Fri, January 6, 2017 4:10 pm, Logan Gunthorpe wrote:
>>>
>>> On 06/01/17 11:26 AM, Jason Gunthorpe wrote:
>>>
>>>
>>>> Make a generic API for all of this and you'd have my vote..
>>>>
>>>>
>>>> IMHO, you must support basic pinning semantics - that is necessary to
>>>> support generic short lived DMA (eg filesystem, etc). That hardware can
>>>> clearly do that if it can support ODP.
>>> I agree completely.
>>>
>>>
>>> What we want is for RDMA, O_DIRECT, etc to just work with special VMAs
>>> (ie. at least those backed with ZONE_DEVICE memory). Then
>>> GPU/NVME/DAX/whatever drivers can just hand these VMAs to userspace
>>> (using whatever interface is most appropriate) and userspace can do what
>>> it pleases with them. This makes _so_ much sense and actually largely
>>> already works today (as demonstrated by iopmem).
>> +1 for iopmem ;-)
>>
>> I feel like we are going around and around on this topic. I would like to
>> see something that is upstream that enables P2P even if it is only the
>> minimum viable useful functionality to begin with. I think aiming for the moon
>> (which is what HMM and things like it are doing) is simply going to take more
>> time, if it ever gets there.
>>
>> There is a use case for in-kernel P2P PCIe transfers between two NVMe
>> devices and between an NVMe device and an RDMA NIC (using NVMe CMBs or
>> BARs on the NIC). I am even seeing users who now want to move data P2P
>> between FPGAs and NVMe SSDs and the upstream kernel should be able to
>> support these users or they will look elsewhere.
>>
>> The iopmem patchset addressed all the use cases above, and while it is not
>> an in-kernel API it could have been modified to be one reasonably easily.
>> As Logan states, the driver can then choose to pass the VMAs to user-space
>> in a manner that makes sense.
>>
>> Earlier in the thread someone mentioned LSF/MM. There is already a
>> proposal to discuss this topic so if you are interested please respond to
>> the email letting the committee know this topic is of interest to you [1].
>>
>> Also earlier in the thread someone discussed the issues around the IOMMU.
>> Given the known issues around P2P transfers in certain CPU root complexes
>> [2] it might just be a case of only allowing P2P when a PCIe switch
>> connects the two EPs. Another option is just to use CONFIG_EXPERT and make
>> sure people are aware of the pitfalls if they invoke the P2P option.
>
> iopmem is not applicable to GPUs. What I propose is to split the issue in two
> so that everyone can reuse the part that needs to be common, namely the DMA
> API part where you have to create an IOMMU mapping for one device to point
> to the other device's memory.
>
> We can have a DMA API that is agnostic to how the device memory is managed
> (so it does not matter whether device memory has struct page or not). This is
> what I have been arguing in this thread. To make progress on this issue we
> need to stop conflating different use cases.
>
> So I say let's solve the IOMMU issue first and let everyone use it in their
> own way with their device. I do not think we can share much more than
> that.

Yeah, exactly what I said from the very beginning as well. Just hacking 
together quick solutions doesn't really solve the problem in the long term.

What we need is to properly adjust the DMA API to handle P2P and then
build solutions for the different use cases on top of that.

We should also avoid falling into the trap of trying to just handle the
existing get_user_pages and related interfaces so that the existing code
doesn't need to change. P2P needs to be validated for each use case
individually, not implemented through workarounds with fingers crossed,
hoping for the best.

Regards,
Christian.

>
> Cheers,
> Jérôme

