lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161123203332.GA15062@obsidianresearch.com>
Date:   Wed, 23 Nov 2016 13:33:32 -0700
From:   Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:     Serguei Sagalovitch <serguei.sagalovitch@....com>
Cc:     Logan Gunthorpe <logang@...tatee.com>,
        Dan Williams <dan.j.williams@...el.com>,
        "Deucher, Alexander" <Alexander.Deucher@....com>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...1.01.org>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "Kuehling, Felix" <Felix.Kuehling@....com>,
        "Bridgman, John" <John.Bridgman@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
        "Koenig, Christian" <Christian.Koenig@....com>,
        "Sander, Ben" <ben.sander@....com>,
        "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@....com>,
        "Blinzer, Paul" <Paul.Blinzer@....com>,
        "Linux-media@...r.kernel.org" <Linux-media@...r.kernel.org>,
        Haggai Eran <haggaie@...lanox.com>
Subject: Re: Enabling peer to peer device transactions for PCIe devices

On Wed, Nov 23, 2016 at 02:58:38PM -0500, Serguei Sagalovitch wrote:

>    We do not want to have "highly" dynamic translation due to
>    performance cost.  We need to support "overcommit" but would
>    like to minimize impact.  To support RDMA MRs for GPU/VRAM/PCIe
>    device memory (which is must) we need either globally force
>    pinning for the scope of "get_user_pages() / "put_pages" or have
>    special handling for RDMA MRs and similar cases.

As I said, there is no possible special handling. Standard IB hardware
does not support changing the DMA address once a MR is created. Forget
about doing that.

Only ODP hardware allows changing the DMA address on the fly, and it
works at the page table level. We do not need special handling for
RDMA.

>    Generally it could be difficult to correctly handle "DMA in
>    progress" due to the facts that (a) DMA could originate from
>    numerous PCIe devices simultaneously including requests to
>    receive network data.

We handle all of this today in kernel via the page pinning mechanism.
This needs to be copied into peer-peer memory and GPU memory schemes
as well. A pinned page means the DMA address channot be changed and
there is active non-CPU access to it.

Any hardware that does not support page table mirroring must go this
route.

> (b) in HSA case DMA could originated from user space without kernel
>    driver knowledge.  So without corresponding h/w support
>    everywhere I do not see how it could be solved effectively.

All true user triggered DMA must go through some kind of coherent page
table mirroring scheme (eg this is what CAPI does, presumably AMDs HSA
is similar). A page table mirroring scheme is basically the same as
what ODP does.

Like I said, this is the direction the industry seems to be moving in,
so any solution here should focus on VMAs/page tables as the way to link
the peer-peer devices.

To me this means at least items #1 and #3 should be removed from
Alexander's list.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ