Message-ID: <20170612184413.GA5924@gmail.com>
Date:   Mon, 12 Jun 2017 14:44:14 -0400
From:   Jerome Glisse <j.glisse@...il.com>
To:     "Wuzongyong (Cordius Wu, Euler Dept)" <wuzongyong1@...wei.com>
Cc:     "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "oded.gabbay@....com" <oded.gabbay@....com>,
        "Wanzongshun (Vincent)" <wanzongshun@...wei.com>
Subject: Re: What differences and relations between SVM, HSA, HMM and Unified
 Memory?

On Sat, Jun 10, 2017 at 04:06:28AM +0000, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> Hi,
> 
> Could someone explain the differences and relations between SVM
> (Shared Virtual Memory, by Intel), HSA (Heterogeneous System
> Architecture, by AMD), HMM (Heterogeneous Memory Management, by Glisse)
> and UM (Unified Memory, by NVIDIA)? Are these substitutes for one
> another?
>
> As I understand it, these aim to solve the same thing: sharing
> pointers between CPU and GPU (implemented with ATS/PASID/PRI/IOMMU
> support). So far, SVM and HSA can only be used by integrated GPUs.
> Also, Intel states that the root ports do not have the
> required TLP prefix support, with the result that SVM can't be used
> by discrete devices. So could someone tell me what the required TLP
> prefix specifically means?
>
> With HMM, we can use an allocator like malloc to manage host and
> device memory. Does this mean that there is no need to use SVM
> and HSA with HMM, or is HMM the basis for SVM and HSA to
> implement the fine-grained system SVM defined in the OpenCL spec?

The aim of all these technologies is to share an address space between
a device and the CPU. There are 3 ways to do it:

  A) all in hardware, like CAPI or CCIX, where device memory is cache
     coherent from the CPU's point of view and system memory is also
     accessible by the device in a cache-coherent way with the CPU. So
     cache coherency goes both ways: from CPU to device memory and from
     device to system memory.


  B) partially in hardware, with ATS/PASID (the technology behind both
     HSA and SVM). Here it is only a one-way solution: you have
     cache-coherent access from the device to system memory but not the
     other way around. Moreover, you share the CPU page table with the
     device, so you do not need to program the IOMMU.

     Here you cannot use the device memory transparently, at least
     not without software help like HMM.


  C) all in software. Here the device can access system memory with
     cache coherency but it does not share the CPU page table. Each
     device has its own page table, and thus you need to keep it
     synchronized with the CPU page table.
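
To make the synchronization requirement in C) concrete, here is a toy
user-space model in C. It is only a sketch under loud assumptions: the
names (toy_mirror, toy_cpu_map, toy_dev_access) are hypothetical and are
not the HMM API, page tables are reduced to flat arrays, and there is no
locking. It shows a device keeping its own copy of the mappings, filling
it on a device fault, and having the stale entry invalidated whenever
the CPU side changes a mapping.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES   8      /* size of the toy virtual address space, in pages */
#define NO_FRAME (-1)   /* marks an unmapped or invalidated entry */

/* Hypothetical toy model: a CPU page table plus a device-private mirror. */
struct toy_mirror {
    int cpu_pt[NPAGES]; /* virtual page -> physical frame, owned by the CPU */
    int dev_pt[NPAGES]; /* device's own copy, filled lazily on device fault */
    int dev_faults;     /* number of device faults serviced so far */
};

void toy_init(struct toy_mirror *m)
{
    for (size_t i = 0; i < NPAGES; i++) {
        m->cpu_pt[i] = NO_FRAME;
        m->dev_pt[i] = NO_FRAME;
    }
    m->dev_faults = 0;
}

/* The CPU side changes a mapping. The device entry is invalidated first,
 * so the device can never keep using the old frame. */
void toy_cpu_map(struct toy_mirror *m, size_t vpage, int frame)
{
    assert(vpage < NPAGES);
    m->dev_pt[vpage] = NO_FRAME; /* invalidate the mirror entry */
    m->cpu_pt[vpage] = frame;    /* then update the CPU page table */
}

/* The device accesses a virtual page. On a miss it "faults" and copies
 * the current CPU mapping into its own table. Returns the frame, or
 * NO_FRAME if the CPU has no mapping for that page either. */
int toy_dev_access(struct toy_mirror *m, size_t vpage)
{
    assert(vpage < NPAGES);
    if (m->dev_pt[vpage] == NO_FRAME) {
        m->dev_faults++;
        m->dev_pt[vpage] = m->cpu_pt[vpage];
    }
    return m->dev_pt[vpage];
}
```

In this model, a second device access to the same page hits the device
table without a fault, while any CPU-side remap forces the next device
access to fault again and pick up the new frame.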

HMM provides helpers that address all 3 solutions:
  A) for the all-hardware solution, HMM provides new helpers to help
     with migration of process memory to device memory
  B) for the partial-hardware solution, you can mix in HMM to again
     provide helpers for migration to device memory. This assumes
     your device can mix and match a local device page table with
     ATS/PASID regions
  C) for the full-software solution, you use all the features of HMM:
     everything is done in software and HMM is just doing the heavy
     lifting on behalf of the device driver
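
The migration idea that runs through all three cases can be sketched
with another toy model in C. Again this is only an illustration under
stated assumptions: the names (toy_page, toy_migrate, toy_dev_write,
toy_cpu_read) are hypothetical and are not HMM functions, a page is 16
bytes, and the model assumes device memory that the CPU cannot access
directly, so a CPU access to a device-resident page migrates it back to
system memory. For simplicity the device access itself triggers the
migration, whereas a real driver would decide when to migrate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 16 /* tiny page size, for illustration only */

enum toy_location { TOY_IN_SYSTEM, TOY_IN_DEVICE };

/* Hypothetical toy model of one page that lives in exactly one place. */
struct toy_page {
    enum toy_location where;            /* current backing memory */
    unsigned char sys_copy[PAGE_BYTES]; /* stands in for system memory */
    unsigned char dev_copy[PAGE_BYTES]; /* stands in for device memory */
    int migrations;                     /* number of migrations performed */
};

void toy_page_init(struct toy_page *p, unsigned char fill)
{
    p->where = TOY_IN_SYSTEM;
    memset(p->sys_copy, fill, PAGE_BYTES);
    memset(p->dev_copy, 0, PAGE_BYTES);
    p->migrations = 0;
}

/* Move the page contents to the requested memory, if not already there. */
void toy_migrate(struct toy_page *p, enum toy_location dst)
{
    if (p->where == dst)
        return;
    if (dst == TOY_IN_DEVICE)
        memcpy(p->dev_copy, p->sys_copy, PAGE_BYTES);
    else
        memcpy(p->sys_copy, p->dev_copy, PAGE_BYTES);
    p->where = dst;
    p->migrations++;
}

/* Device write: the page is first migrated to device memory. */
void toy_dev_write(struct toy_page *p, size_t off, unsigned char value)
{
    assert(off < PAGE_BYTES);
    toy_migrate(p, TOY_IN_DEVICE);
    p->dev_copy[off] = value;
}

/* CPU read: models a CPU fault that migrates the page back to system
 * memory before the read is satisfied. */
unsigned char toy_cpu_read(struct toy_page *p, size_t off)
{
    assert(off < PAGE_BYTES);
    toy_migrate(p, TOY_IN_SYSTEM);
    return p->sys_copy[off];
}
```

The point of the model is that the process always sees the same data at
the same address, whichever memory currently backs the page.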

In all of the above we are talking about fine-grained system SVM as in
the OpenCL specification. So you can malloc() memory and use it
directly from the GPU.

Hope this clarifies things.

Cheers,
Jérôme
