Date:	Thu, 23 Apr 2015 09:38:15 -0500 (CDT)
From:	Christoph Lameter <cl@...ux.com>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
cc:	Jerome Glisse <j.glisse@...il.com>, paulmck@...ux.vnet.ibm.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	jglisse@...hat.com, mgorman@...e.de, aarcange@...hat.com,
	riel@...hat.com, airlied@...hat.com,
	aneesh.kumar@...ux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Mark Hairgrove <mhairgrove@...dia.com>,
	Geoffrey Gerfin <ggerfin@...dia.com>,
	John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org
Subject: Re: Interacting with coherent memory on external devices

On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:

> In fact I'm quite surprised, what we want to achieve is the most natural
> way from an application perspective.

Well the most natural thing would be if the beast would just do what I
tell it in plain English. But then I would not have my job anymore.

> You have something in memory, whether you got it via malloc, mmap'ing a file,
> shmem with some other application, ... and you want to work on it with the
> co-processor that is residing in your address space. Even better, pass a pointer
> to it to some library you don't control which might itself want to use the
> coprocessor ....

Yes that works already. What's new about this? This seems to have been
solved on the Intel platform f.e.

> What you propose can simply not provide that natural usage model with any
> efficiency.

There is no efficiency anymore if the OS can create random events in a
computational stream that is highly optimized for data exchange of
multiple threads at defined time intervals. If transparency or the natural
usage model can avoid this then ok, but what I see proposed here is some
behind-the-scenes model that may severely degrade performance. And this
does seem to go way beyond CAPI. At least, I have so far thought about
this as a method for cache coherency at the cache line level and as a
way to simplify the coordination of page tables and TLBs across multiple
divergent architectures.

I think these two things need to be separated. The shift-the-memory-back-
and-forth approach should be separate and if someone wants to use the
thing then it should also work on other platforms like ARM and Intel.

CAPI needs to be implemented as a way to potentially improve the existing
communication paths between devices and the main processor. F.e. the
existing Infiniband MMU synchronization issues and RDMA registration
problems could be addressed with this. The existing mechanisms for GPU
communication could become much cleaner and easier to handle. This is all
good but independent of any "transparent" memory implementation.

> It might not be *your* model based on *your* application but that doesn't mean
> it's not there, and isn't relevant.

Sadly this is the way that an entire industry does its thing.
