linux-kernel - Re: Interacting with coherent memory on external devices


Message-ID: <alpine.DEB.2.11.1504230914060.32297@gentwo.org>
Date:	Thu, 23 Apr 2015 09:20:55 -0500 (CDT)
From:	Christoph Lameter <cl@...ux.com>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
cc:	paulmck@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, jglisse@...hat.com, mgorman@...e.de,
	aarcange@...hat.com, riel@...hat.com, airlied@...hat.com,
	aneesh.kumar@...ux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Mark Hairgrove <mhairgrove@...dia.com>,
	Geoffrey Gerfin <ggerfin@...dia.com>,
	John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org
Subject: Re: Interacting with coherent memory on external devices

On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:

> > There are hooks in glibc where you can replace the memory
> > management of the apps if you want that.
>
> We don't control the app. Let's say we are doing a plugin for libfoo
> which accelerates "foo" using GPUs.

There are numerous examples of malloc implementations that can be used
with apps without modifying the app.
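
A minimal sketch of how such a drop-in allocator can be wired in, assuming
glibc on Linux and the dynamic linker's LD_PRELOAD mechanism. The file name,
the library name and the shim_malloc_calls() helper are hypothetical, and the
shim only counts calls and forwards to glibc's own allocator. A real
replacement would decide where to place the memory at the marked spot.

```c
/* shim.c: counting malloc shim (illustrative sketch, glibc-specific).
 * Build: gcc -shared -fPIC -o libshim.so shim.c
 * Use:   LD_PRELOAD=./libshim.so ./some_app
 */
#include <stddef.h>
#include <stdatomic.h>

/* Assumption: glibc, which also exports its allocator entry points
 * under these names. Other C libraries may not provide them. */
extern void *__libc_malloc(size_t size);
extern void __libc_free(void *ptr);

/* Number of malloc() calls that went through the shim. */
static atomic_ulong shim_malloc_count;

/* Defining malloc() here takes precedence over libc's definition
 * when the library is preloaded, so the app itself is not modified. */
void *malloc(size_t size)
{
    atomic_fetch_add(&shim_malloc_count, 1);
    /* A real replacement would pick the backing memory here. */
    return __libc_malloc(size);
}

/* Memory came from glibc's allocator, so hand it back to glibc. */
void free(void *ptr)
{
    __libc_free(ptr);
}

/* Hypothetical helper: report how many allocations were intercepted. */
unsigned long shim_malloc_calls(void)
{
    return atomic_load(&shim_malloc_count);
}
```

Only malloc() and free() are wrapped. calloc() and realloc() still go
straight to glibc, which is safe here because every pointer comes from the
same underlying heap.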
>
> Now some other app we have no control on uses libfoo. So pointers
> already allocated/mapped, possibly a long time ago, will hit libfoo (or
> the plugin) and we need GPUs to churn on the data.

If the GPU needs to suspend one of its computation threads to wait for
a mapping to be established on demand, then the performance of the
parallel threads on the GPU will likely be significantly compromised.
You would want to do the transfer explicitly, in some fashion that
meshes with the concurrent calculation on the GPU. You do not want
stalls while GPU number crunching is ongoing.

> The point I'm making is you are arguing against a usage model which has
> been repeatedly asked for by a large number of customers (after all that's
> also why HMM exists).

I am still not clear on what the use case for this would be. Who is asking
for this?