Date:	Thu, 23 Apr 2015 12:24:56 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Jerome Glisse <j.glisse@...il.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, jglisse@...hat.com, mgorman@...e.de,
	aarcange@...hat.com, riel@...hat.com, airlied@...hat.com,
	benh@...nel.crashing.org, aneesh.kumar@...ux.vnet.ibm.com,
	Cameron Buschardt <cabuschardt@...dia.com>,
	Mark Hairgrove <mhairgrove@...dia.com>,
	Geoffrey Gerfin <ggerfin@...dia.com>,
	John McKenna <jmckenna@...dia.com>, akpm@...ux-foundation.org
Subject: Re: Interacting with coherent memory on external devices

On Thu, Apr 23, 2015 at 09:12:38AM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Paul E. McKenney wrote:
> 
> > Agreed, the use case that Jerome is thinking of differs from yours.
> > You would not (and should not) tolerate things like page faults because
> > it would destroy your worst-case response times.  I believe that Jerome
> > is more interested in throughput with minimal change to existing code.
> 
> As far as I know Jerome is talking about HPC loads and high-performance
> GPU processing. This is the same use case.

The difference is sensitivity to latency.  You have latency-sensitive
HPC workloads, while Jerome is talking about HPC workloads that need
high throughput but are insensitive to latency.
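
To make the latency point concrete, here is a minimal sketch (mine, not
from either workload, purely illustrative) of what a latency-sensitive
application typically does today to keep page faults off its critical
path: pin the working set up front with mlockall(), trading memory
footprint for a predictable worst case.  The throughput-oriented case
Jerome has in mind would instead let the fault happen and absorb the
occasional hiccup.

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
	/*
	 * Lock all current and future mappings into RAM so the
	 * latency-critical section below never takes a major fault.
	 * (Needs CAP_IPC_LOCK or a sufficient RLIMIT_MEMLOCK.)
	 */
	if (mlockall(MCL_CURRENT | MCL_FUTURE)) {
		perror("mlockall");
		return EXIT_FAILURE;
	}

	/* ... latency-critical work goes here ... */

	munlockall();
	return EXIT_SUCCESS;
}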

> > Let's suppose that you and Jerome were using GPGPU hardware that had
> > 32,768 hardware threads.  You would want very close to 100% of the full
> > throughput out of the hardware with pretty much zero unnecessary latency.
> > In contrast, Jerome might be OK with (say) 20,000 threads worth of
> > throughput with the occasional latency hiccup.
> >
> > And yes, support for both use cases is needed.
> 
> What you are proposing for High Performance Computing is reducing the
> performance these guys are trying to get. You cannot sell someone a
> Volkswagen if he needs the Ferrari.

You do need the low-latency Ferrari.  But others are best served by a
high-throughput freight train.

							Thanx, Paul
