linux-kernel - Re: RFC: THE OFFLINE SCHEDULER


Date:	Fri, 28 Aug 2009 14:57:11 -0400 (EDT)
From:	Christoph Lameter <cl@...ux-foundation.org>
To:	Thomas Gleixner <tglx@...utronix.de>
cc:	Gregory Haskins <ghaskins@...ell.com>,
	Rik van Riel <riel@...hat.com>,
	Chris Friesen <cfriesen@...tel.com>,
	raz ben yehuda <raziebe@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>, mingo@...e.hu,
	peterz@...radead.org, maximlevitsky@...il.com, efault@....de,
	wiseman@...s.biu.ac.il, linux-kernel@...r.kernel.org,
	linux-rt-users@...r.kernel.org
Subject: Re: RFC: THE OFFLINE SCHEDULER

On Fri, 28 Aug 2009, Thomas Gleixner wrote:

> That makes sense and should not be rocket science to implement.

I like it and such a thing would do a lot for reducing noise.

However, look at a typical task (from the HPC world) that would be
running on an isolated processor. It would

1. Spin on some memory location waiting for an event.

2. Process data passed to it, prepare output data and then go back to 1.

The enticing thing about doing 1 with shared memory and/or InfiniBand is
that it can be done in a few hundred nanoseconds instead of 10-20
microseconds. Bypassing the OS this way allows much faster IPC.
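
As a rough illustration of steps 1 and 2 above, here is a minimal
user-space sketch in C11 with pthreads. Everything in it is an assumption
made for illustration and not code from this thread: the mailbox layout,
the names (struct mailbox, worker, run_spin_demo), the 64-byte cache line
size, and the doubling "work" in step 2. It does no CPU pinning or
isolation, so it shows only the handoff pattern, not the latencies
quoted above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mailbox; the 64-byte alignment assumes 64-byte cache lines. */
struct mailbox {
    _Alignas(64) atomic_uint_fast64_t seq;  /* producer sets this to n for event n */
    uint64_t payload;                       /* input data, same line as seq */
    _Alignas(64) atomic_uint_fast64_t done; /* worker sets this to n when finished */
    uint64_t result;                        /* output data, same line as done */
};

struct worker_args {
    struct mailbox *mb;
    uint64_t rounds;
};

/* Worker: 1. spin on a memory location, 2. process, then go back to 1. */
static void *worker(void *p)
{
    struct worker_args *a = p;
    struct mailbox *mb = a->mb;

    for (uint64_t n = 1; n <= a->rounds; n++) {
        /* 1. Spin until event n is published; no syscall involved. */
        while (atomic_load_explicit(&mb->seq, memory_order_acquire) < n)
            ;
        /* 2. Placeholder processing, then publish the output. */
        mb->result = mb->payload * 2;
        atomic_store_explicit(&mb->done, n, memory_order_release);
    }
    return NULL;
}

/* Sends `rounds` events to the worker and returns the sum of its results. */
uint64_t run_spin_demo(uint64_t rounds)
{
    struct mailbox mb;
    struct worker_args args = { .mb = &mb, .rounds = rounds };
    pthread_t tid;
    uint64_t sum = 0;

    atomic_init(&mb.seq, 0);
    atomic_init(&mb.done, 0);
    mb.payload = 0;
    mb.result = 0;

    int rc = pthread_create(&tid, NULL, worker, &args);
    assert(rc == 0);
    (void)rc;

    for (uint64_t n = 1; n <= rounds; n++) {
        mb.payload = n;
        /* Release store makes payload visible before the worker sees seq. */
        atomic_store_explicit(&mb.seq, n, memory_order_release);
        /* The producer spins as well, waiting for the result. */
        while (atomic_load_explicit(&mb.done, memory_order_acquire) < n)
            ;
        sum += mb.result;
    }
    pthread_join(tid, NULL);
    return sum;
}
```

Built with something like "cc -std=c11 -pthread", run_spin_demo(100)
returns 10100, the sum of 2*n for n = 1..100. Any interruption of the
worker while it sits in the spin loop directly delays the response,
which is why the noise matters.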

For many uses deterministic responses are desired. If the handler that
runs is never disturbed by extraneous processing (IPI, faults, irqs etc)
then we can say that we run at the maximum speed that the machine can run
at. That is what many sites expect.

In an HPC environment synchronization points are essential and the
frequency of synchronization points (where we spin on a cacheline) is
important for the ability to scale the accuracy and the performance of
the algorithm. If we can make N processors operate in a deterministic
fashion on f.e. an array of floating point numbers then the rendezvous
occurs with minimal wait time in each of the N processes. Getting rid
of all sources of interruptions gets us the best performance possible.

Right now strong variability often makes it necessary to use long
processing periods and to put up with long wait times because
one of the N processes has not finished yet.
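
To put a number on that cost, here is a small sketch in C. It is only a
model under stated assumptions, not a measurement: the function name
rendezvous_wait and the idea of feeding it per-process finish times are
mine. It assumes every process spins at the rendezvous until the slowest
one arrives and adds up the time the others spend waiting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Total time spent waiting at one rendezvous, in the same unit as the
 * inputs. finish[i] is when process i reaches the synchronization point.
 */
double rendezvous_wait(const double *finish, size_t n)
{
    assert(finish != NULL && n > 0);

    /* The rendezvous completes only when the slowest process arrives. */
    double slowest = finish[0];
    for (size_t i = 1; i < n; i++) {
        if (finish[i] > slowest)
            slowest = finish[i];
    }

    /* Every other process idles for the difference. */
    double wait = 0.0;
    for (size_t i = 0; i < n; i++)
        wait += slowest - finish[i];
    return wait;
}
```

With four processes finishing at 10, 10, 10 and 10 time units the wait
is 0. If one of them is disturbed and finishes at 14, the other three
idle for 4 units each, 12 in total, and that loss recurs at every
synchronization point.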
