Date:	Tue, 15 Jul 2014 11:50:45 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Tim Chen <tim.c.chen@...ux.intel.com>
Cc:	Herbert Xu <herbert@...dor.apana.org.au>,
	"H. Peter Anvin" <hpa@...or.com>,
	"David S.Miller" <davem@...emloft.net>,
	Ingo Molnar <mingo@...nel.org>,
	Chandramouli Narayanan <mouli@...ux.intel.com>,
	Vinodh Gopal <vinodh.gopal@...el.com>,
	James Guilford <james.guilford@...el.com>,
	Wajdi Feghali <wajdi.k.feghali@...el.com>,
	Jussi Kivilinna <jussi.kivilinna@....fi>,
	linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose
 number of tasks running on cpu

On Mon, Jul 14, 2014 at 12:50:50PM -0700, Tim Chen wrote:

> There is a generic multi-buffer infrastructure portion that manages
> pulling and queuing jobs on the crypto workqueue, and it is separated out
> in patch 1 of the patchset.

There's one very weird multi-line comment in that patch.

> The other portions are algorithm specific: they define the
> algorithm-specific data structures and do the crypto computation
> for a particular algorithm, mostly in assembly and C glue code.
> The infrastructure code is meant to be reused for other similar
> multi-buffer algorithms.

The flushing part that uses the sched thing is sha1 specific, even
though it strikes me as not being so. Flushing buffers on idle seems
like a 'generic' thing.

> We use nr_running_cpu to check whether there are other tasks running on
> the *current* cpu (not on another cpu),

And yet, the function allows you to do exactly that...
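
For reference, I'm reading the interface as roughly the below; the body
is my reconstruction from the patch title, not a quote of the patch, but
the signature is the point: taking a cpu argument means any caller can
poke at any runqueue:

	/* Reconstructed from the patch title -- the real body may
	 * differ.  Note the cpu argument: nothing restricts callers
	 * to smp_processor_id(). */
	unsigned long nr_running_cpu(int cpu)
	{
		return cpu_rq(cpu)->nr_running;
	}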

> to decide if we should flush
> and compute the accumulated crypto jobs.  If there's nobody else running,
> we can take advantage of available cpu cycles on the cpu we are running
> on to do computation on the existing jobs in a SIMD manner.
> Waiting a bit longer may accumulate more jobs to process in parallel
> in a single SIMD instruction, but will add more delay.

So you already have an idle notifier (which is x86-only; we should fix
that, I suppose), and you then double-check that there really isn't
anything else running.

How much, if anything, does that second check buy you? There's just not
a single word on that.
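
To make the question concrete, the flow as I understand it is something
like the below; the helper names are invented for illustration, only
nr_running_cpu() is from your patch:

	/* Illustrative sketch, not the patchset's code.  Stage 1: the
	 * (x86-only) idle notifier says we went idle.  Stage 2: the
	 * runqueue is checked again before flushing. */
	static void mb_flush_on_idle(int cpu)
	{
		/* rq->nr_running does not count the idle task, so any
		 * nonzero value here means real runnable work -- this
		 * is the second check I'm asking about. */
		if (nr_running_cpu(cpu) > 0)
			return;		/* keep batching jobs */

		mb_flush_partial_jobs(cpu);	/* SIMD over what we have */
	}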

Also, there is not a word on the latency vs. throughput tradeoff you
are making. I can imagine that for very short idle durations you lose,
not win, with this thing.

So for now I still see no reason for doing this.

Also, I wonder about SMT. The point of this is to make the best use of
the SIMD pipelines; does it still make sense to use siblings at the
same time even though you're running hand-crafted ASM to stuff the
pipelines to the brim? Should this thing be SMT-aware and not gather
queues for both siblings?
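
Something like the below is what I have in mind; sketch only,
topology_thread_cpumask() is the existing sibling-mask helper and the
rest is made up:

	/* SMT-aware variant: count runnable tasks across all hardware
	 * threads of the core, not just this logical cpu. */
	static bool core_otherwise_idle(int cpu)
	{
		unsigned long nr = 0;
		int sibling;

		for_each_cpu(sibling, topology_thread_cpumask(cpu))
			nr += nr_running_cpu(sibling);

		return nr == 0;	/* no runnable work on any sibling */
	}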
