linux-kernel - Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160626184743.GA11162@amd>
Date:	Sun, 26 Jun 2016 20:47:43 +0200
From:	Pavel Machek <pavel@....cz>
To:	Theodore Ts'o <tytso@....edu>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Linux Kernel Developers List <linux-kernel@...r.kernel.org>,
	linux-crypto@...r.kernel.org, smueller@...onox.de,
	andi@...stfloor.org, sandyinchina@...il.com, jsd@...n.com,
	hpa@...or.com
Subject: Re: [PATCH 5/7] random: replace non-blocking pool with a
 Chacha20-based CRNG

Hi!

> Yes, I understand the argument that the networking stack is now
> requiring the crypto layer --- but not all IOT devices may necessarily
> require the IP stack (they might be using some alternate wireless
> communications stack) and I'd much rather not make things worse.
> 
> 
> The final thing is that it's not at all clear that the accelerated
> implementation is all that important anyway.  Consider the following
> two results using the unaccelerated ChaCha20:
> 
> % dd if=/dev/urandom bs=4M count=32 of=/dev/null
> 32+0 records in
> 32+0 records out
> 134217728 bytes (134 MB, 128 MiB) copied, 1.18647 s, 113 MB/s
> 
> % dd if=/dev/urandom bs=32 count=4194304 of=/dev/null
> 4194304+0 records in
> 4194304+0 records out
> 134217728 bytes (134 MB, 128 MiB) copied, 7.08294 s, 18.9 MB/s
> 
> So in both cases, we are reading 128M from the CRNG.  In the first
> case, we see the sort of speed we would get if we were using the CRNG
> for some illegitimate, such as "dd if=/dev/urandom of=/dev/sdX bs=4M"
> (because they were too lazy to type "apt-get install nwipe").
> 
> In the second case, we see the use of /dev/urandom in a much more
> reasonable, proper, real-world use case for /de/urandom, which is some
> userspace process needing a 256 bit session key for a TLS connection,
> or some such.  In this case, we see that the other overheads of
> providing the anti-backtracking protection, system call overhead,
> etc., completely dominate the speed of the core crypto primitive.
> 
> So even if the AVX optimized is 100% faster than the generic version,
> it would change the time needed to create a 256 byte session key from
> 1.68 microseconds to 1.55 microseconds.  And this is ignoring the

Ok, so lets say I'm writing some TLS server, and I know that traffic
is currently heavy because it was heavy in last 5 minutes. Would it
make sense for me to request 128M of randomness from /dev/urandom, and
then use that internally, to avoid the syscall overhead?

Ok, maybe 128M is a bit much because by requesting that much in single
request i'd turn urandom into PRNG, but perhaps 1MB block makes sense?

And I guess even requesting 128M would make sense, as kernel can
select best crypto implementation for CRNG, and I'd prefer to avoid
that code in my application as it is hardware-specific...

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html