Date:	Thu, 24 Sep 2015 15:11:23 -0400
From:	Austin S Hemmelgarn <ahferroin7@...il.com>
To:	Jeff Epler <jepler@...ythonic.net>
Cc:	Theodore Ts'o <tytso@....edu>, Andi Kleen <andi@...stfloor.org>,
	linux-kernel@...r.kernel.org, kirill.shutemov@...ux.intel.com,
	herbert@...dor.apana.org.au, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 1/3] Make /dev/urandom scalable

On 2015-09-24 12:52, Jeff Epler wrote:
> On Thu, Sep 24, 2015 at 12:00:44PM -0400, Austin S Hemmelgarn wrote:
>> I've had cases where I've done thousands of dieharder runs, and it
>> failed almost 10% of the time, while stuff like mt19937 fails in
>> otherwise identical tests only about 1-2% of the time
>
> That is a startling result.  Please say what architecture, kernel
> version, dieharder version and commandline arguments you are using to
> get 10% WEAK or FAILED assessments from dieharder on /dev/urandom.
I do not remember the exact dieharder version or command-line arguments 
(this was almost a decade ago), except that I compiled it from source 
myself.  I do remember it was a 32-bit x86 processor (as that was sadly 
all I had to run Linux on at the time) and an early 2.6 series kernel 
(which, if I remember correctly, was already EOL by the time I was using 
it).  It may have been affected by the fact that I did the testing in 
QEMU, but I would not expect that to matter much.  It is worth noting 
that I only saw this happen three times, and each time it was in a 
sample of 2000 runs (which has always been the sample size I've used, as 
that's the point at which I tend to get impatient).

I don't tend to do any of that type of testing anymore (at least, not 
since I started donating spare cycles to various BOINC projects).  I 
will, however, make a point of running some tests over the weekend on a 
current kernel version (4.2.1) with the current dieharder version I have 
available (3.31.1).
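
For anyone who wants to check my results, the invocations I'm planning 
on are something like the following (-a runs the full test battery; in 
dieharder 3.31.1, -g 501 is the built-in /dev/urandom generator and 
-g 205 is AES_OFB, which makes a reasonable 'known good' baseline):

     dieharder -g 501 -a    # full battery against /dev/urandom
     dieharder -g 205 -a    # same battery against AES_OFB for comparison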
>
> Since the structure of Linux urandom involves taking a cryptographic
> hash, the basic expectation is that it would fail statistical randomness
> tests at rates similar to, e.g., dieharder's AES_OFB (-g 205), even in
> the absence of any entropy in the kernel pools.
>
> So if 10% failures on correct statistical tests can be replicated, it
> is important and needs attention.
>
> I did take a few moments to look into this today and got startling
> failures (p-value 0.00000000) with e.g.,
>      dieharder -g 501 -d 10
> (and a few other tests) using dieharder 3.31.1 on both debian
> linux-4.1-rt-amd64 and debian kfreebsd-10-amd64, but this seems to be an
> upstream bug known at least to debian and redhat, possibly fixed in
> current Fedora but apparently not in Debian.
>      https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745742
>      https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=803292
> If you have an affected version, these failures are seen only with -g
> 501, not with -g 200 < /dev/urandom.  They are probably also not seen
> with 32-bit dieharder.
>
>   diehard_parking_lot|   0|     12000|     100|0.00000000|  FAILED
>      diehard_2dsphere|   2|      8000|     100|0.00000000|  FAILED
>      diehard_3dsphere|   3|      4000|     100|0.00000000|  FAILED
>       diehard_squeeze|   0|    100000|     100|0.00000000|  FAILED
>          diehard_sums|   0|       100|     100|0.00000000|  FAILED
The diehard_sums test is known and documented to be a flawed test.  As 
for the other failures, even a top-quality RNG should get them 
sometimes, because a good RNG _should_ spit out long runs of identical 
bits from time to time.  (This is also why the absolute insanity that is 
the FIPS cryptography standards should never be considered when doing 
anything other than security work, and even there only cautiously.) 
Based on what I've seen with the AES_OFB generator, a 'perfect' 
generator should get WEAK results about 1% of the time, and FAILED 
results about 0.1% of the time (except on diehard_sums).
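
For reference, assuming dieharder's default assessment thresholds (as 
far as I know, a p-value below 0.005 or above 0.995 gets flagged WEAK), 
that ~1% figure is just the nominal two-sided rate for an ideal 
generator with uniformly distributed p-values:

     $ awk 'BEGIN { printf "%g%%\n", (0.005 + 0.005) * 100 }'
     1%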

A generator never getting FAILED or WEAK results over thousands of runs 
should be an indication that either the generator is flawed in some way 
(i.e., it is actively trying to produce numbers that pass the tests, 
which means it is not really an RNG) or the test itself is flawed in 
some way.
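
To put a number on that: taking the roughly 1% per-run WEAK rate above, 
the odds of a genuinely random generator going 2000 runs without a 
single WEAK or FAILED result work out to be vanishingly small:

     $ awk 'BEGIN { printf "%.2g\n", 0.99 ^ 2000 }'
     1.9e-09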

