linux-kernel - Re: early x86 unseeded randomness

Message-ID: <alpine.DEB.2.20.1708160956140.1987@nanos>
Date:   Wed, 16 Aug 2017 11:13:03 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Theodore Ts'o <tytso@....edu>
cc:     Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>,
        Willy Tarreau <w@....eu>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        x86-ml <x86@...nel.org>, "Jason A. Donenfeld" <Jason@...c4.com>,
        lkml <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Nicholas Mc Guire <der.herr@...r.at>,
        Will Deacon <will.deacon@....com>
Subject: Re: early x86 unseeded randomness

On Tue, 15 Aug 2017, Theodore Ts'o wrote:
> If we really want to do this, I'd much rather *not* have code calling
> tsc_early_random().  We're better off having the code call
> get_random_bytes() and/or get_random_u32(), and having these systems
> use RDRAND if available, and if not, falling back to
> tsc_early_random() and then mixing it with whatever unpredictability
> we may have been able to gather so far if the CRNG hasn't been
> initialized yet.

I agree. This is not about systems which have RDRAND. We want to support
systems which do not have it, and there the TSC magic comes in handy.
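
A minimal userspace sketch of the fallback order described in the quoted
paragraph: try a hardware RNG first, fall back to a timing-based value, and
mix the result with whatever state has been gathered so far. Everything
here is an assumption for illustration: the hook types, struct early_pool,
early_get_random_u32() and the stub sources are hypothetical names, the
mixer is a toy and not a CRNG, and unlike the proposal it mixes on every
call instead of only while the CRNG is uninitialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source hooks; none of these names exist in the kernel. */
typedef bool (*hw_rng_fn)(uint32_t *out);   /* e.g. an RDRAND wrapper */
typedef uint32_t (*jitter_fn)(void);        /* e.g. a TSC-based fallback */

struct early_pool {
	uint32_t state;   /* unpredictability gathered so far */
};

/* Toy mixer for illustration only; it is NOT cryptographically secure. */
static uint32_t mix32(uint32_t state, uint32_t sample)
{
	state = (state ^ sample) * 0x9E3779B1u;
	return state ^ (state >> 15);
}

/* Prefer the hardware RNG; use timing jitter if it is absent or fails. */
uint32_t early_get_random_u32(struct early_pool *pool,
			      hw_rng_fn hw_rng, jitter_fn jitter)
{
	uint32_t sample;

	assert(pool != NULL && jitter != NULL);
	if (hw_rng == NULL || !hw_rng(&sample))
		sample = jitter();

	/* Mix rather than replace, so a weak sample cannot erase prior state. */
	pool->state = mix32(pool->state, sample);
	return pool->state;
}

/* Stand-in sources so the sketch can be exercised in userspace. */
static unsigned int jitter_calls;

static bool stub_hw_rng_ok(uint32_t *out)
{
	*out = 0x12345678u;
	return true;
}

static bool stub_hw_rng_absent(uint32_t *out)
{
	(void)out;
	return false;
}

static uint32_t stub_jitter(void)
{
	jitter_calls++;
	return 0xdeadbeefu;
}
```

With stub_hw_rng_ok the jitter hook is never called; with
stub_hw_rng_absent or a NULL hardware hook every call goes through
stub_jitter.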

> That way something like tsc_early_random() can help, but it can't make
> things worse than what we have today (excepting the performance delay
> caused by adding whatever random shite that we hope is enough to
> introduce unpredictability to the TSC --- for which I still remain
> very skeptical).

I just reran tests in the early boot code (interrupts disabled, no NMIs
...) with the TSC/wbinvd voodoo on several generations of machines and
stored 4M random values in a big static array. Reading it out after boot
and running it through dieharder makes me pretty confident that we observe
real random noise coming from the internals of the microarch/pipelines/bus
interactions.
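
The collect-then-analyze workflow can be sketched in userspace as follows.
This is not the test code used for the measurements above, which is not
shown in this mail. The names collect_samples(), dump_samples() and
placeholder_sample() are hypothetical, the array is far smaller than 4M
entries, and the placeholder source is a plain counter that must be
replaced by the routine under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NSAMPLES 4096   /* the mail used 4M values; kept small here */

typedef uint32_t (*sample_fn)(void);

/* Static storage, so nothing needs to be allocated while sampling. */
static uint32_t samples[NSAMPLES];

/* Placeholder source: a plain counter, NOT random. Swap in the real one. */
static uint32_t placeholder_sample(void)
{
	static uint32_t n;

	return n++;
}

/* Fill buf with n values from the source; returns the number stored. */
size_t collect_samples(uint32_t *buf, size_t n, sample_fn next)
{
	size_t i;

	assert(buf != NULL && next != NULL);
	for (i = 0; i < n; i++)
		buf[i] = next();
	return i;
}

/* Write the raw 32-bit values to an open binary stream. */
size_t dump_samples(FILE *out, const uint32_t *buf, size_t n)
{
	return fwrite(buf, sizeof(buf[0]), n, out);
}
```

Writing the values as raw binary keeps the dump step trivial; a statistical
test suite such as dieharder can then read the file after the fact.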

> P.S.  As I recall hpa@ has talked to some Intel architects internally
> about how much unpredictability we could really get, and how much of
> it is just because there's complex state that we can't see (which if
> we could see, might make it much more predictable), and as I recall

Right, there is complex state which is not completely synchronous even if
all frequencies are derived from a single 24MHz oscillator. The PWMs, the
memory access characteristics and quite a few other sources of
asynchronicity allow us to utilize that, and I'm pretty sure that you
can't find two systems which expose exactly the same behaviour.

> they didn't say anything definitively; but they were nervous.  I'm

Sure, they are always nervous when you ask them questions about the
internals of their chips, especially when you expect authoritative answers.

> pretty sure that for Intel architects, the right answer from their
> perspective is to use RDRAND, and not to play games with the TSC.
> 
> The other thing to note here is that because Intel has RDRAND, I'm
> actually not that worried about Intel; all of the
> drivers/char/random.c will mix in inputs from RDRAND or RDSEED if
> available.  I'm actually much more worried about architectures that
> don't have a hardware random number generator (e.g., some ARM
> subarchitectures and MIPS).  So while you might be able to come up
> with something that could work on x86, the real question is it safely
> generalizable to other, non-x86 architectures.  And that's where it
> gets much more scary.

You have similar asynchronous behaviour there, especially when you take RTCs
into account. But also the stupid mechanism of utilizing a timer/counter
vs. a constant number of execution cycles mixed with a cacheflush will
expose random behaviour. You need to decide which bit of the counter to
use, but that merely depends on the frequency ratio of the CPU core vs. the
counter.
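
The mail ties the choice of bit to the core/counter frequency ratio. The
sketch below substitutes a different, empirical check: when a counter
advances in fixed steps relative to the core clock, its low bits never
change, so the lowest bit that toggles between consecutive readings is the
first candidate worth testing. Both helper names are hypothetical, and a
toggling bit is only a starting point, not proof that the bit is noisy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the lowest bit that differs between any two
 * consecutive readings, or -1 if the readings never change.
 */
int lowest_toggling_bit(const uint64_t *readings, size_t n)
{
	uint64_t changed = 0;
	int bit = 0;

	assert(readings != NULL);
	for (size_t i = 1; i < n; i++)
		changed |= readings[i - 1] ^ readings[i];

	if (changed == 0)
		return -1;

	while (!(changed & 1)) {
		changed >>= 1;
		bit++;
	}
	return bit;
}

/* Extract the chosen bit from one counter reading. */
unsigned int counter_bit(uint64_t reading, int bit)
{
	return (unsigned int)((reading >> bit) & 1u);
}
```

For readings that advance in multiples of 8, such as 8, 16, 24, 40, bits 0
to 2 are stuck and lowest_toggling_bit() reports bit 3.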

It would probably be worthwhile to code up a generic version of the
memcmp/readcounter/memset/cacheflush loop and expose it to a variety of
platforms to figure out how that behaves. We won't get authoritative answers
from the silicon dudes, but my (not so limited) understanding of the
hardware internals makes me pretty confident that there is enough noise
caused by clock interference and domain interaction available on all modern
CPUs.
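
A rough userspace sketch of what such a generic loop could look like. It is
an illustration under stated assumptions, not the code that was tested:
read_counter() uses C11 timespec_get() as a stand-in for a cycle counter,
the cache flush is a caller-supplied hook because portable C has no cache
flush primitive, and jitter_sample_u32(), WORK_BYTES and
counting_noop_flush() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define WORK_BYTES 4096

/* Platform hook; a real port would flush the lines covering addr..addr+len. */
typedef void (*cache_flush_fn)(void *addr, size_t len);

static unsigned char buf_a[WORK_BYTES], buf_b[WORK_BYTES];

/* Stand-in counter: wall-clock nanoseconds instead of a cycle counter. */
static uint64_t read_counter(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Gather 32 bits, taking one chosen counter bit per loop iteration. */
uint32_t jitter_sample_u32(cache_flush_fn flush, int bit)
{
	uint32_t out = 0;

	assert(bit >= 0 && bit < 64);
	for (int i = 0; i < 32; i++) {
		/* memcmp: a fixed amount of work; volatile keeps the call. */
		volatile int sink = memcmp(buf_a, buf_b, WORK_BYTES);
		/* readcounter */
		uint64_t t = read_counter();

		/* memset: dirty the buffer with counter-dependent data. */
		memset(buf_a, (int)(t & 0xff), WORK_BYTES);
		/* cacheflush: force the next pass to go out to memory. */
		if (flush != NULL)
			flush(buf_a, WORK_BYTES);

		out = (out << 1) | (uint32_t)((t >> bit) & 1u);
		(void)sink;
	}
	return out;
}

/* No-op flush that only counts calls, for exercising the hook. */
static unsigned int flush_calls;

static void counting_noop_flush(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	flush_calls++;
}
```

Passing the flush as a hook keeps the loop identical across platforms, so
only the counter read and the flush need a per-architecture implementation.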

Thanks,

	tglx