Message-ID: <alpine.DEB.2.21.1909290724160.2636@nanos.tec.linutronix.de>
Date: Sun, 29 Sep 2019 09:40:13 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: "Ahmed S. Darwish" <darwish.07@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Theodore Ts'o <tytso@....edu>,
Nicholas Mc Guire <hofrat@...ntech.at>,
the arch/x86 maintainers <x86@...nel.org>,
Andy Lutomirski <luto@...nel.org>,
Kees Cook <keescook@...omium.org>
Subject: Re: x86/random: Speculation to the rescue
On Sat, 28 Sep 2019, Linus Torvalds wrote:
> On Sat, Sep 28, 2019 at 3:24 PM Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > Nicholas presented the idea to (ab)use speculative execution for random
> > number generation years ago at the Real-Time Linux Workshop:
>
> What you describe is just a particularly simple version of the jitter
> entropy. Not very reliable.
I know that jitter entropy is not the most reliable source. Though on the one
hand it's better than no entropy, and on the other hand the results are
interesting.
I just looked at the test I ran overnight on one of my machines, which did
an aggregate test over 1024 runs both on the hacked-up thing and on the
veritable /dev/random. The overall difference is within the noise.
So I'm not trying to say that we should rely solely on that, but I think
it's something we should not dismiss upfront just because paranoid theory
says that jitter entropy is not reliable enough.
The point is that jitter entropy relies, like any other entropy source, on
the quality of the observed data. The non-deterministic behaviour of
speculative execution seems to yield quite good input.
There is another fun thing which I found while analyzing the crappy runtime
behaviour of a customer application:
static volatile unsigned int foo;
t0 = rdtscp();
for (i = 0; i < 10000; i++)
        foo++;
/* rdtscp() or the fenced rdtsc() gives better results */
t1 = rdtscp();
Even with interrupts disabled and no NMI disturbing the loop, the resulting
execution time varies depending on the microarchitecture:
tmin < t < N * tmin
where N is >= 2 on all Intel machines which I tested. Feeding bit 0 of t1
mixed with t0 into the LFSR gives results just as good as the hacky loop I
used in the patch. Fun, isn't it?
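For illustration only, a minimal userspace sketch of that kind of mixing
could look like the below. The __rdtscp() intrinsic, the toy 32-bit LFSR
with taps 32/22/2/1 and XORing the two timestamps are just assumptions for
the example, not the code from the patch or the customer application:

#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>          /* __rdtscp() */

/* Toy 32-bit Fibonacci LFSR, taps 32/22/2/1; not the mixer from the patch */
static uint32_t lfsr = 0xdeadbeef;

static void lfsr_feed(uint32_t bit)
{
        uint32_t fb = (lfsr ^ (lfsr >> 10) ^ (lfsr >> 30) ^ (lfsr >> 31)) & 1;

        lfsr = (lfsr >> 1) | ((fb ^ (bit & 1)) << 31);
}

int main(void)
{
        static volatile unsigned int foo;
        unsigned int aux;
        int n, i;

        for (n = 0; n < 32; n++) {
                uint64_t t0, t1;

                t0 = __rdtscp(&aux);
                for (i = 0; i < 10000; i++)
                        foo++;
                t1 = __rdtscp(&aux);

                /* bit 0 of t1, mixed (XORed) with t0 */
                lfsr_feed((uint32_t)((t1 ^ t0) & 1));
        }
        printf("lfsr: %08x\n", lfsr);
        return 0;
}

Obviously in real life the delta would go into the entropy pool via the
proper mixing functions; the sketch only shows the shape of the measurement.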
> Whatever. I'm entirely convinced this won't make everybody happy
> anyway, but it's _one_ approach to handle the issue.
>
> Ahmed - would you be willing to test this on your problem case (with
> the ext4 optimization re-enabled, of course)?
>
> And Thomas - mind double-checking that I didn't do anything
> questionable with the timer code..
Looks about right.
> And this goes without saying - this patch is ENTIRELY untested. Apart
> from making people upset for the lack of rigor, it might do
> unspeakable crimes against your pets. You have been warned.
As I'm about to vanish for a two week vacation, I'm going to shut up now and
hope that the experiment gave at least some food for thought.
Thanks,
tglx