Message-ID: <lsq.1591725832.911166893@decadent.org.uk>
Date:   Tue, 09 Jun 2020 19:04:51 +0100
From:   Ben Hutchings <ben@...adent.org.uk>
To:     linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC:     akpm@...ux-foundation.org, Denis Kirjanov <kda@...ux-powerpc.org>,
        "Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
        "Theodore Ts'o" <tytso@....edu>,
        "Jason A. Donenfeld" <Jason@...c4.com>
Subject: [PATCH 3.16 60/61] random: always use batched entropy for
 get_random_u{32,64}

3.16.85-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: "Jason A. Donenfeld" <Jason@...c4.com>

commit 69efea712f5b0489e67d07565aad5c94e09a3e52 upstream.

It turns out that RDRAND is pretty slow. Comparing these two
constructions:

  for (i = 0; i < CHACHA_BLOCK_SIZE; i += sizeof(ret))
    arch_get_random_long(&ret);

and

  long buf[CHACHA_BLOCK_SIZE / sizeof(long)];
  extract_crng((u8 *)buf);

it amortizes out to 352 cycles per long for the top one and 107 cycles
per long for the bottom one, on a Coffee Lake Refresh (Intel Core i9-9880H).
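
For context, numbers like these can be gathered with a rough in-kernel
harness along the lines below. This is a hypothetical sketch, not the
harness actually used: it assumes get_cycles() for timestamps and omits
the repeated runs a real measurement would average over.

  unsigned long buf[CHACHA_BLOCK_SIZE / sizeof(long)];
  unsigned long ret;
  cycles_t t0, t1;
  int i;

  /* Time the RDRAND loop, one long at a time. */
  t0 = get_cycles();
  for (i = 0; i < CHACHA_BLOCK_SIZE; i += sizeof(ret))
  	arch_get_random_long(&ret);
  t1 = get_cycles();
  pr_info("rdrand loop: %lu cycles/long\n",
  	(unsigned long)(t1 - t0) / (CHACHA_BLOCK_SIZE / sizeof(long)));

  /* Time one ChaCha20 block filling the whole buffer at once. */
  t0 = get_cycles();
  extract_crng((u8 *)buf);
  t1 = get_cycles();
  pr_info("chacha block: %lu cycles/long\n",
  	(unsigned long)(t1 - t0) / (CHACHA_BLOCK_SIZE / sizeof(long)));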

And importantly, the top one has the drawback of not benefiting from the
real rng, whereas the bottom one has all the nice benefits of using our
own chacha rng. As get_random_u{32,64} gets used in more places (perhaps
beyond what it was originally intended for when it was introduced as
get_random_{int,long} back in the md5 monstrosity era), it seems like it
might be a good thing to strengthen its posture a tiny bit. Doing this
should only make it stronger, not weaker, because the ChaCha pool is
already initialized with a bunch of RDRAND data (when available). This way, we
get the benefits of the hardware rng as well as our own rng.
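
For reference, the batched path these helpers take upstream looks
roughly like the sketch below. This is simplified for illustration: the
per-CPU batch lock, the reseed-generation check and the unseeded-use
warning are all elided, and the struct is cut down to the u32 case; the
real code is in drivers/char/random.c upstream.

  struct batched_entropy {
  	u32 entropy_u32[CHACHA_BLOCK_SIZE / sizeof(u32)];
  	unsigned int position;
  };
  static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u32);

  u32 get_random_u32(void)
  {
  	struct batched_entropy *batch;
  	u32 ret;

  	batch = raw_cpu_ptr(&batched_entropy_u32);
  	/* Refill the per-CPU batch from the ChaCha pool when drained. */
  	if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0) {
  		extract_crng((u8 *)batch->entropy_u32);
  		batch->position = 0;
  	}
  	ret = batch->entropy_u32[batch->position++];
  	return ret;
  }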

Another benefit of this is that we no longer hit the pitfalls of the
recent stream of AMD bugs in RDRAND. One often-used code pattern for various
things is:

  do {
  	val = get_random_u32();
  } while (hash_table_contains_key(val));

That recent AMD bug rendered that pattern useless, whereas we can be
quite certain that ChaCha20 output will give well-distributed numbers,
no matter what.
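
The failure mode behind that: the buggy parts kept signalling success
(carry flag set) while returning a constant value, reportedly all-ones,
so once that constant lands in the table the loop above spins forever.
Below is a minimal userspace model of the problem; broken_rdrand() is a
hypothetical stand-in for the misbehaving instruction, and the retry
cap exists only so the demo terminates:

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Model of the reported failure: "success", but a constant result. */
  static bool broken_rdrand(uint32_t *v)
  {
  	*v = 0xffffffff;
  	return true;
  }

  int main(void)
  {
  	uint32_t val;
  	int tries = 0;

  	/* The hash-table check degenerates to comparing against the one
  	 * value the broken source can produce, so every retry fails. */
  	do {
  		broken_rdrand(&val);
  	} while (val == 0xffffffff && ++tries < 1000000);

  	printf("gave up after %d retries, val=0x%08x\n", tries, val);
  	return 0;
  }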

So, this simplification seems better both from a security perspective
and from a performance perspective.

Signed-off-by: Jason A. Donenfeld <Jason@...c4.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Link: https://lore.kernel.org/r/20200221201037.30231-1-Jason@zx2c4.com
Signed-off-by: Theodore Ts'o <tytso@....edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
[bwh: Backported to 3.16: Only get_random_int() exists here]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
 drivers/char/random.c | 6 ------
 1 file changed, 6 deletions(-)

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1700,9 +1700,6 @@ unsigned int get_random_int(void)
 	__u32 *hash;
 	unsigned int ret;
 
-	if (arch_get_random_int(&ret))
-		return ret;
-
 	hash = get_cpu_var(get_random_int_hash);
 
 	hash[0] += current->pid + jiffies + random_get_entropy();
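
For reference, with the hunk applied the 3.16 function reduces to the
MD5-based per-CPU path sketched below (reconstructed from the 3.16
source around this hunk for illustration; the tree itself is
authoritative):

  unsigned int get_random_int(void)
  {
  	__u32 *hash;
  	unsigned int ret;

  	hash = get_cpu_var(get_random_int_hash);

  	/* Mix in a little per-call state, then run MD5 over the pool. */
  	hash[0] += current->pid + jiffies + random_get_entropy();
  	md5_transform(hash, random_int_secret);
  	ret = hash[0];
  	put_cpu_var(get_random_int_hash);

  	return ret;
  }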
