Message-Id: <1446906642-19372-4-git-send-email-sandyinchina@gmail.com>
Date:	Sat,  7 Nov 2015 09:30:39 -0500
From:	Sandy Harris <sandyinchina@...il.com>
To:	"Theodore Ts'o" <tytso@....edu>,
	Jason Cooper <jason@...edaemon.net>,
	"H. Peter Anvin" <hpa@...or.com>, John Denker <jsd@...n.com>
Cc:	linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org
Subject: [PATCH 4/7] Different version of driver using hash from AES-GCM; compiled if CONFIG_RANDOM_GCM=y

Signed-off-by: Sandy Harris <sandyinchina@...il.com>
---
 drivers/char/random_gcm.c | 3716 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 3716 insertions(+)
 create mode 100644 drivers/char/random_gcm.c

diff --git a/drivers/char/random_gcm.c b/drivers/char/random_gcm.c
new file mode 100644
index 0000000..360fbe3
--- /dev/null
+++ b/drivers/char/random_gcm.c
@@ -0,0 +1,3716 @@
+/*
+ * random.c -- A strong random number generator
+ *
+ * Copyright Matt Mackall <mpm@...enic.com>, 2003, 2004, 2005
+ *
+ * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998, 1999.  All
+ * rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, and the entire permission notice in its entirety,
+ *    including the disclaimer of warranties.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote
+ *    products derived from this software without specific prior
+ *    written permission.
+ *
+ * ALTERNATIVELY, this product may be distributed under the terms of
+ * the GNU General Public License, in which case the provisions of the GPL are
+ * required INSTEAD OF the above restrictions.  (This clause is
+ * necessary due to a potential bad interaction between the GPL and
+ * the restrictions contained in a BSD-style copyright.)
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
+ * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
+ * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ */
+
+/*
+ * (now, with legal B.S. out of the way.....)
+ *
+ * This routine gathers environmental noise from device drivers, etc.,
+ * and returns good random numbers, suitable for cryptographic use.
+ * Besides the obvious cryptographic uses, these numbers are also good
+ * for seeding TCP sequence numbers, and other places where it is
+ * desirable to have numbers which are not only random, but hard to
+ * predict by an attacker.
+ *
+ * Theory of operation
+ * ===================
+ *
+ * Computers are very predictable devices.  Hence it is extremely hard
+ * to produce truly random numbers on a computer --- as opposed to
+ * pseudo-random numbers, which can easily be generated by using an
+ * algorithm.  Unfortunately, it is very easy for attackers to guess
+ * the sequence of pseudo-random number generators, and for some
+ * applications this is not acceptable.  So instead, we must try to
+ * gather "environmental noise" from the computer's environment, which
+ * must be hard for outside attackers to observe, and use that to
+ * generate random numbers.  In a Unix environment, this is best done
+ * from inside the kernel.
+ *
+ * Sources of randomness from the environment include inter-keyboard
+ * timings, inter-interrupt timings from some interrupts, and other
+ * events which are both (a) non-deterministic and (b) hard for an
+ * outside observer to measure.  Randomness from these sources is
+ * added to an "entropy pool", which is mixed using a CRC-like function.
+ * This is not cryptographically strong, but it is adequate assuming
+ * the randomness is not chosen maliciously, and it is fast enough that
+ * the overhead of doing it on every interrupt is very reasonable.
+ * As random bytes are mixed into the entropy pool, the routines keep
+ * an *estimate* of how many bits of randomness have been stored into
+ * the random number generator's internal state.
+ *
+ * When random bytes are desired, they are obtained by taking a
+ * cryptographic hash (here, the 128-bit hash from AES-GCM) of the
+ * contents of the "entropy pool".  The hash avoids exposing the
+ * internal state of the entropy pool.  It is believed to be
+ * computationally infeasible to derive any useful information about
+ * the input of the hash from its output.  Even if it is possible to
+ * analyze the hash in some clever way, as long as the amount of data
+ * returned from the generator is less than the inherent entropy in
+ * the pool, the output data is totally unpredictable.  For this
+ * reason, the routine decreases its internal estimate of how many
+ * bits of "true randomness" are contained in the entropy pool as it
+ * outputs random numbers.
+ *
+ * If this estimate goes to zero, the routine can still generate
+ * random numbers; however, an attacker may (at least in theory) be
+ * able to infer the future output of the generator from prior
+ * outputs.  This requires successful cryptanalysis of the hash,
+ * which is not believed to be feasible, but there is a remote
+ * possibility.
+ * Nonetheless, these numbers should be useful for the vast majority
+ * of purposes.
+ *
+ * Exported interfaces ---- output
+ * ===============================
+ *
+ * There are three exported interfaces; the first is designed to
+ * be used from within the kernel:
+ *
+ * 	void get_random_bytes(void *buf, int nbytes);
+ *
+ * This interface will return the requested number of random bytes,
+ * and place them in the requested buffer.
+ *
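+ * For instance, kernel code needing a fresh 128-bit key could call
+ * it as in this minimal sketch (the 16-byte size is purely
+ * illustrative):
+ *
+ *	u8 key[16];
+ *	get_random_bytes(key, sizeof(key));
+ *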
+ * The two other interfaces are two character devices /dev/random and
+ * /dev/urandom.  /dev/random is suitable for use when very high
+ * quality randomness is desired (for example, for key generation or
+ * one-time pads), as it will only return a maximum of the number of
+ * bits of randomness (as estimated by the random number generator)
+ * contained in the entropy pool.
+ *
+ * The /dev/urandom device does not have this limit, and will return
+ * as many bytes as are requested.  As more and more random bytes are
+ * requested without giving time for the entropy pool to recharge,
+ * this will result in random numbers that are merely cryptographically
+ * strong.  For many applications, however, this is acceptable.
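+ *
+ * A user-space client simply opens and reads the device; a minimal
+ * sketch (buffer size illustrative, error handling omitted):
+ *
+ *	int fd = open("/dev/urandom", O_RDONLY);
+ *	unsigned char buf[32];
+ *	ssize_t n = read(fd, buf, sizeof(buf));
+ *	close(fd);
+ *
+ * Note that a read may return fewer bytes than requested, so careful
+ * callers loop until the buffer is full.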
+ *
+ * Exported interfaces ---- input
+ * ==============================
+ *
+ * The current exported interfaces for gathering environmental noise
+ * from the devices are:
+ *
+ *	void add_device_randomness(const void *buf, unsigned int size);
+ * 	void add_input_randomness(unsigned int type, unsigned int code,
+ *                                unsigned int value);
+ *	void add_interrupt_randomness(int irq, int irq_flags);
+ * 	void add_disk_randomness(struct gendisk *disk);
+ *
+ * add_device_randomness() is for adding data to the random pool that
+ * is likely to differ between two devices (or possibly even per boot).
+ * This would be things like MAC addresses or serial numbers, or the
+ * read-out of the RTC. This does *not* add any actual entropy to the
+ * pool, but it initializes the pool to different values for devices
+ * that might otherwise be identical and have very little entropy
+ * available to them (particularly common in the embedded world).
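+ *
+ * For example, a network driver might feed in its hardware address
+ * at probe time; a minimal sketch (the "dev" pointer is
+ * illustrative):
+ *
+ *	add_device_randomness(dev->dev_addr, ETH_ALEN);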
+ *
+ * add_input_randomness() uses the input layer interrupt timing, as well as
+ * the event type information from the hardware.
+ *
+ * add_interrupt_randomness() uses the interrupt timing as random
+ * inputs to the entropy pool. Using the cycle counters and the irq source
+ * as inputs, it feeds the randomness roughly once a second.
+ *
+ * add_disk_randomness() uses what amounts to the seek time of block
+ * layer request events, on a per-disk_devt basis, as input to the
+ * entropy pool. Note that high-speed solid state drives with very low
+ * seek times do not make for good sources of entropy, as their seek
+ * times are usually fairly consistent.
+ *
+ * All of these routines try to estimate how many bits of randomness
+ * a particular randomness source provides.  They do this by keeping
+ * track of the first and second order deltas of the event timings.
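+ *
+ * As a worked illustration: if successive events arrive at times 100,
+ * 103 and 109, the first-order deltas are 3 and 6 and the
+ * second-order delta is 3.  The estimator takes the smallest absolute
+ * delta (3), rounds it down by one bit, and so credits
+ * fls(3 >> 1) = 1 bit for the last event.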
+ *
+ * Ensuring unpredictability at system startup
+ * ============================================
+ *
+ * When any operating system starts up, it will go through a sequence
+ * of actions that are fairly predictable by an adversary, especially
+ * if the start-up does not involve interaction with a human operator.
+ * This reduces the actual number of bits of unpredictability in the
+ * entropy pool below the value in entropy_count.  In order to
+ * counteract this effect, it helps to carry information in the
+ * entropy pool across shut-downs and start-ups.  To do this, put the
+ * following lines an appropriate script which is run during the boot
+ * sequence:
+ *
+ *	echo "Initializing random number generator..."
+ *	random_seed=/var/run/random-seed
+ *	# Carry a random seed from start-up to start-up
+ *	# Load and then save the whole entropy pool
+ *	if [ -f $random_seed ]; then
+ *		cat $random_seed >/dev/urandom
+ *	else
+ *		touch $random_seed
+ *	fi
+ *	chmod 600 $random_seed
+ *	dd if=/dev/urandom of=$random_seed count=1 bs=512
+ *
+ * and the following lines in an appropriate script which is run as
+ * the system is shutdown:
+ *
+ *	# Carry a random seed from shut-down to start-up
+ *	# Save the whole entropy pool
+ *	echo "Saving random seed..."
+ *	random_seed=/var/run/random-seed
+ *	touch $random_seed
+ *	chmod 600 $random_seed
+ *	dd if=/dev/urandom of=$random_seed count=1 bs=512
+ *
+ * For example, on most modern systems using the System V init
+ * scripts, such code fragments would be found in
+ * /etc/rc.d/init.d/random.  On older Linux systems, the correct script
+ * location might be in /etc/rc.d/rc.local or /etc/rc.d/rc.0.
+ *
+ * Effectively, these commands cause the contents of the entropy pool
+ * to be saved at shut-down time and reloaded into the entropy pool at
+ * start-up.  (The 'dd' in the addition to the bootup script is to
+ * make sure that /etc/random-seed is different for every start-up,
+ * even if the system crashes without executing rc.0.)  Even with
+ * complete knowledge of the start-up activities, predicting the state
+ * of the entropy pool requires knowledge of the previous history of
+ * the system.
+ *
+ * Configuring the /dev/random driver under Linux
+ * ==============================================
+ *
+ * The /dev/random driver under Linux uses minor numbers 8 and 9 of
+ * the /dev/mem major number (#1).  So if your system does not have
+ * /dev/random and /dev/urandom created already, they can be created
+ * by using the commands:
+ *
+ * 	mknod /dev/random c 1 8
+ * 	mknod /dev/urandom c 1 9
+ *
+ * Acknowledgements:
+ * =================
+ *
+ * Ideas for constructing this random number generator were derived
+ * from Pretty Good Privacy's random number generator, and from private
+ * discussions with Phil Karn.  Colin Plumb provided a faster random
+ * number generator, which sped up the mixing function of the entropy
+ * pool, taken from PGPfone.  Dale Worley has also contributed many
+ * useful ideas and suggestions to improve this driver.
+ *
+ * Any flaws in the design are solely my responsibility, and should
+ * not be attributed to Phil, Colin, or any of the authors of PGP.
+ *
+ * Further background information on this topic may be obtained from
+ * RFC 4086, "Randomness Requirements for Security", by Donald
+ * Eastlake, Steve Crocker, and Jeff Schiller.
+ */
+
+#include <linux/utsname.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/major.h>
+#include <linux/string.h>
+#include <linux/fcntl.h>
+#include <linux/slab.h>
+#include <linux/random.h>
+#include <linux/poll.h>
+#include <linux/init.h>
+#include <linux/fs.h>
+#include <linux/genhd.h>
+#include <linux/interrupt.h>
+#include <linux/mm.h>
+#include <linux/spinlock.h>
+#include <linux/kthread.h>
+#include <linux/percpu.h>
+#include <linux/cryptohash.h>
+#include <linux/fips.h>
+#include <linux/ptrace.h>
+#include <linux/kmemcheck.h>
+#include <linux/workqueue.h>
+#include <linux/irq.h>
+#include <linux/syscalls.h>
+#include <linux/completion.h>
+
+#include <asm/processor.h>
+#include <asm/uaccess.h>
+#include <asm/irq.h>
+#include <asm/irq_regs.h>
+#include <asm/io.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/random.h>
+
+/* #define ADD_INTERRUPT_BENCH */
+
+#ifndef CONFIG_RANDOM_INIT
+#error This version needs CONFIG_RANDOM_INIT
+#endif
+#ifndef CONFIG_RANDOM_GCM
+#error This version should not be compiled if CONFIG_RANDOM_GCM is not set
+#endif
+
+/*
+ * Configuration information
+ */
+
+#include <generated/random_init.h>
+
+#define EXTRACT_SIZE		16	/* 128-bit GCM hash */
+#define SEC_XFER_SIZE		512
+#define DEBUG_RANDOM_BOOT 0
+
+#define LONGS(x) (((x) + sizeof(unsigned long) - 1)/sizeof(unsigned long))
+
+/*
+ * To allow fractional bits to be tracked, the entropy_count field is
+ * denominated in units of 1/8th bits.
+ *
+ * 2*(ENTROPY_SHIFT + log2(poolbits)) must be <= 31, or the multiply in
+ * credit_entropy_bits() needs to be 64 bits wide.
+ */
+#define ENTROPY_SHIFT 3
+#define ENTROPY_BITS(r) ((r)->entropy_count >> ENTROPY_SHIFT)
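+
+/*
+ * For example, with ENTROPY_SHIFT == 3, crediting one full bit adds
+ * 1 << 3 == 8 to entropy_count, and ENTROPY_BITS() shifts the count
+ * back down to whole bits, discarding any fraction.
+ */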
+
+/* sanity checks */
+
+#if( (ENTROPY_SHIFT+INPUT_POOL_SHIFT) >= 16)
+#error *_SHIFT values problematic for credit_entropy_bits()
+#endif
+
+#if( (INPUT_POOL_WORDS%16) || (OUTPUT_POOL_WORDS%16) )
+#error Pool size not divisible by 16, which code assumes
+#endif
+
+#if( INPUT_POOL_WORDS < 32 )
+#error Input pool less than a quarter of default size
+#endif
+
+#if( INPUT_POOL_WORDS < OUTPUT_POOL_WORDS )
+#error Strange configuration, input pool smaller than output
+#endif
+
+/*
+ * The minimum number of bits of entropy before we wake up a read on
+ * /dev/random.  Should be enough to do a significant reseed.
+ */
+static int random_read_wakeup_bits = 64;
+
+/*
+ * If the entropy count falls under this number of bits, then we
+ * should wake up processes which are selecting or polling on write
+ * access to /dev/random.
+ */
+static int random_write_wakeup_bits = 28 * OUTPUT_POOL_WORDS;
+
+/*
+ * The minimum number of seconds between urandom pool reseeding.  We
+ * do this to limit the amount of entropy that can be drained from the
+ * input pool even if there are heavy demands on /dev/urandom.
+ */
+static int random_min_urandom_seed = 60;
+
+/*
+ * Originally, we used a primitive polynomial of degree .poolwords
+ * over GF(2).  The taps for various sizes are defined below.  They
+ * were chosen to be evenly spaced except for the last tap, which is 1
+ * to get the twisting happening as fast as possible.
+ *
+ * For the purposes of better mixing, we use the CRC-32 polynomial as
+ * well to make a (modified) twisted Generalized Feedback Shift
+ * Register.  (See M. Matsumoto & Y. Kurita, 1992.  Twisted GFSR
+ * generators.  ACM Transactions on Modeling and Computer Simulation
+ * 2(3):179-194.  Also see M. Matsumoto & Y. Kurita, 1994.  Twisted
+ * GFSR generators II.  ACM Transactions on Modeling and Computer
+ * Simulation 4:254-266)
+ *
+ * Thanks to Colin Plumb for suggesting this.
+ *
+ * The mixing operation is much less sensitive than the output hash,
+ * where we use the 128-bit hash from AES-GCM.  All that we want of
+ * the mixing operation is that it be a good non-cryptographic hash;
+ * i.e. it not produce collisions
+ * when fed "random" data of the sort we expect to see.  As long as
+ * the pool state differs for different inputs, we have preserved the
+ * input entropy and done a good job.  The fact that an intelligent
+ * attacker can construct inputs that will produce controlled
+ * alterations to the pool's state is not important because we don't
+ * consider such inputs to contribute any randomness.  The only
+ * property we need with respect to them is that the attacker can't
+ * increase his/her knowledge of the pool's state.  Since all
+ * additions are reversible (knowing the final state and the input,
+ * you can reconstruct the initial state), if an attacker has any
+ * uncertainty about the initial state, he/she can only shuffle that
+ * uncertainty about, but never cause any collisions (which would
+ * decrease the uncertainty).
+ *
+ * Our mixing functions were analyzed by Lacharme, Roeck, Strubel, and
+ * Videau in their paper, "The Linux Pseudorandom Number Generator
+ * Revisited" (see: http://eprint.iacr.org/2012/251.pdf).  In their
+ * paper, they point out that we are not using a true Twisted GFSR,
+ * since Matsumoto & Kurita used a trinomial feedback polynomial (that
+ * is, with only three taps, instead of the six that we are using).
+ * As a result, the resulting polynomial is neither primitive nor
+ * irreducible, and hence does not have a maximal period over
+ * GF(2**32).  They suggest a slight change to the generator
+ * polynomial which improves the resulting TGFSR polynomial to be
+ * irreducible, which we have made here.
+ */
+static struct poolinfo {
+	int poolbitshift, poolwords, poolbytes, poolbits, poolfracbits;
+#define S(x) ilog2(x)+5, (x), (x)*4, (x)*32, (x) << (ENTROPY_SHIFT+5)
+	int tap1, tap2, tap3, tap4, tap5;
+} poolinfo_table[] = {
+	/* was: x^128 + x^103 + x^76 + x^51 +x^25 + x + 1 */
+	/* x^128 + x^104 + x^76 + x^51 +x^25 + x + 1 */
+	{ S(128),	104,	76,	51,	25,	1 },
+	/* was: x^32 + x^26 + x^20 + x^14 + x^7 + x + 1 */
+	/* x^32 + x^26 + x^19 + x^14 + x^7 + x + 1 */
+	{ S(32),	26,	19,	14,	7,	1 },
+#if 0
+	/* x^2048 + x^1638 + x^1231 + x^819 + x^411 + x + 1  -- 115 */
+	{ S(2048),	1638,	1231,	819,	411,	1 },
+
+	/* x^1024 + x^817 + x^615 + x^412 + x^204 + x + 1 -- 290 */
+	{ S(1024),	817,	615,	412,	204,	1 },
+
+	/* x^1024 + x^819 + x^616 + x^410 + x^207 + x^2 + 1 -- 115 */
+	{ S(1024),	819,	616,	410,	207,	2 },
+
+	/* x^512 + x^411 + x^308 + x^208 + x^104 + x + 1 -- 225 */
+	{ S(512),	411,	308,	208,	104,	1 },
+
+	/* x^512 + x^409 + x^307 + x^206 + x^102 + x^2 + 1 -- 95 */
+	{ S(512),	409,	307,	206,	102,	2 },
+	/* x^512 + x^409 + x^309 + x^205 + x^103 + x^2 + 1 -- 95 */
+	{ S(512),	409,	309,	205,	103,	2 },
+
+	/* x^256 + x^205 + x^155 + x^101 + x^52 + x + 1 -- 125 */
+	{ S(256),	205,	155,	101,	52,	1 },
+
+	/* x^128 + x^103 + x^78 + x^51 + x^27 + x^2 + 1 -- 70 */
+	{ S(128),	103,	78,	51,	27,	2 },
+
+	/* x^64 + x^52 + x^39 + x^26 + x^14 + x + 1 -- 15 */
+	{ S(64),	52,	39,	26,	14,	1 },
+#endif
+};
+
+/*
+ * Static global variables
+ */
+static DECLARE_WAIT_QUEUE_HEAD(random_read_wait);
+static DECLARE_WAIT_QUEUE_HEAD(random_write_wait);
+static DECLARE_WAIT_QUEUE_HEAD(urandom_init_wait);
+static struct fasync_struct *fasync;
+
+static DEFINE_SPINLOCK(random_ready_list_lock);
+static LIST_HEAD(random_ready_list);
+
+/**********************************************************************
+ *
+ * OS independent entropy store.   Here are the functions which handle
+ * storing entropy in an entropy pool.
+ *
+ **********************************************************************/
+
+struct entropy_store;
+struct entropy_store {
+	/* read-only data: */
+	const struct poolinfo *poolinfo;
+	__u32 *pool;
+	const char *name;
+	struct entropy_store *pull;
+	struct work_struct push_work;
+
+	/* read-write data: */
+	unsigned long last_pulled;
+	spinlock_t lock;
+	unsigned short add_ptr;
+	unsigned short input_rotate;
+	int entropy_count;
+	int entropy_total;
+	unsigned int initialized:1;
+	unsigned int limit:1;
+	unsigned int last_data_init:1;
+	__u8 last_data[EXTRACT_SIZE];
+	u32 *A, *B, which, count ;
+	u32 *p, *q, *end, size ;
+};
+
+static void push_to_pool(struct work_struct *work);
+
+static struct entropy_store input_pool = {
+	.poolinfo = &poolinfo_table[0],
+	.name = "input",
+	.limit = 1,
+	.lock = __SPIN_LOCK_UNLOCKED(input_pool.lock),
+	.pool = pools,
+	.A = constants,
+	.B = constants+4,
+	.which = 0,
+	.count = 0,
+	.size = INPUT_POOL_WORDS,
+	.p = pools,
+	.q = pools + (INPUT_POOL_WORDS/2),
+	.end = pools + INPUT_POOL_WORDS
+};
+
+static struct entropy_store blocking_pool = {
+	.poolinfo = &poolinfo_table[1],
+	.name = "blocking",
+	.limit = 1,
+	.pull = &input_pool,
+	.lock = __SPIN_LOCK_UNLOCKED(blocking_pool.lock),
+	.push_work = __WORK_INITIALIZER(blocking_pool.push_work,
+					push_to_pool),
+	.pool = pools + INPUT_POOL_WORDS,
+	.A = constants+8,
+	.B = constants+12,
+	.which = 0,
+	.count = 0,
+	.size = OUTPUT_POOL_WORDS,
+	.p = pools + INPUT_POOL_WORDS,
+	.q = pools + INPUT_POOL_WORDS + (OUTPUT_POOL_WORDS/2),
+	.end = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS
+};
+
+static struct entropy_store nonblocking_pool = {
+	.poolinfo = &poolinfo_table[1],
+	.name = "nonblocking",
+	.pull = &input_pool,
+	.lock = __SPIN_LOCK_UNLOCKED(nonblocking_pool.lock),
+	.push_work = __WORK_INITIALIZER(nonblocking_pool.push_work,
+					push_to_pool),
+	.pool = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS,
+	.A = constants+16,
+	.B = constants+20,
+	.which = 0,
+	.count = 0,
+	.size = OUTPUT_POOL_WORDS,
+	.p = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS,
+	.q = pools + INPUT_POOL_WORDS + OUTPUT_POOL_WORDS + (OUTPUT_POOL_WORDS/2),
+	.end = pools + INPUT_POOL_WORDS + (OUTPUT_POOL_WORDS*2)
+};
+
+/* no actual pool; just hash the counter */
+static struct entropy_store dummy_pool = {
+	.poolinfo = &poolinfo_table[1],
+	.name = "dummy",
+	.lock = __SPIN_LOCK_UNLOCKED(dummy_pool.lock),
+	.pool = NULL,
+	.A = constants+24,
+	.B = constants+28,
+	.which = 0,
+	.count = 0,
+	/* should never be used */
+	.size = 0,
+	.p = NULL,
+	.q = NULL,
+	.end = NULL
+};
+
+static int got_hw_rng ;
+
+/*****************************************************************
+ * forward declarations and a few macros
+ *****************************************************************/
+
+static void init_random(void) ;
+
+/* fill an output buffer from a pool */
+static void loop_output( struct entropy_store *, u32 *, u32 ) ;
+
+static void count(void) ;
+static void counter_any(void) ;
+
+/* get 128 bits */
+static int get_or_fail( struct entropy_store *, u32 * ) ;
+static void get128( struct entropy_store *, u32 * ) ;
+static int get_any( u32 * ) ;
+
+/* These functions each do a unidirectional mix
+ * into some data structure. They mix in 128 bits
+ * at a time to give "catastrophic reseeding", and
+ * all zero out the input buffer after use.
+ */
+static void buffer2array( struct entropy_store *, u32 * ) ;
+static void buffer2pool(  struct entropy_store *, u32 * ) ;
+static void buffer2counter( u32 * ) ;
+
+/* hw rng functions */
+static int get_hw_random( u32 * ) ;
+static int load_constants(void) ;
+static int load_input(void) ;
+
+/* mix chunks of data structures in place */
+static void mix_const_p( struct entropy_store * ) ;
+static void mix_const_all(void);
+static void top_mix(void);
+static void big_mix(void);
+
+static void clear_addmul(void);
+
+/* rotate a 32-bit word left n bits */
+#define ROTL(v, n) ( ((v) << (n)) | ((v) >> (32 - (n))) )
+
+/* common case with 128-bit buffer */
+#define zero128( target )	memzero_explicit( (u8 *) target, 16 )
+
+static __u32 const twist_table[8] = {
+	0x00000000, 0x3b6e20c8, 0x76dc4190, 0x4db26158,
+	0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278 };
+
+/*
+ * This function adds bytes into the entropy "pool".  It does not
+ * update the entropy estimate.  The caller should call
+ * credit_entropy_bits if this is appropriate.
+ *
+ * The pool is stirred with a primitive polynomial of the appropriate
+ * degree, and then twisted.  We twist by three bits at a time because
+ * it's cheap to do so and helps slightly in the expected case where
+ * the entropy is concentrated in the low-order bits.
+ */
+static void _mix_pool_bytes(struct entropy_store *r, const void *in,
+			    int nbytes)
+{
+	unsigned long i, tap1, tap2, tap3, tap4, tap5;
+	int input_rotate;
+	int wordmask = r->poolinfo->poolwords - 1;
+	const char *bytes = in;
+	__u32 w;
+
+	tap1 = r->poolinfo->tap1;
+	tap2 = r->poolinfo->tap2;
+	tap3 = r->poolinfo->tap3;
+	tap4 = r->poolinfo->tap4;
+	tap5 = r->poolinfo->tap5;
+
+	input_rotate = r->input_rotate;
+	i = r->add_ptr;
+
+	/* mix one byte at a time to simplify size handling and churn faster */
+	while (nbytes--) {
+		w = rol32(*bytes++, input_rotate);
+		i = (i - 1) & wordmask;
+
+		/* XOR in the various taps */
+		w ^= r->pool[i];
+		w ^= r->pool[(i + tap1) & wordmask];
+		w ^= r->pool[(i + tap2) & wordmask];
+		w ^= r->pool[(i + tap3) & wordmask];
+		w ^= r->pool[(i + tap4) & wordmask];
+		w ^= r->pool[(i + tap5) & wordmask];
+
+		/* Mix the result back in with a twist */
+		r->pool[i] = (w >> 3) ^ twist_table[w & 7];
+
+		/*
+		 * Normally, we add 7 bits of rotation to the pool.
+		 * At the beginning of the pool, add an extra 7 bits
+		 * rotation, so that successive passes spread the
+		 * input bits across the pool evenly.
+		 */
+		input_rotate = (input_rotate + (i ? 7 : 14)) & 31;
+	}
+
+	r->input_rotate = input_rotate;
+	r->add_ptr = i;
+}
+
+static void __mix_pool_bytes(struct entropy_store *r, const void *in,
+			     int nbytes)
+{
+	trace_mix_pool_bytes_nolock(r->name, nbytes, _RET_IP_);
+	_mix_pool_bytes(r, in, nbytes);
+}
+
+static void mix_pool_bytes(struct entropy_store *r, const void *in,
+			   int nbytes)
+{
+	unsigned long flags;
+
+	trace_mix_pool_bytes(r->name, nbytes, _RET_IP_);
+	spin_lock_irqsave(&r->lock, flags);
+	_mix_pool_bytes(r, in, nbytes);
+	spin_unlock_irqrestore(&r->lock, flags);
+}
+
+struct fast_pool {
+	__u32		pool[4];
+	unsigned long	last;
+	unsigned short	reg_idx;
+	unsigned char	count;
+};
+
+/*
+ * This is a fast mixing routine used by the interrupt randomness
+ * collector.  It's hardcoded for a 128-bit pool and assumes that any
+ * locks that might be needed are taken by the caller.
+ */
+static void fast_mix(struct fast_pool *f)
+{
+	__u32 a = f->pool[0],	b = f->pool[1];
+	__u32 c = f->pool[2],	d = f->pool[3];
+
+	a += b;			c += d;
+	b = rol32(b, 6);	d = rol32(d, 27);
+	d ^= a;			b ^= c;
+
+	a += b;			c += d;
+	b = rol32(b, 16);	d = rol32(d, 14);
+	d ^= a;			b ^= c;
+
+	a += b;			c += d;
+	b = rol32(b, 6);	d = rol32(d, 27);
+	d ^= a;			b ^= c;
+
+	a += b;			c += d;
+	b = rol32(b, 16);	d = rol32(d, 14);
+	d ^= a;			b ^= c;
+
+	f->pool[0] = a;  f->pool[1] = b;
+	f->pool[2] = c;  f->pool[3] = d;
+	f->count++;
+}
+
+static void process_random_ready_list(void)
+{
+	unsigned long flags;
+	struct random_ready_callback *rdy, *tmp;
+
+	spin_lock_irqsave(&random_ready_list_lock, flags);
+	list_for_each_entry_safe(rdy, tmp, &random_ready_list, list) {
+		struct module *owner = rdy->owner;
+
+		list_del_init(&rdy->list);
+		rdy->func(rdy);
+		module_put(owner);
+	}
+	spin_unlock_irqrestore(&random_ready_list_lock, flags);
+}
+
+/*
+ * Credit (or debit) the entropy store with n bits of entropy.
+ * Use credit_entropy_bits_safe() if the value comes from userspace
+ * or otherwise should be checked for extreme values.
+ */
+static void credit_entropy_bits(struct entropy_store *r, int nbits)
+{
+	int entropy_count, orig;
+	const int pool_size = r->poolinfo->poolfracbits;
+	int nfrac = nbits << ENTROPY_SHIFT;
+
+	if (!nbits)
+		return;
+
+retry:
+	entropy_count = orig = ACCESS_ONCE(r->entropy_count);
+	if (nfrac < 0) {
+		/* Debit */
+		entropy_count += nfrac;
+	} else {
+		/*
+		 * Credit: we have to account for the possibility of
+		 * overwriting already present entropy.	 Even in the
+		 * ideal case of pure Shannon entropy, new contributions
+		 * approach the full value asymptotically:
+		 *
+		 * entropy <- entropy + (pool_size - entropy) *
+		 *	(1 - exp(-add_entropy/pool_size))
+		 *
+		 * For add_entropy <= pool_size/2 then
+		 * (1 - exp(-add_entropy/pool_size)) >=
+		 *    (add_entropy/pool_size)*0.7869...
+		 * so we can approximate the exponential with
+		 * 3/4*add_entropy/pool_size and still be on the
+		 * safe side by adding at most pool_size/2 at a time.
+		 *
+		 * The use of pool_size-2 in the while statement is to
+		 * prevent rounding artifacts from making the loop
+		 * arbitrarily long; this limits the loop to log2(pool_size)*2
+		 * turns no matter how large nbits is.
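+		 *
+		 * As a worked example (assuming the default 128-word
+		 * input pool, so pool_size = 32768 fractional bits):
+		 * crediting 2048 bits (nfrac = 16384) to an empty
+		 * pool gives, on the first pass, add =
+		 * (32768 * 16384 * 3) >> 17 = 12288, i.e. 1536 bits,
+		 * three quarters of the nominal credit.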
+		 */
+		int pnfrac = nfrac;
+		const int s = r->poolinfo->poolbitshift + ENTROPY_SHIFT + 2;
+		/* The +2 corresponds to the /4 in the denominator */
+
+		do {
+			unsigned int anfrac = min(pnfrac, pool_size/2);
+			unsigned int add =
+				((pool_size - entropy_count)*anfrac*3) >> s;
+
+			entropy_count += add;
+			pnfrac -= anfrac;
+		} while (unlikely(entropy_count < pool_size-2 && pnfrac));
+	}
+
+	if (unlikely(entropy_count < 0)) {
+		pr_warn("random: negative entropy/overflow: pool %s count %d\n",
+			r->name, entropy_count);
+		WARN_ON(1);
+		entropy_count = 0;
+	} else if (entropy_count > pool_size)
+		entropy_count = pool_size;
+	if (cmpxchg(&r->entropy_count, orig, entropy_count) != orig)
+		goto retry;
+
+	r->entropy_total += nbits;
+	if (!r->initialized && r->entropy_total > 128) {
+		r->initialized = 1;
+		r->entropy_total = 0;
+		if (r == &nonblocking_pool) {
+			prandom_reseed_late();
+			process_random_ready_list();
+			wake_up_all(&urandom_init_wait);
+			pr_notice("random: %s pool is initialized\n", r->name);
+		}
+	}
+
+	trace_credit_entropy_bits(r->name, nbits,
+				  entropy_count >> ENTROPY_SHIFT,
+				  r->entropy_total, _RET_IP_);
+
+	if (r == &input_pool) {
+		int entropy_bits = entropy_count >> ENTROPY_SHIFT;
+
+		/* should we wake readers? */
+		if (entropy_bits >= random_read_wakeup_bits) {
+			wake_up_interruptible(&random_read_wait);
+			kill_fasync(&fasync, SIGIO, POLL_IN);
+		}
+		/* If the input pool is getting full, send some
+		 * entropy to the two output pools, flipping back and
+		 * forth between them, until the output pools are 75%
+		 * full.
+		 */
+		if (entropy_bits > random_write_wakeup_bits &&
+		    r->initialized &&
+		    r->entropy_total >= 2*random_read_wakeup_bits) {
+			static struct entropy_store *last = &blocking_pool;
+			struct entropy_store *other = &blocking_pool;
+
+			if (last == &blocking_pool)
+				other = &nonblocking_pool;
+			if (other->entropy_count <=
+			    3 * other->poolinfo->poolfracbits / 4)
+				last = other;
+			if (last->entropy_count <=
+			    3 * last->poolinfo->poolfracbits / 4) {
+				schedule_work(&last->push_work);
+				r->entropy_total = 0;
+			}
+		}
+	}
+}
+
+static void credit_entropy_bits_safe(struct entropy_store *r, int nbits)
+{
+	const int nbits_max = (int)(~0U >> (ENTROPY_SHIFT + 1));
+
+	/* Cap the value to avoid overflows */
+	nbits = min(nbits,  nbits_max);
+	nbits = max(nbits, -nbits_max);
+
+	credit_entropy_bits(r, nbits);
+}
+
+/*********************************************************************
+ *
+ * Entropy input management
+ *
+ *********************************************************************/
+
+/* There is one of these per entropy source */
+struct timer_rand_state {
+	cycles_t last_time;
+	long last_delta, last_delta2;
+	unsigned dont_count_entropy:1;
+};
+
+#define INIT_TIMER_RAND_STATE { INITIAL_JIFFIES, }
+
+/*
+ * Add device- or boot-specific data to the input and nonblocking
+ * pools to help initialize them to unique values.
+ *
+ * None of this adds any entropy; it is meant to avoid the
+ * problem of the nonblocking pool having similar initial state
+ * across largely identical devices.
+ */
+void add_device_randomness(const void *buf, unsigned int size)
+{
+	unsigned long time = random_get_entropy() ^ jiffies;
+	unsigned long flags;
+
+	trace_add_device_randomness(size, _RET_IP_);
+	spin_lock_irqsave(&input_pool.lock, flags);
+	_mix_pool_bytes(&input_pool, buf, size);
+	_mix_pool_bytes(&input_pool, &time, sizeof(time));
+	spin_unlock_irqrestore(&input_pool.lock, flags);
+
+	spin_lock_irqsave(&nonblocking_pool.lock, flags);
+	_mix_pool_bytes(&nonblocking_pool, buf, size);
+	_mix_pool_bytes(&nonblocking_pool, &time, sizeof(time));
+	spin_unlock_irqrestore(&nonblocking_pool.lock, flags);
+}
+EXPORT_SYMBOL(add_device_randomness);
+
+static struct timer_rand_state input_timer_state = INIT_TIMER_RAND_STATE;
+
+/*
+ * This function adds entropy to the entropy "pool" by using timing
+ * delays.  It uses the timer_rand_state structure to make an estimate
+ * of how many bits of entropy this call has added to the pool.
+ *
+ * The number "num" is also added to the pool - it should somehow describe
+ * the type of event which just happened.  This is currently 0-255 for
+ * keyboard scan codes, and 256 upwards for interrupts.
+ *
+ */
+static void add_timer_randomness(struct timer_rand_state *state, unsigned num)
+{
+	struct entropy_store	*r;
+	struct {
+		long jiffies;
+		unsigned cycles;
+		unsigned num;
+	} sample;
+	long delta, delta2, delta3;
+
+	preempt_disable();
+
+	sample.jiffies = jiffies;
+	sample.cycles = random_get_entropy();
+	sample.num = num;
+	r = nonblocking_pool.initialized ? &input_pool : &nonblocking_pool;
+	mix_pool_bytes(r, &sample, sizeof(sample));
+
+	/*
+	 * Calculate number of bits of randomness we probably added.
+	 * We take into account the first, second and third-order deltas
+	 * in order to make our estimate.
+	 */
+
+	if (!state->dont_count_entropy) {
+		delta = sample.jiffies - state->last_time;
+		state->last_time = sample.jiffies;
+
+		delta2 = delta - state->last_delta;
+		state->last_delta = delta;
+
+		delta3 = delta2 - state->last_delta2;
+		state->last_delta2 = delta2;
+
+		if (delta < 0)
+			delta = -delta;
+		if (delta2 < 0)
+			delta2 = -delta2;
+		if (delta3 < 0)
+			delta3 = -delta3;
+		if (delta > delta2)
+			delta = delta2;
+		if (delta > delta3)
+			delta = delta3;
+
+		/*
+		 * delta is now minimum absolute delta.
+		 * Round down by 1 bit on general principles,
+		 * and limit the entropy estimate to 12 bits.
+		 */
+		credit_entropy_bits(r, min_t(int, fls(delta>>1), 11));
+	}
+	preempt_enable();
+}
+
+void add_input_randomness(unsigned int type, unsigned int code,
+				 unsigned int value)
+{
+	static unsigned char last_value;
+
+	/* ignore autorepeat and the like */
+	if (value == last_value)
+		return;
+
+	last_value = value;
+	add_timer_randomness(&input_timer_state,
+			     (type << 4) ^ code ^ (code >> 4) ^ value);
+	trace_add_input_randomness(ENTROPY_BITS(&input_pool));
+}
+EXPORT_SYMBOL_GPL(add_input_randomness);
+
+static DEFINE_PER_CPU(struct fast_pool, irq_randomness);
+
+#ifdef ADD_INTERRUPT_BENCH
+static unsigned long avg_cycles, avg_deviation;
+
+#define AVG_SHIFT 8     /* Exponential average factor k=1/256 */
+#define FIXED_1_2 (1 << (AVG_SHIFT-1))
+
+static void add_interrupt_bench(cycles_t start)
+{
+        long delta = random_get_entropy() - start;
+
+        /* Use a weighted moving average */
+        delta = delta - ((avg_cycles + FIXED_1_2) >> AVG_SHIFT);
+        avg_cycles += delta;
+        /* And average deviation */
+        delta = abs(delta) - ((avg_deviation + FIXED_1_2) >> AVG_SHIFT);
+        avg_deviation += delta;
+}
+#else
+#define add_interrupt_bench(x)
+#endif
+
+static __u32 get_reg(struct fast_pool *f, struct pt_regs *regs)
+{
+	__u32 *ptr = (__u32 *) regs;
+
+	if (regs == NULL)
+		return 0;
+	if (f->reg_idx >= sizeof(struct pt_regs) / sizeof(__u32))
+		f->reg_idx = 0;
+	return *(ptr + f->reg_idx++);
+}
+
+void add_interrupt_randomness(int irq, int irq_flags)
+{
+	struct entropy_store	*r;
+	struct fast_pool	*fast_pool = this_cpu_ptr(&irq_randomness);
+	struct pt_regs		*regs = get_irq_regs();
+	unsigned long		now = jiffies;
+	cycles_t		cycles = random_get_entropy();
+	__u32			c_high, j_high;
+	__u64			ip;
+	unsigned long		seed;
+	int			credit = 0;
+
+	if (cycles == 0)
+		cycles = get_reg(fast_pool, regs);
+	c_high = (sizeof(cycles) > 4) ? cycles >> 32 : 0;
+	j_high = (sizeof(now) > 4) ? now >> 32 : 0;
+	fast_pool->pool[0] ^= cycles ^ j_high ^ irq;
+	fast_pool->pool[1] ^= now ^ c_high;
+	ip = regs ? instruction_pointer(regs) : _RET_IP_;
+	fast_pool->pool[2] ^= ip;
+	fast_pool->pool[3] ^= (sizeof(ip) > 4) ? ip >> 32 :
+		get_reg(fast_pool, regs);
+
+	fast_mix(fast_pool);
+	add_interrupt_bench(cycles);
+
+	if ((fast_pool->count < 64) &&
+	    !time_after(now, fast_pool->last + HZ))
+		return;
+
+	r = nonblocking_pool.initialized ? &input_pool : &nonblocking_pool;
+	if (!spin_trylock(&r->lock))
+		return;
+
+	fast_pool->last = now;
+	__mix_pool_bytes(r, &fast_pool->pool, sizeof(fast_pool->pool));
+
+	/*
+	 * If we have architectural seed generator, produce a seed and
+	 * add it to the pool.  For the sake of paranoia don't let the
+	 * architectural seed generator dominate the input from the
+	 * interrupt noise.
+	 */
+	if (arch_get_random_seed_long(&seed)) {
+		__mix_pool_bytes(r, &seed, sizeof(seed));
+		credit = 1;
+	}
+	spin_unlock(&r->lock);
+
+	fast_pool->count = 0;
+
+	/* award one bit for the contents of the fast pool */
+	credit_entropy_bits(r, credit + 1);
+}
+
+#ifdef CONFIG_BLOCK
+void add_disk_randomness(struct gendisk *disk)
+{
+	if (!disk || !disk->random)
+		return;
+	/* first major is 1, so we get >= 0x200 here */
+	add_timer_randomness(disk->random, 0x100 + disk_devt(disk));
+	trace_add_disk_randomness(disk_devt(disk), ENTROPY_BITS(&input_pool));
+}
+EXPORT_SYMBOL_GPL(add_disk_randomness);
+#endif
+
+/*********************************************************************
+ *
+ * Entropy extraction routines
+ *
+ *********************************************************************/
+
+static ssize_t extract_entropy(struct entropy_store *r, void *buf,
+			       size_t nbytes, int min, int rsvd);
+
+/*
+ * This utility inline function is responsible for transferring entropy
+ * from the primary pool to the secondary extraction pool. We make
+ * sure we pull enough for a 'catastrophic reseed'.
+ */
+static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes);
+static void xfer_secondary_pool(struct entropy_store *r, size_t nbytes)
+{
+	if (!r->pull ||
+	    r->entropy_count >= (nbytes << (ENTROPY_SHIFT + 3)) ||
+	    r->entropy_count > r->poolinfo->poolfracbits)
+		return;
+
+	if (r->limit == 0 && random_min_urandom_seed) {
+		unsigned long now = jiffies;
+
+		if (time_before(now,
+				r->last_pulled + random_min_urandom_seed * HZ))
+			return;
+		r->last_pulled = now;
+	}
+
+	_xfer_secondary_pool(r, nbytes);
+}
+
+static void _xfer_secondary_pool(struct entropy_store *r, size_t nbytes)
+{
+	u32	temp[4] ;
+	int bytes = nbytes;
+
+	/* pull at least as much as a wakeup */
+	bytes = max_t(int, bytes, random_read_wakeup_bits / 8);
+	/* but never more than the pool size */
+	bytes = min_t(int, bytes, OUTPUT_POOL_WORDS);
+
+	trace_xfer_secondary_pool(r->name, bytes * 8, nbytes * 8,
+				  ENTROPY_BITS(r), ENTROPY_BITS(r->pull));
+	for( ; bytes > 3 ; bytes -= 4 )		{
+		get128(r->pull, temp ) ;
+		buffer2pool( r, temp ) ;
+	}
+}
+
+/*
+ * Used as a workqueue function so that when the input pool is getting
+ * full, we can "spill over" some entropy to the output pools.  That
+ * way the output pools can store some of the excess entropy instead
+ * of letting it go to waste.
+ */
+static void push_to_pool(struct work_struct *work)
+{
+	struct entropy_store *r = container_of(work, struct entropy_store,
+					      push_work);
+	BUG_ON(!r);
+	_xfer_secondary_pool(r, random_read_wakeup_bits/8);
+	trace_push_to_pool(r->name, r->entropy_count >> ENTROPY_SHIFT,
+			   r->pull->entropy_count >> ENTROPY_SHIFT);
+}
+
+/*
+ * This function decides how many bytes to actually take from the
+ * given pool, and also debits the entropy count accordingly.
+ */
+static size_t account(struct entropy_store *r, size_t nbytes, int min,
+		      int reserved)
+{
+	int entropy_count, orig;
+	size_t ibytes, nfrac;
+
+	BUG_ON(r->entropy_count > r->poolinfo->poolfracbits);
+
+	/* Can we pull enough? */
+retry:
+	entropy_count = orig = ACCESS_ONCE(r->entropy_count);
+	ibytes = nbytes;
+	/* If limited, never pull more than available */
+	if (r->limit) {
+		int have_bytes = entropy_count >> (ENTROPY_SHIFT + 3);
+
+		if ((have_bytes -= reserved) < 0)
+			have_bytes = 0;
+		ibytes = min_t(size_t, ibytes, have_bytes);
+	}
+	if (ibytes < min)
+		ibytes = 0;
+
+	if (unlikely(entropy_count < 0)) {
+		pr_warn("random: negative entropy count: pool %s count %d\n",
+			r->name, entropy_count);
+		WARN_ON(1);
+		entropy_count = 0;
+	}
+	nfrac = ibytes << (ENTROPY_SHIFT + 3);
+	if ((size_t) entropy_count > nfrac)
+		entropy_count -= nfrac;
+	else
+		entropy_count = 0;
+
+	if (cmpxchg(&r->entropy_count, orig, entropy_count) != orig)
+		goto retry;
+
+	trace_debit_entropy(r->name, 8 * ibytes);
+	if (ibytes &&
+	    (r->entropy_count >> ENTROPY_SHIFT) < random_write_wakeup_bits) {
+		wake_up_interruptible(&random_write_wait);
+		kill_fasync(&fasync, SIGIO, POLL_OUT);
+	}
+
+	return ibytes;
+}
+
+/*
+ * This function does the actual extraction for extract_entropy and
+ * extract_entropy_user.
+ *
+ * Note: we assume that .poolwords is a multiple of 16 words.
+ */
+static void extract_buf(struct entropy_store *r, __u8 *out)
+{
+	get128( r, (u32 *) out ) ;
+}
+
+/*
+ * This function extracts randomness from the "entropy pool", and
+ * returns it in a buffer.
+ *
+ * The min parameter specifies the minimum amount we can pull before
+ * failing to avoid races that defeat catastrophic reseeding while the
+ * reserved parameter indicates how much entropy we must leave in the
+ * pool after each pull to avoid starving other readers.
+ */
+static ssize_t extract_entropy(struct entropy_store *r, void *buf,
+				 size_t nbytes, int min, int reserved)
+{
+	ssize_t ret = 0, i;
+	__u8 tmp[EXTRACT_SIZE];
+	unsigned long flags;
+
+	/* if last_data isn't primed, we need EXTRACT_SIZE extra bytes */
+	if (fips_enabled) {
+		spin_lock_irqsave(&r->lock, flags);
+		if (!r->last_data_init) {
+			r->last_data_init = 1;
+			spin_unlock_irqrestore(&r->lock, flags);
+			trace_extract_entropy(r->name, EXTRACT_SIZE,
+					      ENTROPY_BITS(r), _RET_IP_);
+			xfer_secondary_pool(r, EXTRACT_SIZE);
+			extract_buf(r, tmp);
+			spin_lock_irqsave(&r->lock, flags);
+			memcpy(r->last_data, tmp, EXTRACT_SIZE);
+		}
+		spin_unlock_irqrestore(&r->lock, flags);
+	}
+
+	trace_extract_entropy(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_);
+	xfer_secondary_pool(r, nbytes);
+	nbytes = account(r, nbytes, min, reserved);
+
+	while (nbytes) {
+		extract_buf(r, tmp);
+
+		if (fips_enabled) {
+			spin_lock_irqsave(&r->lock, flags);
+			if (!memcmp(tmp, r->last_data, EXTRACT_SIZE))
+				panic("Hardware RNG duplicated output!\n");
+			memcpy(r->last_data, tmp, EXTRACT_SIZE);
+			spin_unlock_irqrestore(&r->lock, flags);
+		}
+		i = min_t(int, nbytes, EXTRACT_SIZE);
+		memcpy(buf, tmp, i);
+		nbytes -= i;
+		buf += i;
+		ret += i;
+	}
+
+	/* Wipe data just returned from memory */
+	memzero_explicit(tmp, sizeof(tmp));
+
+	return ret;
+}
+
+/*
+ * This function extracts randomness from the "entropy pool", and
+ * returns it in a userspace buffer.
+ */
+static ssize_t extract_entropy_user(struct entropy_store *r, void __user *buf,
+				    size_t nbytes)
+{
+	ssize_t ret = 0, i;
+	__u8 tmp[EXTRACT_SIZE];
+	int large_request = (nbytes > 256);
+
+	trace_extract_entropy_user(r->name, nbytes, ENTROPY_BITS(r), _RET_IP_);
+	xfer_secondary_pool(r, nbytes);
+	nbytes = account(r, nbytes, 0, 0);
+
+	while (nbytes) {
+		if (large_request && need_resched()) {
+			if (signal_pending(current)) {
+				if (ret == 0)
+					ret = -ERESTARTSYS;
+				break;
+			}
+			schedule();
+		}
+
+		extract_buf(r, tmp);
+		i = min_t(int, nbytes, EXTRACT_SIZE);
+		if (copy_to_user(buf, tmp, i)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		nbytes -= i;
+		buf += i;
+		ret += i;
+	}
+
+	/* Wipe data just returned from memory */
+	memzero_explicit(tmp, sizeof(tmp));
+
+	return ret;
+}
+
+/*
+ * This function is the exported kernel interface.  It returns some
+ * number of good random numbers, suitable for key generation, seeding
+ * TCP sequence numbers, etc.  It does not rely on the hardware random
+ * number generator.  For random bytes direct from the hardware RNG
+ * (when available), use get_random_bytes_arch().
+ */
+void get_random_bytes(void *buf, int nbytes)
+{
+#if DEBUG_RANDOM_BOOT > 0
+	if (unlikely(nonblocking_pool.initialized == 0))
+		printk(KERN_NOTICE "random: %pF get_random_bytes called "
+		       "with %d bits of entropy available\n",
+		       (void *) _RET_IP_,
+		       nonblocking_pool.entropy_total);
+#endif
+	trace_get_random_bytes(nbytes, _RET_IP_);
+	loop_output(&nonblocking_pool, buf, nbytes);
+}
+EXPORT_SYMBOL(get_random_bytes);
+
+/*
+ * Add a callback function that will be invoked when the nonblocking
+ * pool is initialised.
+ *
+ * returns: 0 if callback is successfully added
+ *	    -EALREADY if pool is already initialised (callback not called)
+ *	    -ENOENT if module for callback is not alive
+ */
+int add_random_ready_callback(struct random_ready_callback *rdy)
+{
+	struct module *owner;
+	unsigned long flags;
+	int err = -EALREADY;
+
+	if (likely(nonblocking_pool.initialized))
+		return err;
+
+	owner = rdy->owner;
+	if (!try_module_get(owner))
+		return -ENOENT;
+
+	spin_lock_irqsave(&random_ready_list_lock, flags);
+	if (nonblocking_pool.initialized)
+		goto out;
+
+	owner = NULL;
+
+	list_add(&rdy->list, &random_ready_list);
+	err = 0;
+
+out:
+	spin_unlock_irqrestore(&random_ready_list_lock, flags);
+
+	module_put(owner);
+
+	return err;
+}
+EXPORT_SYMBOL(add_random_ready_callback);
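+
+/*
+ * A minimal usage sketch for a module that wants to defer key
+ * generation until the nonblocking pool is ready (the names are
+ * illustrative):
+ *
+ *	static void my_rng_ready(struct random_ready_callback *rdy)
+ *	{
+ *		(now safe to use get_random_bytes() for keys)
+ *	}
+ *
+ *	static struct random_ready_callback my_rdy = {
+ *		.func	= my_rng_ready,
+ *		.owner	= THIS_MODULE,
+ *	};
+ *
+ *	err = add_random_ready_callback(&my_rdy);
+ *	if (err == -EALREADY)
+ *		(pool already initialized; proceed directly)
+ */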
+
+/*
+ * Delete a previously registered readiness callback function.
+ */
+void del_random_ready_callback(struct random_ready_callback *rdy)
+{
+	unsigned long flags;
+	struct module *owner = NULL;
+
+	spin_lock_irqsave(&random_ready_list_lock, flags);
+	if (!list_empty(&rdy->list)) {
+		list_del_init(&rdy->list);
+		owner = rdy->owner;
+	}
+	spin_unlock_irqrestore(&random_ready_list_lock, flags);
+
+	module_put(owner);
+}
+EXPORT_SYMBOL(del_random_ready_callback);
+
+/*
+ * This function will use the architecture-specific hardware random
+ * number generator if it is available.  The arch-specific hw RNG will
+ * almost certainly be faster than what we can do in software, but it
+ * is impossible to verify that it is implemented securely (as
+ * opposed to, say, the AES encryption of a sequence number using a
+ * key known by the NSA).  So it's useful if we need the speed, but
+ * only if we're willing to trust the hardware manufacturer not to
+ * have put in a back door.
+ */
+void get_random_bytes_arch(void *buf, int nbytes)
+{
+	char *p = buf;
+
+	trace_get_random_bytes_arch(nbytes, _RET_IP_);
+	while (nbytes) {
+		unsigned long v;
+		int chunk = min(nbytes, (int)sizeof(unsigned long));
+
+		if (!arch_get_random_long(&v))
+			break;
+
+		memcpy(p, &v, chunk);
+		p += chunk;
+		nbytes -= chunk;
+	}
+
+	if (nbytes)
+		extract_entropy(&nonblocking_pool, p, nbytes, 0, 0);
+}
+EXPORT_SYMBOL(get_random_bytes_arch);
+
+/*
+ * Note that setup_arch() may call add_device_randomness()
+ * long before we get here. This allows seeding of the pools
+ * with some platform dependent data very early in the boot
+ * process. But it limits our options here. We must use
+ * statically allocated structures that already have all
+ * initializations complete at compile time. We should also
+ * take care not to overwrite the precious per platform data
+ * we were given.
+ */
+static int rand_initialize(void)
+{
+	init_random() ;
+	return 0;
+}
+early_initcall(rand_initialize);
+
+#ifdef CONFIG_BLOCK
+void rand_initialize_disk(struct gendisk *disk)
+{
+	struct timer_rand_state *state;
+
+	/*
+	 * If kzalloc returns null, we just won't use that entropy
+	 * source.
+	 */
+	state = kzalloc(sizeof(struct timer_rand_state), GFP_KERNEL);
+	if (state) {
+		state->last_time = INITIAL_JIFFIES;
+		disk->random = state;
+	}
+}
+#endif
+
+static ssize_t
+_random_read(int nonblock, char __user *buf, size_t nbytes)
+{
+	ssize_t n;
+
+	if (nbytes == 0)
+		return 0;
+
+	nbytes = min_t(size_t, nbytes, SEC_XFER_SIZE);
+	while (1) {
+		n = extract_entropy_user(&blocking_pool, buf, nbytes);
+		if (n < 0)
+			return n;
+		trace_random_read(n*8, (nbytes-n)*8,
+				  ENTROPY_BITS(&blocking_pool),
+				  ENTROPY_BITS(&input_pool));
+		if (n > 0)
+			return n;
+
+		/* Pool is (near) empty.  Maybe wait and retry. */
+		if (nonblock)
+			return -EAGAIN;
+
+		wait_event_interruptible(random_read_wait,
+			ENTROPY_BITS(&input_pool) >=
+			random_read_wakeup_bits);
+		if (signal_pending(current))
+			return -ERESTARTSYS;
+	}
+}
+
+static ssize_t
+random_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+	return _random_read(file->f_flags & O_NONBLOCK, buf, nbytes);
+}
+
+static ssize_t
+urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
+{
+	int ret;
+
+	if (unlikely(nonblocking_pool.initialized == 0))
+		printk_once(KERN_NOTICE "random: %s urandom read "
+			    "with %d bits of entropy available\n",
+			    current->comm, nonblocking_pool.entropy_total);
+
+	nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3));
+	ret = extract_entropy_user(&nonblocking_pool, buf, nbytes);
+
+	trace_urandom_read(8 * nbytes, ENTROPY_BITS(&nonblocking_pool),
+			   ENTROPY_BITS(&input_pool));
+	return ret;
+}
+
+static unsigned int
+random_poll(struct file *file, poll_table * wait)
+{
+	unsigned int mask;
+
+	poll_wait(file, &random_read_wait, wait);
+	poll_wait(file, &random_write_wait, wait);
+	mask = 0;
+	if (ENTROPY_BITS(&input_pool) >= random_read_wakeup_bits)
+		mask |= POLLIN | POLLRDNORM;
+	if (ENTROPY_BITS(&input_pool) < random_write_wakeup_bits)
+		mask |= POLLOUT | POLLWRNORM;
+	return mask;
+}
+
+static int
+write_pool(struct entropy_store *r, const char __user *buffer, size_t count)
+{
+	size_t bytes;
+	__u32 buf[16];
+	const char __user *p = buffer;
+
+	while (count > 0) {
+		bytes = min(count, sizeof(buf));
+		if (copy_from_user(&buf, p, bytes))
+			return -EFAULT;
+
+		count -= bytes;
+		p += bytes;
+
+		mix_pool_bytes(r, buf, bytes);
+		cond_resched();
+	}
+
+	return 0;
+}
+
+static ssize_t random_write(struct file *file, const char __user *buffer,
+			    size_t count, loff_t *ppos)
+{
+	ssize_t ret;
+
+	ret = write_pool(&blocking_pool, buffer, count);
+	if (ret)
+		return ret;
+	ret = write_pool(&nonblocking_pool, buffer, count);
+	if (ret)
+		return ret;
+
+	return (ssize_t)count;
+}
+
+static long random_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
+{
+	int size, ent_count;
+	int __user *p = (int __user *)arg;
+	int retval;
+
+	switch (cmd) {
+	case RNDGETENTCNT:
+		/* inherently racy, no point locking */
+		ent_count = ENTROPY_BITS(&input_pool);
+		if (put_user(ent_count, p))
+			return -EFAULT;
+		return 0;
+	case RNDADDTOENTCNT:
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+		if (get_user(ent_count, p))
+			return -EFAULT;
+		credit_entropy_bits_safe(&input_pool, ent_count);
+		return 0;
+	case RNDADDENTROPY:
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+		if (get_user(ent_count, p++))
+			return -EFAULT;
+		if (ent_count < 0)
+			return -EINVAL;
+		if (get_user(size, p++))
+			return -EFAULT;
+		retval = write_pool(&input_pool, (const char __user *)p,
+				    size);
+		if (retval < 0)
+			return retval;
+		credit_entropy_bits_safe(&input_pool, ent_count);
+		return 0;
+	case RNDZAPENTCNT:
+	case RNDCLEARPOOL:
+		/*
+		 * Clear the entropy pool counters. We no longer clear
+		 * the entropy pool, as that's silly.
+		 */
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+		input_pool.entropy_count = 0;
+		nonblocking_pool.entropy_count = 0;
+		blocking_pool.entropy_count = 0;
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
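+
+/*
+ * From user space, RNDADDENTROPY takes a struct rand_pool_info:
+ * entropy_count in bits, buf_size in bytes, then the data itself.
+ * A minimal sketch for a privileged caller feeding 16 bytes claimed
+ * to carry 64 bits of entropy (sizes illustrative):
+ *
+ *	char ebuf[sizeof(struct rand_pool_info) + 16];
+ *	struct rand_pool_info *p = (struct rand_pool_info *)ebuf;
+ *
+ *	p->entropy_count = 64;
+ *	p->buf_size = 16;
+ *	(fill p->buf with the sample, then)
+ *	ioctl(fd, RNDADDENTROPY, p);
+ */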
+
+static int random_fasync(int fd, struct file *filp, int on)
+{
+	return fasync_helper(fd, filp, on, &fasync);
+}
+
+const struct file_operations random_fops = {
+	.read  = random_read,
+	.write = random_write,
+	.poll  = random_poll,
+	.unlocked_ioctl = random_ioctl,
+	.fasync = random_fasync,
+	.llseek = noop_llseek,
+};
+
+const struct file_operations urandom_fops = {
+	.read  = urandom_read,
+	.write = random_write,
+	.unlocked_ioctl = random_ioctl,
+	.fasync = random_fasync,
+	.llseek = noop_llseek,
+};
+
+SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
+		unsigned int, flags)
+{
+	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
+		return -EINVAL;
+
+	if (count > INT_MAX)
+		count = INT_MAX;
+
+	if (flags & GRND_RANDOM)
+		return _random_read(flags & GRND_NONBLOCK, buf, count);
+
+	if (unlikely(nonblocking_pool.initialized == 0)) {
+		if (flags & GRND_NONBLOCK)
+			return -EAGAIN;
+		wait_event_interruptible(urandom_init_wait,
+					 nonblocking_pool.initialized);
+		if (signal_pending(current))
+			return -ERESTARTSYS;
+	}
+	return urandom_read(NULL, buf, count, NULL);
+}
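+
+/*
+ * From user space the call looks like this minimal sketch (invoked
+ * via syscall(2), since libc wrappers may not be available yet;
+ * buffer size illustrative):
+ *
+ *	unsigned char buf[32];
+ *	long n = syscall(SYS_getrandom, buf, sizeof(buf), 0);
+ *
+ * With flags == 0 the call blocks only until the nonblocking pool
+ * is initialized, then behaves like a read from /dev/urandom.
+ */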
+
+/***************************************************************
+ * Random UUID interface
+ *
+ * Used here for a Boot ID, but can be useful for other kernel
+ * drivers.
+ ***************************************************************/
+
+/*
+ * Generate random UUID
+ */
+void generate_random_uuid(unsigned char uuid_out[16])
+{
+	get_random_bytes(uuid_out, 16);
+	/* Set UUID version to 4 --- truly random generation */
+	uuid_out[6] = (uuid_out[6] & 0x0F) | 0x40;
+	/* Set the UUID variant to DCE */
+	uuid_out[8] = (uuid_out[8] & 0x3F) | 0x80;
+}
+EXPORT_SYMBOL(generate_random_uuid);
+
+/********************************************************************
+ *
+ * Sysctl interface
+ *
+ ********************************************************************/
+
+#ifdef CONFIG_SYSCTL
+
+#include <linux/sysctl.h>
+
+static int min_read_thresh = 8, min_write_thresh;
+static int max_read_thresh = OUTPUT_POOL_WORDS * 32;
+static int max_write_thresh = INPUT_POOL_WORDS * 32;
+static char sysctl_bootid[16];
+
+/*
+ * This function is used to return both the bootid UUID, and random
+ * UUID.  The difference is in whether table->data is NULL; if it is,
+ * then a new UUID is generated and returned to the user.
+ *
+ * If the user accesses this via the proc interface, the UUID will be
+ * returned as an ASCII string in the standard UUID format; if via the
+ * sysctl system call, as 16 bytes of binary data.
+ */
+static int proc_do_uuid(struct ctl_table *table, int write,
+			void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table fake_table;
+	unsigned char buf[64], tmp_uuid[16], *uuid;
+
+	uuid = table->data;
+	if (!uuid) {
+		uuid = tmp_uuid;
+		generate_random_uuid(uuid);
+	} else {
+		static DEFINE_SPINLOCK(bootid_spinlock);
+
+		spin_lock(&bootid_spinlock);
+		if (!uuid[8])
+			generate_random_uuid(uuid);
+		spin_unlock(&bootid_spinlock);
+	}
+
+	sprintf(buf, "%pU", uuid);
+
+	fake_table.data = buf;
+	fake_table.maxlen = sizeof(buf);
+
+	return proc_dostring(&fake_table, write, buffer, lenp, ppos);
+}
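+
+/*
+ * For example, from a shell:
+ *
+ *	cat /proc/sys/kernel/random/boot_id
+ *	cat /proc/sys/kernel/random/uuid
+ *
+ * The first stays fixed for the life of the boot; the second is
+ * freshly generated on every read.
+ */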
+
+/*
+ * Return entropy available scaled to integral bits
+ */
+static int proc_do_entropy(struct ctl_table *table, int write,
+			   void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct ctl_table fake_table;
+	int entropy_count;
+
+	entropy_count = *(int *)table->data >> ENTROPY_SHIFT;
+
+	fake_table.data = &entropy_count;
+	fake_table.maxlen = sizeof(entropy_count);
+
+	return proc_dointvec(&fake_table, write, buffer, lenp, ppos);
+}
+
+static int sysctl_poolsize = INPUT_POOL_WORDS * 32;
+extern struct ctl_table random_table[];
+struct ctl_table random_table[] = {
+	{
+		.procname	= "poolsize",
+		.data		= &sysctl_poolsize,
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_dointvec,
+	},
+	{
+		.procname	= "entropy_avail",
+		.maxlen		= sizeof(int),
+		.mode		= 0444,
+		.proc_handler	= proc_do_entropy,
+		.data		= &input_pool.entropy_count,
+	},
+	{
+		.procname	= "read_wakeup_threshold",
+		.data		= &random_read_wakeup_bits,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_read_thresh,
+		.extra2		= &max_read_thresh,
+	},
+	{
+		.procname	= "write_wakeup_threshold",
+		.data		= &random_write_wakeup_bits,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &min_write_thresh,
+		.extra2		= &max_write_thresh,
+	},
+	{
+		.procname	= "urandom_min_reseed_secs",
+		.data		= &random_min_urandom_seed,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{
+		.procname	= "boot_id",
+		.data		= &sysctl_bootid,
+		.maxlen		= 16,
+		.mode		= 0444,
+		.proc_handler	= proc_do_uuid,
+	},
+	{
+		.procname	= "uuid",
+		.maxlen		= 16,
+		.mode		= 0444,
+		.proc_handler	= proc_do_uuid,
+	},
+#ifdef ADD_INTERRUPT_BENCH
+	{
+		.procname	= "add_interrupt_avg_cycles",
+		.data		= &avg_cycles,
+		.maxlen		= sizeof(avg_cycles),
+		.mode		= 0444,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+	{
+		.procname	= "add_interrupt_avg_deviation",
+		.data		= &avg_deviation,
+		.maxlen		= sizeof(avg_deviation),
+		.mode		= 0444,
+		.proc_handler	= proc_doulongvec_minmax,
+	},
+#endif
+	{ }
+};
+#endif 	/* CONFIG_SYSCTL */
+
+static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+
+int random_int_secret_init(void)
+{
+	get_random_bytes(random_int_secret, sizeof(random_int_secret));
+	return 0;
+}
+
+/*
+ * Get a random word for internal kernel use only. Similar to urandom but
+ * with the goal of minimal entropy pool depletion. As a result, the random
+ * value is not cryptographically secure, but for several uses the cost of
+ * depleting entropy is too high.
+ */
+static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash);
+unsigned int get_random_int(void)
+{
+	__u32 *hash;
+	unsigned int ret;
+
+	if (arch_get_random_int(&ret))
+		return ret;
+
+	hash = get_cpu_var(get_random_int_hash);
+
+	hash[0] += current->pid + jiffies + random_get_entropy();
+	md5_transform(hash, random_int_secret);
+	ret = hash[0];
+	put_cpu_var(get_random_int_hash);
+
+	return ret;
+}
+EXPORT_SYMBOL(get_random_int);
+
+/*
+ * randomize_range() returns a start address such that
+ *
+ *    [...... <range> .....]
+ *  start                  end
+ *
+ * a <range> with size "len" starting at the return value is inside the
+ * area defined by [start, end], but is otherwise randomized.
+ */
+unsigned long
+randomize_range(unsigned long start, unsigned long end, unsigned long len)
+{
+	unsigned long range = end - len - start;
+
+	if (end <= start + len)
+		return 0;
+	return PAGE_ALIGN(get_random_int() % range + start);
+}
+
+/* Interface for in-kernel drivers of true hardware RNGs.
+ * Those devices may produce endless random bits and will be throttled
+ * when our pool is full.
+ */
+void add_hwgenerator_randomness(const char *buffer, size_t count,
+				size_t entropy)
+{
+	struct entropy_store *poolp = &input_pool;
+
+	/* Suspend writing if we're above the trickle threshold.
+	 * We'll be woken up again once below random_write_wakeup_thresh,
+	 * or when the calling thread is about to terminate.
+	 */
+	wait_event_interruptible(random_write_wait, kthread_should_stop() ||
+			ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits);
+	mix_pool_bytes(poolp, buffer, count);
+	credit_entropy_bits(poolp, entropy);
+}
+EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+/*
+ * Experimental code to replace parts of random.c
+ * Everything from here down is new code.
+ * Sandy Harris, sandyinchina@...il.com
+ *
+ * Uses 128-bit hash from AES-GCM instead of 160-bit
+ * SHA-1. Changing the hash also allows other changes.
+ *
+ * Goals:
+ *
+ * The main design goal was improved decoupling so that
+ * heavy use of /dev/urandom does not deplete the entropy
+ * pool for /dev/random. As I see it, this is the only
+ * place where the current random(4) design is visibly
+ * flawed.
+ *
+ * Another goal was simpler mixing in of additional data
+ * in various places. This may help with the difficult
+ * problem of timely initialisation; there have been
+ * some security failures due to mis-handling of this
+ * issue. These cannot be completely dealt with in the
+ * driver, but we can do some things.
+ *
+ * I believe this code achieves both goals.
+ *
+ * The GCM hash:
+ *
+ * This sort of hash-like primitive has largely replaced
+ * more complex hashes in IPsec and TLS authentication;
+ * the new methods are often considerably faster and the
+ * code is simpler. It therefore seemed worth trying such
+ * a hash here.
+ *
+ * I chose the Galois field multiplication from AES-GCM
+ * because it is widely used, well-analysed, and
+ * considered secure. References are RFCs 4106 and 5288
+ * and NIST standard SP-800-38D.
+ *
+ * Intel and AMD both have instructions designed to
+ * make the GCM calculation faster
+ * https://en.wikipedia.org/wiki/CLMUL_instruction_set
+ * Those are not used in this proof-of-concept code.
+ *
+ * https://eprint.iacr.org/2013/157.pdf discusses bugs
+ * in the OpenSSL version of this hash.
+ *
+ * Whether GCM is secure for this application needs
+ * analysis. IPsec generates a 128-bit hash but uses
+ * only 96 bits, which makes some attacks much harder;
+ * this application uses all 128 bits. Also, the input
+ * for IPsec authentication is ciphertext, which is
+ * highly random with any decent cipher; input here is
+ * mainly pool data which may be much less random.
+ *
+ * Existing random(4) code folds the 160-bit SHA-1
+ * output to get an 80-bit final output; I do not
+ * consider such a transform necessary here, but that
+ * needs analysis too.
+ *
+ * I add complications beyond the basic hash; those need
+ * analysis as well.
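+ *
+ * For reference, the underlying GHASH recurrence over
+ * GF(2^128) is:
+ *
+ *     Y(0) = IV
+ *     Y(i) = ( Y(i-1) xor X(i) ) * H
+ *
+ * where the X(i) are 128-bit input blocks and H is the
+ * 128-bit key; addmul() below computes one step of this.
+ * Standard AES-GCM uses an all-zero IV for the hash; the
+ * code here starts from a pool constant instead, as
+ * described below.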
+ *
+ * Differences from current driver:
+ *
+ * I change nothing on the input side; the whole entropy
+ * collection and estimation part of existing code, as
+ * applied to the input pool, is untouched.
+ *
+ * The hashing and output routines, though, are completely
+ * replaced. The management of output pools is also changed;
+ * they just count how many outputs since the last reseed,
+ * as a counter-mode block cipher does, rather than trying
+ * to track entropy.
+ *
+ * Mixing:
+ *
+ * Much of the mixing uses invertible functions such
+ * as the pseudo-Hadamard transform or aria_mix().
+ * These provably cannot reduce entropy; if they
+ * did, it would not be possible to invert them.
+ *
+ * As in existing code, all operations putting data
+ * into any pool are unidirectional; they use += or
+ * ^= to mix in new data so they cannot reduce the
+ * randomness of the pool, even with bad input data.
+ *
+ * I add an array of constants[], two for each pool,
+ * for use in the hashing, and a counter[] used
+ * in every output operation. All operations that
+ * put new data into those are also unidirectional.
+ *
+ * Output dependencies
+ *
+ * Every output from a normal pool (input, blocking
+ * or non-blocking) involves a GCM hash of pool
+ * contents.
+ *
+ * As well as pool data, every output depends on:
+ *
+ *   two 128-bit entries from constants[] used
+ *      in the hashing
+ *   a global counter[] which is also hashed
+ *
+ * There is a 4th dummy pool (pool == NULL)
+ * which only hashes the counter, intended to
+ * replace the MD5 code in the current driver.
+ *
+ * There are three functions to get 128 bits,
+ * two from a specified pool p
+ *
+ *	get128( p, out )      may block
+ *	get_or_fail( p, out ) non-blocking
+ *
+ * get_any( out ) tries a series of sources,
+ * never blocks but does not always give a
+ * high-grade result
+ *
+ * Tests:
+ *
+ * Various tests here are deliberately more general
+ * than necessary; this protects against coding
+ * blunders, against flukes like a cosmic ray changing
+ * memory, and against misbehaviour from stressed devices
+ * like an overheated router, whether the stress is just
+ * natural or is part of an attack.
+ *
+ * For example, when a value is confidently expected
+ * to be either 0 or 1, if(x==0) ... if(x==1) ...
+ * is the obvious way to test it, but it is slightly
+ * safer to use if(x==0) ... else ... so unexpected
+ * cases can be handled. Similarly, end-of-loop tests
+ * could use x == N but x >= N is slightly safer.
+ *
+ * The value of this is arguably negligible and certainly
+ * minor, but the cost is near-zero and the behaviour
+ * is identical in all expected cases. I have therefore
+ * done this everywhere that I noticed it was possible.
+ * It would also be possible, of course, to detect and
+ * log unexpected cases, but it is not clear that this
+ * would be of much value.
+ */
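+
+/*
+ * Output path in outline, for orientation (all names are
+ * defined below; the dummy pool uses only the first step):
+ *
+ *	mix_first( r, accum ) ;    hash constants, counter & pool data
+ *	mix_last( r, accum ) ;     feedback into pool, then a 2nd hash
+ *
+ * get_or_fail(), get128() and get_any() wrap this pair.
+ */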
+
+static DEFINE_SPINLOCK(counter_lock);
+static DEFINE_SPINLOCK(constants_lock);
+
+/*********************************************************
+ * unidirectional mixing operations
+ *
+ * both mix 128 bits from source into target,
+ * one using XOR, the other addition
+ ********************************************************/
+
+static void xor128(u32 *target, u32 *source)
+{
+#ifdef CONFIG_64BIT
+	u64 *s, *t ;
+	s = (u64 *) source ;
+	t = (u64 *) target ;
+	t[0] ^= s[0] ;
+	t[1] ^= s[1] ;
+#else
+	target[0] ^= source[0] ;
+	target[1] ^= source[1] ;
+	target[2] ^= source[2] ;
+	target[3] ^= source[3] ;
+#endif
+}	
+
+/*
+ * not a 128-bit addition,
+ * just four 32-bit or two 64-bit
+ */
+static void add128(u32 *target, u32 *source)
+{
+#ifdef CONFIG_64BIT
+	u64 *s, *t ;
+	s = (u64 *) source ;
+	t = (u64 *) target ;
+	t[0] += s[0] ;
+	t[1] += s[1] ;
+#else
+	target[0] += source[0] ;
+	target[1] += source[1] ;
+	target[2] += source[2] ;
+	target[3] += source[3] ;
+#endif
+}
+
+static void add256(u32 *target, u32 *source)
+{
+#ifdef CONFIG_64BIT
+	u64 *s, *t ;
+	s = (u64 *) source ;
+	t = (u64 *) target ;
+	t[0] += s[0] ;
+	t[1] += s[1] ;
+	t[2] += s[2] ;
+	t[3] += s[3] ;
+#else
+	target[0] += source[0] ;
+	target[1] += source[1] ;
+	target[2] += source[2] ;
+	target[3] += source[3] ;
+	target[4] += source[4] ;
+	target[5] += source[5] ;
+	target[6] += source[6] ;
+	target[7] += source[7] ;
+#endif
+}
+
+/*********************************************************************
+ * Two ways to mix a 128-bit buffer, plus one each for 256, 512 and 1024
+ * These are generic functions that can mix anything the right size
+ * None knows anything about pools or takes any locks
+ *
+ * All mix in place, using no external data except buffer contents
+ * Any temporary storage used is cleared before returning
+ *********************************************************************/
+
+/*
+ * The Aria block cipher is a Korean standard
+ * Cipher home page: http://210.104.33.10/ARIA/index-e.html
+ * See also RFC 5794
+ *
+ * This application uses only the linear transform from
+ * Aria, not the whole cipher
+ *
+ * Mixes a 128-bit object treated as 16 bytes
+ * Each output byte is the XOR of 7 input bytes
+ *
+ * Some caution is needed in applying this since the
+ * function is its own inverse; using it twice on the
+ * same data gets you right back where you started
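+ * e.g. aria_mix(x) ; aria_mix(x) ; leaves x unchanged,
+ * so callers interleave other operations between calls,
+ * as buffer2pool() does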
+ *
+ * Version here is based on GPL source at:
+ * http://www.oryx-embedded.com/doc/aria_8c_source.html
+ */
+static void aria_mix( u8 *x )
+{
+	u8 y[16] ;
+
+	y[0] = x[3] ^ x[4] ^ x[6] ^ x[8] ^ x[9] ^ x[13] ^ x[14];
+	y[1] = x[2] ^ x[5] ^ x[7] ^ x[8] ^ x[9] ^ x[12] ^ x[15];
+	y[2] = x[1] ^ x[4] ^ x[6] ^ x[10] ^ x[11] ^ x[12] ^ x[15];
+	y[3] = x[0] ^ x[5] ^ x[7] ^ x[10] ^ x[11] ^ x[13] ^ x[14];
+	y[4] = x[0] ^ x[2] ^ x[5] ^ x[8] ^ x[11] ^ x[14] ^ x[15];
+	y[5] = x[1] ^ x[3] ^ x[4] ^ x[9] ^ x[10] ^ x[14] ^ x[15];
+	y[6] = x[0] ^ x[2] ^ x[7] ^ x[9] ^ x[10] ^ x[12] ^ x[13];
+	y[7] = x[1] ^ x[3] ^ x[6] ^ x[8] ^ x[11] ^ x[12] ^ x[13];
+	y[8] = x[0] ^ x[1] ^ x[4] ^ x[7] ^ x[10] ^ x[13] ^ x[15];
+	y[9] = x[0] ^ x[1] ^ x[5] ^ x[6] ^ x[11] ^ x[12] ^ x[14];
+	y[10] = x[2] ^ x[3] ^ x[5] ^ x[6] ^ x[8] ^ x[13] ^ x[15];
+	y[11] = x[2] ^ x[3] ^ x[4] ^ x[7] ^ x[9] ^ x[12] ^ x[14];
+	y[12] = x[1] ^ x[2] ^ x[6] ^ x[7] ^ x[9] ^ x[11] ^ x[12];
+	y[13] = x[0] ^ x[3] ^ x[6] ^ x[7] ^ x[8] ^ x[10] ^ x[13];
+	y[14] = x[0] ^ x[3] ^ x[4] ^ x[5] ^ x[9] ^ x[11] ^ x[14];
+	y[15] = x[1] ^ x[2] ^ x[4] ^ x[5] ^ x[8] ^ x[10] ^ x[15];
+	memcpy( x, y, 16 ) ;
+	zero128( y ) ;
+}
+
+/*
+ * The pseudo-Hadamard transform (PHT) can be
+ * applied to any word size and any number of words
+ * that is a power of two. Here for 4, 8 or 16
+ * 32-bit words.
+ *
+ * In all cases it is invertible so it provably loses
+ * no entropy, and it makes every output word depend
+ * on every input word.
+ *
+ * conceptually, a 2-way PHT on a, b is
+ *      	x = a + b
+ *      	y = a + 2b
+ *      	a = x
+ *      	b = y
+ * a better implementation is just
+ *      	a += b
+ *      	b += a
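+ * and the inverse just reverses the order:
+ *      	b -= a
+ *      	a -= b
+ * which is why no entropy can be lost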
+ *
+ * Larger PHTs use multiple applications of that.
+ *
+ * If you have 64-bit operations and aligned
+ * data structures, then these can be made
+ * faster. Only pht128() and add128() need to
+ * change; others just call them.
+ *
+ * If 32-bit arithmetic is used, then pht128()
+ * pht256() and pht512() are exactly the PHT
+ * on the appropriate number of 32-bit words.
+ *
+ * The 64-bit versions are not quite PHTs, but
+ * the important properties remain. They are still
+ * invertible & still make all 32-bit output words
+ * depend on all input words.
+ */
+
+static void pht128( u32 *x )
+{
+#ifndef CONFIG_64BIT
+	/*
+	 * a 4-way PHT is built from 4 2-way PHTs
+	 * here it is unrolled into 8 += operations
+	 * each line is a two-way PHT
+	 */
+	x[0] += x[1] ; x[1] += x[0] ;
+	x[2] += x[3] ; x[3] += x[2] ;
+	x[0] += x[2] ; x[2] += x[0] ;
+	x[1] += x[3] ; x[3] += x[1] ;
+#else
+	/*
+	 * two 2-way 64-bit PHTs (4 += operations)
+	 * and a swap of two 32-bit words
+	 */
+	u32 temp ;
+	u64 *y ;
+	y = (u64 *) x ;
+	y[0] += y[1] ; y[1] += y[0] ;
+	temp = x[1]; x[1] = x[2] ; x[2] = temp ;
+	y[0] += y[1] ; y[1] += y[0] ;
+#endif
+}
+
+static void pht256( u32 *x )
+{
+	u32 *y ;
+	y = x + 4 ;
+
+	pht128(x) ;
+	pht128(y) ;
+	add128( x, y ) ;
+	add128( y, x ) ;
+}
+
+static void pht512( u32 *x )
+{
+	u32 *y ;
+	y = x + 8 ;
+
+	pht256(x) ;
+	pht256(y) ;
+	add256( x, y ) ;
+	add256( y, x ) ;
+}
+
+/*
+ * cube_mix() is from Daniel Bernstein's Cubehash
+ * It mixes 1024 bits, treated as an array of 32-bit words.
+ *
+ * based on Bernstein's code as distributed at
+ * http://bench.cr.yp.to/supercop.html
+ * He labels his code as public domain
+ *
+ * He has multiple versions. This is from the file
+ * cubehash1632/simple where 1632 indicates his main
+ * proposal (16 rounds and a 32-word state) and simple
+ * indicates the simplest code. The 1632 directory also
+ * has four different unrolled versions and over 20
+ * versions for specific hardware. There are also
+ * many other directories, so lots of options for
+ * eventual optimisations. Here I just use a simple
+ * one for proof-of-concept testing.
+ *
+ * The Cubehash algorithm has three stages:
+ *
+ *    1 put some constants into the array
+ *      mix with this transform to get initial state
+ *    2 for each input block
+ *        mix input into state
+ *        mix with this transform
+ *    3 mix with a different transform to
+ *       get an output smaller than state
+ *
+ * Here there is no stage 1 or 3 since the state we
+ * mix is already initialised and we want output of
+ * the same size. Nor is there any input data; we are
+ * not hashing here.
+ *
+ * We just use the central transform to mix a buffer. 
+ */
+
+/*
+ * This is what Bernstein uses in his main proposal
+ * Arguably we need more because we lack stages 1 and 3
+ * Arguably less, since this is not a hash; any mixing is OK
+ */
+#define CUBEHASH_ROUNDS 16
+
+static void cube_mix( u32 *x )
+{
+  int i;
+  int r;
+  u32 y[16];
+
+  for (r = 0;r < CUBEHASH_ROUNDS;++r) {
+    for (i = 0;i < 16;++i) x[i + 16] += x[i];
+    for (i = 0;i < 16;++i) y[i ^ 8] = x[i];
+    for (i = 0;i < 16;++i) x[i] = ROTL(y[i],7);
+    for (i = 0;i < 16;++i) x[i] ^= x[i + 16];
+    for (i = 0;i < 16;++i) y[i ^ 2] = x[i + 16];
+    for (i = 0;i < 16;++i) x[i + 16] = y[i];
+    for (i = 0;i < 16;++i) x[i + 16] += x[i];
+    for (i = 0;i < 16;++i) y[i ^ 4] = x[i];
+    for (i = 0;i < 16;++i) x[i] = ROTL(y[i],11);
+    for (i = 0;i < 16;++i) x[i] ^= x[i + 16];
+    for (i = 0;i < 16;++i) y[i ^ 1] = x[i + 16];
+    for (i = 0;i < 16;++i) x[i + 16] = y[i];
+  }
+  memzero_explicit(y, 64) ;
+}
+
+/********************************************************************
+ * Code to manage the array of two 128-bit "constants" per pool
+ * These are not really constants; this code changes them
+ * They are treated as constants in the extract-from-pool code
+ *********************************************************************/
+
+/*
+ * mix one pool's constants array, two 128-bit rows
+ * in place mixing, uses no external data
+ * PHT + a rotation to make it nonlinear
+ */
+static void mix_const_p( struct entropy_store *r )
+{
+	u32 *x ;
+	unsigned long flags ;
+
+	x = r->A ;
+
+	spin_lock_irqsave( &constants_lock, flags ) ;
+	*x = ROTL( *x, 5 ) ;
+	pht256( x ) ;
+	spin_unlock_irqrestore( &constants_lock, flags ) ; 
+}
+
+/*
+ * Update both constants for a pool.
+ * Needs no rotations because mix_const_p() has one
+ *
+ * Every call to this affects every hash for that pool,
+ * all future outputs from it, and all future feedback
+ * into it.
+ *
+ * This is the preferred way to rekey a pool, rather than
+ * buffer2pool() which mixes into the pool contents.
+ *
+ * This mixes in 128 bits of new data, so it is what the
+ * Yarrow paper calls "catastrophic reseeding". It resets
+ * r->count to indicate the rekeying but does not change
+ * r->entropy_count.
+ *
+ * All buffer2*() routines zero the input data after using it
+ */
+static void buffer2array( struct entropy_store *r, u32 *data )
+{
+	u32 *x;
+	unsigned long flags1, flags2 ;
+
+	x = r->A ;
+
+	spin_lock_irqsave( &r->lock, flags1 ) ;
+	spin_lock_irqsave( &constants_lock, flags2 ) ; 
+	xor128( x, data ) ;
+	pht256( x ) ;
+	spin_unlock_irqrestore( &constants_lock, flags2 ) ;
+	r->count = 0 ;
+	spin_unlock_irqrestore( &r->lock, flags1 ) ; 
+	zero128( data ) ;
+}
+
+/*
+ * mix the eight 128-bit constants[] for all pools
+ * in place mixing, uses no external data
+ *
+ * This uses the 1024-bit transform from Bernstein's Cubehash
+ * that has XOR, + and rotations so mixing is quite nonlinear
+ */
+static void mix_const_all(void)
+{
+	unsigned long flags ;
+
+	spin_lock_irqsave( &constants_lock, flags ) ;
+	cube_mix( constants ) ;
+	spin_unlock_irqrestore( &constants_lock, flags ) ;
+}
+
+/*
+ * mix the constants[] array and both output pools
+ * all in-place mixing, no external data
+ */
+static void big_mix(void)
+{
+	struct entropy_store *n, *b ;
+	unsigned long flags, flags2 ;
+
+	n = &nonblocking_pool ;
+	b = &blocking_pool ;
+
+	(void) mix_const_all() ;
+
+	/*
+	 * mix the output pools if possible
+	 * with the default value for OUTPUT_POOL_WORDS
+	 * the if here always succeeds
+	 *
+	 * for the >32 case, only part of pool is mixed
+	 * but probably enough
+	 */
+	if( OUTPUT_POOL_WORDS >= 32 )	{
+		spin_lock_irqsave( &n->lock, flags ) ;
+		cube_mix( n->pool ) ;
+		spin_unlock_irqrestore( &n->lock, flags ) ;
+
+		spin_lock_irqsave( &b->lock, flags ) ;
+		cube_mix( b->pool ) ;
+		spin_unlock_irqrestore( &b->lock, flags ) ;
+	}
+	/*
+	 * the two pools combined are big enough
+	 * do one mix for both
+	 */
+	else if( (OUTPUT_POOL_WORDS >= 16) && (n->pool == b->pool+OUTPUT_POOL_WORDS) )	{
+		spin_lock_irqsave( &n->lock, flags ) ;
+		spin_lock_irqsave( &b->lock, flags2 ) ;
+		cube_mix( b->pool ) ;
+		spin_unlock_irqrestore( &b->lock, flags2 ) ;
+		spin_unlock_irqrestore( &n->lock, flags ) ;
+	}
+	/*
+	 * this should never be reached
+	 * but put in some code for safety
+	 */
+	else if( OUTPUT_POOL_WORDS >= 8 )	{
+		spin_lock_irqsave( &n->lock, flags ) ;
+		pht256( n->pool ) ;
+		spin_unlock_irqrestore( &n->lock, flags ) ;
+		spin_lock_irqsave( &b->lock, flags ) ;
+		pht256( b->pool ) ;
+		spin_unlock_irqrestore( &b->lock, flags ) ;
+	}
+	/* This should definitely never be reached */
+	else	pr_warn("random: strange output pool size %d\n", OUTPUT_POOL_WORDS ) ;
+}
+
+/*
+ * constants[] array has 10 128-bit rows
+ * 8 are pool constants, last 2 counter[]
+ *
+ * mix the last 4 rows
+ *   8 words in counter[]
+ *   8 words of constants[] for dummy_pool
+ *
+ * no rotations needed here; count() has enough
+ */
+static void top_mix(void)
+{
+	u32 *x ;
+	struct entropy_store *d ;
+	unsigned long flags1, flags2 ;
+
+	d = &dummy_pool ;
+	x = d->A ;
+
+	spin_lock_irqsave( &d->lock, flags1 ) ;
+	spin_lock_irqsave( &constants_lock, flags2 ) ;
+	pht512( x ) ;
+	spin_unlock_irqrestore( &constants_lock, flags2 ) ;
+	spin_unlock_irqrestore( &d->lock, flags1 ) ;
+}
+
+/**********************************************************************
+ * The main hashing routines, based on authenticator code from AES-GCM
+ *
+ * GCM is Galois Counter Mode
+ * All operations are in a Galois field with 128-bit elements
+ * see http://csrc.nist.gov/publications/nistpubs/800-38D/SP-800-38D.pdf
+ **********************************************************************/
+
+static u8 abits[128], ybits[128], prodbits[256] ;
+
+/*
+ * based on Dan Bernstein's AES-GCM implementation,
+ * part of CAESAR test code http://competitions.cr.yp.to/caesar.html
+ *
+ * Bernstein's description:
+ *
+ *     a = (a + x) * y in the finite field
+ *     16 bytes in a
+ *     xlen bytes in x; xlen <= 16; x is implicitly 0-padded
+ *     16 bytes in y
+ */
+
+static void addmul(u8 *a, const u8 *x, u32 xlen, const u8 *y)
+{
+	int i, j;
+
+	for (i = 0;i < xlen;++i)
+		a[i] ^= x[i];
+	for (i = 0;i < 128;++i)
+		abits[i] = (a[i / 8] >> (7 - (i % 8))) & 1;
+	for (i = 0;i < 128;++i)
+		ybits[i] = (y[i / 8] >> (7 - (i % 8))) & 1;
+
+	memzero_explicit( prodbits, 256 ) ;
+	for (i = 0;i < 128;++i)
+		for (j = 0;j < 128;++j)
+			prodbits[i + j] ^= abits[i] & ybits[j];
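+
+	/*
+	 * reduce the 256-bit product modulo the GCM field
+	 * polynomial x^128 + x^7 + x^2 + x + 1; each bit at
+	 * position i+128 folds into positions i, i+1, i+2 and i+7
+	 */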
+	for (i = 127;i >= 0;--i)			{
+		prodbits[i] ^= prodbits[i + 128];
+		prodbits[i + 1] ^= prodbits[i + 128];
+		prodbits[i + 2] ^= prodbits[i + 128];
+		prodbits[i + 7] ^= prodbits[i + 128];
+		prodbits[i + 128] ^= prodbits[i + 128];
+	}
+
+	zero128( a ) ;
+	for (i = 0;i < 128;++i)
+		a[i / 8] |= (prodbits[i] << (7 - (i % 8)));
+}
+
+/*
+ * Bernstein's code has prodbits[], abits[] and ybits[] as locals 
+ * We make them global so this function can clear them
+ *
+ * With them as locals we could
+ * 	either clear them for every addmul() call (expensive)
+ *	or not clear them at all (possible, though minor, security risk)
+ * better to use globals, clear them at end of sequence
+ */
+static void clear_addmul(void)
+{
+	memzero_explicit( prodbits, 256 ) ;
+	memzero_explicit( abits, 128 ) ;
+	memzero_explicit( ybits, 128 ) ;
+}
+
+/*
+ * Mix n bytes into an accumulator using addmul()
+ *
+ * This is a keyed hash that takes nbytes of input, a 128-bit initial value
+ * and 128-bit key (the multiplier for addmul()), and gives a 128-bit output.
+ *
+ * This routine neither initialises the accumulator nor finalises the
+ * output. The expected calling sequence looks like this:
+ *
+ *     initialise accumulator (from some constant)
+ *     call this to mix in data (another constant is the multiplier)
+ *     optionally, repeat the call one or more times for other data
+ *     finalise output
+ *
+ * (a concrete sketch of this sequence follows the function)
+ *
+ * The main use here is against the various pools, replacing the hash
+ * previously used there. This should be faster and as secure, though
+ * speed needs testing & the security claim needs analysis.
+ *
+ * Note that it can be used with any data, and with a sequence of data
+ * chunks. In AES-GCM it is run over unencrypted headers so those can
+ * be authenticated along with the encrypted payload.
+ *
+ * Here it is run over counter[] as well as pool data so that outputs
+ * depend on a global piece of state, not just on one pool.
+ *
+ * It might also be run over any kernel data structure that is expected
+ * to be unpredictable to an enemy, giving extra entropy.
+ *
+ * It can also be run over anything that is expected to be different
+ *
+ *      on each machine (e.g. Ethernet MACs)
+ *      on each boot (clock data)
+ * or   on each read of /dev/urandom (process info for reader).
+ *
+ * Such data cannot be trusted for entropy; it may be unknown to some
+ * attackers, but we cannot rely on it being unknown to all. However it
+ * can still be useful in a role like that of salt in a hash; it makes
+ * brute force or table-driven attacks much harder.
+ */
+static void mix_in( u8 *data, u32 nbytes, u8 *mul, u32 *accum)
+{
+	u32 len, left ;
+	u8 *p ;
+	for( p = data, left = nbytes ; left != 0 ; p += len, left -= len)	{
+		len = (left >= 16) ? 16 : left ;
+		addmul( (u8 *) accum, p, len, mul ) ;
+	}
+}
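+
+/*
+ * A minimal usage sketch (illustration only; init, key and
+ * data are placeholders, not names used in this driver,
+ * and casts are omitted for brevity):
+ *
+ *	u32 accum[4] ;
+ *	memcpy( accum, init, 16 ) ;		initialise
+ *	mix_in( data, nbytes, key, accum ) ;	mix in data
+ *	xor128( accum, key ) ;			finalise
+ *
+ * mix_first() below follows this pattern, with r->A as the
+ * initialiser and r->B as the key.
+ */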
+
+/*
+ * Start of every output routine.
+ *
+ * The Schneier et al Yarrow rng design rekeys a counter mode
+ * block cipher from its own output every 10 blocks, to avoid
+ * giving an enemy a sequence of related values to work on.
+ *
+ * Here we have feedback into any non-dummy pool on every iteration,
+ * changing 8 pool words every time. If the pool is 4K bits, 128 words,
+ * then every word is changed after 16 iterations; in a smaller pool
+ * this happens sooner. That may be all the rekeying we need, but there
+ * is some mixing of the constants here to supplement it.
+ *
+ * The dummy pool (r->pool == NULL) gets no feedback into the pool, so
+ * we mix its constants more often.
+ *
+ * This routine never requests output from any pool to drive rekeying.
+ * That overhead would be excessive in a routine that is called for
+ * every output operation from any pool.
+ *
+ * AES-GCM authentication is
+ *
+ *     initialise accumulator all-zero
+ *     mix in data with multiplier H
+ *     xor in H before output
+ *
+ * Algorithm here is
+ *
+ *     maybe mix constants r->A and r->B
+ *     initialise accumulator from r->A
+ *     mix in data with multiplier r->B
+ *         counter[] for any pool
+ *         pool data for non-dummy pools
+ *     xor in r->B
+ *
+ * That finishes the first hash. For the dummy pool, we stop
+ * there and use that output.
+ *
+ * Some constants, both primes from list at:
+ * https://primes.utm.edu/lists/small/10000.txt
+ *
+ * ADJUST THESE FOR TUNING
+ * To test, I just use the first primes > 10, 100
+ *
+ * FREQUENCY	how often to mix constants for most pools
+ * FREQDUMMY	for dummy pool
+ */
+
+#define FREQUENCY	  101
+#define FREQDUMMY	   11
+
+static void mix_first( struct entropy_store *r, u32 *accum )
+{
+	u32 x ;
+	unsigned long flags ;
+
+	spin_lock_irqsave( &r->lock, flags ) ;
+	x = r->count++ ;
+	spin_unlock_irqrestore( &r->lock, flags ) ;
+
+	/*
+	 * sometimes mix constants before using them
+	 * do not zero the count
+	 * only buffer2array() does that
+	 */
+	if( r->pool != NULL)	{
+		if( (x%FREQUENCY) == 0 )
+			mix_const_p( r ) ;
+	}
+	else	{
+		if( (x%FREQDUMMY) == 0 )
+			mix_const_p( r ) ;
+	}
+
+	/* initialise the accumulator */
+	memcpy( (u8 *) accum, (u8 *) r->A, 16 ) ;
+
+	/* mix in the counter and update it */
+	addmul( (u8 *) accum, (u8 *) counter, 16, (u8 *) r->B) ;
+	count() ;
+
+	/* for non-dummy pools, mix in pool data */
+	if( r->pool != NULL )
+		mix_in( (u8 *) r->pool, r->size, (u8 *) r->B, accum ) ;
+
+	/*
+	 * finalise result
+	 * it depends on at least r->A, r->B and counter[]
+	 * for non-dummy pools, on pool contents as well
+	 */
+	xor128( accum, r->B ) ;
+
+	clear_addmul() ;
+}
+
+/*
+ * Last function in mixing sequence for any of 3 real pools
+ * Not used for dummy pool
+ *
+ * No locking needed in this function
+ * Caller need not hold locks either, & should not
+ *
+ * First, put feedback into the pool
+ *
+ *   save a copy of the 1st hash's result
+ *   feed the result back into pool
+ *
+ * Then do 2nd hash to get output different from the feedback
+ *
+ *   re-initialise accumulator from r->B
+ *   mix in saved data with multiplier r->A
+ *   xor in data to get output
+ *
+ * The constants are used differently in the two hashes. In
+ * mix_first(), A is the initialiser and B the multiplier.
+ * In the second hash here, they swap roles.
+ *
+ * In the first hash, the same constant is used twice, first
+ * as the multiplier in the finite field multiplication, then in
+ * an XOR. This is exactly the way that AES-GCM uses its
+ * constant H.
+ *
+ * AES-GCM has:    hash( data, all-0, H ) xor H
+ * our 1st hash:   hash( data,   A,   B ) xor B
+ * our 2nd hash:   hash( data,   B,   A ) xor data
+ *
+ * A well-known paper on building hashes from block ciphers,
+ * pretty much the standard reference on the topic, is:
+ * Preneel, Govaerts & Vandewalle
+ * https://www.cosic.esat.kuleuven.be/publications/article-48.ps
+ *
+ * It shows that some structures resist backtracking.
+ * They consider 64 possibilities and show that exactly
+ * 12 of them are secure. Both hashes here use structures
+ * from among that 12.
+ */
+
+static void mix_last( struct entropy_store *r, u32 *accum )
+{
+	u32 temp[4] ;
+
+	/*
+	 * for the dummy pool, this should not be called
+	 * if it is, there is nothing to do here
+	 */
+	if( r->pool == NULL )	{
+		pr_warn("random: mix_last() called for dummy pool\n" ) ;
+		return ;
+	}
+
+	/*
+	 * for any other pool, continue
+	 * save result for use in generating output
+	 */
+	memcpy( temp, accum, 16 ) ;
+
+	/* feed intermediate result back into pool */
+	buffer2pool( r, accum ) ;
+
+	/*
+	 * Apply another hash step to the saved value in temp[]
+	 * to create an output different from feedback
+	 */
+	memcpy( accum, r->B, 16 ) ;
+	addmul( (u8 *) accum, (u8 *) temp, 16, (u8 *) r->A) ;
+	xor128( accum, temp ) ;
+
+	clear_addmul() ;
+	zero128( temp ) ;
+}
+
+/*
+ * Input pool rekeys from external data and maybe hardware rng
+ * Blocking pool rekeys from the input pool before every output
+ * Dummy pool gets its constants changed when top_mix() is used.
+ *
+ * In mix_first() all pools sometimes mix their own constants
+ * and in mix_last() all non-dummy pools get feedback applied
+ * to their pool data. All pools are affected by the counter[]
+ * and by mix_const_all().
+ *
+ * The only place where rekeying needs more complex management
+ * is for the nonblocking pool
+ *
+ * The blocking pool generates only one /dev/random output
+ * each time it is reseeded. It appears safe to generate
+ * additional outputs to reseed the nonblocking pool; there is
+ * good mixing there so blocking pool output is not exposed to
+ * attack by this, except in a remarkably indirect way.
+ *
+ * The blocking pool is reseeded whenever /dev/random is
+ * used, so if it is used often, then the nonblocking pool
+ * will almost always be able to safely reseed from there.
+ *
+ * How many outputs can we safely take from a seeded pool?
+ * ======================================================
+ *
+ * Too large a value will be insecure, but it is not clear what
+ * "too large" means here. The question has been well studied
+ * for counter mode block ciphers, but the analysis does not
+ * apply directly here; at best it allows sensible guesses.
+ *
+ * For n-bit block size the Yarrow paper shows a generic attack
+ * for any counter mode block cipher after 2^(n/3) output blocks,
+ * about 2^42 for 128-bit block size, and one NIST document
+ * suggests an absolute upper limit of 2^48 for AES-CTR.
+ *
+ * Real applications generally use a much lower limit. Here I
+ * think a value for SAFE_OUT around 2^16 is the largest that
+ * could reasonably be considered, perhaps the prime (2^16)+1.
+ *
+ * However, using that seems unnecessary; a much lower value
+ * is enough to effectively decouple /dev/urandom and /dev/random.
+ * We want a low enough value that going over it sometimes when
+ * entropy is low will not be fatal.
+ *
+ * Even if /dev/random is not used, the nonblocking pool can reseed
+ * from the blocking pool SAFE_OUT times before it needs to reseed
+ * from a hardware rng or the input pool. Since it does SAFE_OUT
+ * output blocks per reseed, it can produce SAFE_OUT*SAFE_OUT blocks
+ * before it needs to reseed other than from the blocking pool.
+ *
+ * Using primes (just because), some possibilities are:
+ *
+ * with SAFE_OUT =   31,     almost   1,000 blocks
+ * with SAFE_OUT =  101,     over    10,000 blocks
+ * with SAFE_OUT =  331,     over   100,000 blocks
+ * with SAFE_OUT =  503,     over   250,000 blocks
+ * with SAFE_OUT = 1009,     over 1,000,000 blocks
+ * with SAFE_OUT = (2^16)+1, over      2^32 blocks
+ *
+ * Any sensible value for SAFE_OUT will greatly reduce load on the
+ * input pool when the nonblocking pool is heavily used.
+ */
+
+#define SAFE_OUT 503
+
+/* constants to test input pool entropy level */
+#define E_MINIMUM	1024
+#define E_PLENTY	(INPUT_POOL_WORDS*24)
+
+/*
+ * try to get 128 bits from a pool
+ * return 1 for success, 0 for failure
+ */
+static int get_or_fail( struct entropy_store *r, u32 *out )
+{
+	int got ;
+	u32 temp[4] ;
+	unsigned long flags ;
+
+	if( r == &input_pool )		{
+		spin_lock_irqsave( &r->lock, flags ) ;
+		if( (got = (ENTROPY_BITS(r) > E_MINIMUM)) )
+			credit_entropy_bits( r, -128 ) ;
+		spin_unlock_irqrestore( &r->lock, flags ) ;
+		if( got )		{
+			mix_first( r, out ) ;
+			mix_last( r, out ) ;
+			return 1 ;
+		}
+		else	return 0 ;
+	}
+	else if( (r == &blocking_pool) || (r == &nonblocking_pool) )	{
+		/*
+		 * need not lock here
+		 * going slightly over SAFE_OUT is not dangerous
+		 */
+		if( r->count < SAFE_OUT )	{
+			mix_first( r, out ) ;
+			mix_last( r, out ) ;
+			return 1 ;
+		}
+		else	return 0 ;
+	}
+	/*
+	 * dummy pool always succeeds
+	 * but may need rekeying first
+	 */
+	else if( r == &dummy_pool)	{
+		if( r->count >= SAFE_OUT )	{
+			get_any( temp ) ;
+			buffer2array( r, temp ) ;
+		}
+		mix_first( r, out ) ;
+		return 1 ;
+	}
+	else	{
+		pr_warn("random: get_or_fail() gets bad pool argument\n" ) ;
+		return 0 ;
+	}
+}
+
+/*
+ * get 128 bits from somewhere
+ * always succeeds, but may not always give good data
+ *
+ * return value indicates data source
+ * 1 = input, 2 = blocking, 3 = nonblocking
+ * 4 = dummy, 5 = hw rng
+ */
+static int get_any( u32 *out )
+{
+	int got ;
+	struct entropy_store *r ;
+	unsigned long flags ;
+
+	/*
+	 * use the input pool if it has plenty
+	 * of entropy
+	 *
+	 * unlike get_or_fail(), this function
+	 * does not test for > E_MINIMUM
+	 * so it avoids depleting input entropy
+	 * except when there is plenty
+	 */
+	r = &input_pool ;
+	spin_lock_irqsave( &r->lock, flags ) ;
+	if( (got = (ENTROPY_BITS(r) > E_PLENTY)) )
+		credit_entropy_bits( r, -128 ) ;
+	spin_unlock_irqrestore( &r->lock, flags ) ;
+	if( got )	{
+		mix_first( r, out ) ;
+		mix_last( r, out ) ;
+		return 1 ;
+	}
+
+	/*
+	 * this is likely to be the most common case
+	 * & should usually succeed
+	 */
+	if( get_or_fail( &blocking_pool, out ) )
+		return 2 ;
+
+	/*
+	 * hw rng may not be fully trusted,
+	 * but it is fine as a fallback here
+	 */
+	if( get_hw_random( out ) )	{
+		/*
+		 * if we reach here, hw rng works
+		 * but input pool is not close to full
+		 * so try to refill it
+		 */
+		load_input() ;
+		return 5 ;
+	}
+
+	/* reaching here should be rare; do what we can */
+	if( get_or_fail( &nonblocking_pool, out ) )
+		return 3 ;
+
+	/* dummy pool always succeeds */
+	get128( &dummy_pool, out ) ;
+	return 4 ;
+}
+
+/*
+ * get 128 bits from a pool
+ * for input or blocking pool, this may block
+ * for dummy or nonblocking, it will not
+ */
+
+static u32 rekey_flip_flop = 0 ;
+
+static void get128( struct entropy_store *r, u32 *out )
+{
+	u32 temp[4] ;
+	unsigned long flags ;
+
+	/*
+	 * get_or_fail( r, out ) cannot be used here
+	 * pool must be rekeyed before output
+	 */
+	if( r == &blocking_pool )	{
+		/*
+		 * try non-blocking function first
+		 * if it fails, use blocking function
+		 */
+		if( !get_or_fail( &input_pool, temp ) )
+			get128( &input_pool, temp ) ;
+		/*
+		 * one way or the other, we have data, so reseed
+		 * r->count is reset in buffer2array()
+		 */
+		buffer2array( r, temp ) ;
+
+		/* produce output */
+		mix_first( r, out ) ;
+		mix_last( r, out ) ;
+		return ;
+	}
+
+	/*
+	 * for any pool except blocking
+	 * see if pool is ready for output
+	 * dummy pool is always ready
+	 */ 
+	if( get_or_fail( r, out) )	{
+		return ;
+	}
+
+	/*
+	 * nonblocking pool not ready
+	 * rekey it, without blocking
+	 */
+	if( r == &nonblocking_pool )	{
+		/*
+		 * First choice is to rekey from blocking pool
+		 * This should very often succeed
+		 * else non-blocking function that always succeeds
+		 */
+		if( !get_or_fail(&blocking_pool, temp) )
+			(void) get_any( temp ) ;
+		/*
+		 * one way or the other, we have data, so reseed
+		 * r->count is reset in buffer2array()
+		 */
+		buffer2array( r, temp ) ;
+		/*
+		 * Do some extra mixing
+		 *
+		 * Rekeying is infrequent enough (once
+		 * every SAFE_OUT blocks) that we can
+		 * afford a somewhat expensive mix here
+		 *
+		 * constants[] has 10 128-bit rows
+		 * 8 for pool constants, 2 for counter[]
+		 *
+		 * mix_const_all() mixes first 8 rows
+		 * top_mix() mixes last 4
+		 * they overlap so all 10 get mixed
+		 * if both are used
+		 */
+		if( rekey_flip_flop )	{
+			/*
+			 * Mix all the pool constants
+			 * so the rekey affects all pools
+			 * This is the only full mix except
+			 * during initialisation
+			 */
+			mix_const_all() ;
+			rekey_flip_flop = 0 ;
+		}
+		else	{
+			/*
+			 * mix counter[]
+			 * and constants for dummy pool  
+			 */
+			top_mix() ;
+			rekey_flip_flop = 1 ;
+		}
+
+		/* produce output */
+		mix_first( r, out ) ;
+		mix_last( r, out ) ;
+		return ;
+	}
+
+	if( r == &input_pool )	{
+		/* pool entropy is low, so try hw rng */
+		if( !load_input() )	{
+			/* no hw rng, toss in something */
+			(void) get_any( temp ) ;
+			buffer2pool( r, temp ) ;
+		}
+
+		/*
+		 * ADD CODE HERE
+		 * adapt code from current driver
+		 * needs to block sometimes
+		 * and deal with entropy_count
+		 */
+		spin_lock_irqsave( &r->lock, flags ) ;
+		credit_entropy_bits( r, -128 ) ;
+		spin_unlock_irqrestore( &r->lock, flags ) ;
+
+		/* produce output */
+		mix_first( r, out ) ;
+		mix_last( r, out ) ;
+		return ;
+	}
+}
+
+/*****************************************************************
+ * loop to fill an output buffer with data
+ * for input or blocking pool, this may block
+ *****************************************************************/
+
+static void loop_output( struct entropy_store *r, u32 *out, u32 nbytes )
+{
+	u32 temp[4] ;
+	int n, m ;
+	u8 *p ;
+
+	/*
+	 * for pools that may block, try to avoid it
+	 * fill input pool from hw rng if available
+	 */
+	if( got_hw_rng && ((r == &input_pool) || (r==&blocking_pool)) )
+		load_input() ;
+
+	/*
+	 * Ensure that each call to this function will start
+	 * a new output stream which is almost independent
+	 * of previous streams. For a rationale, see the
+	 * Fortuna paper by Schneier et al.
+	 */
+	counter_any() ;
+
+	/*
+	 * ADD CODE HERE?
+	 *
+	 * For /dev/urandom accesses, we could mix in process
+	 * info for the reading process, by applying addmul()
+	 * to the task_struct to mix it into counter[] or
+	 * into the constants
+	 *
+	 * This depends on a different aspect of the system than
+	 * anything else in the driver, namely the order in which
+	 * user processes ask for data and the current state of
+	 * those processes.
+	 *
+	 * Except perhaps on simple embedded systems, this should
+	 * be hard to guess. It should be impossible to monitor
+	 * unless the attacker is logged into the system or has
+	 * left a background process running on it. Even then,
+	 * monitoring it would not be easy.
+	 */
+
+	for( n = nbytes, p = (u8 *) out ; n > 0 ; n -= m, p += m )	{
+		m = (n >= 16) ? 16 : n ;
+		get128( r, temp ) ;
+		memcpy( p, (u8 *) temp, m) ;
+	}
+	zero128( temp ) ;
+}
+
+/******************************************************************
+ * Mixing into pool data
+ *
+ * This routine is used only to mix data into the pool itself,
+ * for feedback in mix_last()
+ *
+ *   Output operations from any pool use the hashing parts of
+ *   mix_last(), not this code.
+ *
+ *   For rekeying, buffer2array() is preferred over this; change a
+ *   constant rather than pool data. The effects are more easily
+ *   analysed, and more general since changing a constant always
+ *   affects the pool but not vice versa.
+ *
+ * Use this only for data known to be (or at least appear)
+ * highly random
+ *
+ *      hardware RNG data
+ *      hash output
+ *      cipher output (not used here)
+ *
+ * Input mixing should NOT use this; existing driver code is far
+ * better for low-to-medium entropy inputs. Existing code is OK
+ * for high-entropy inputs as well, though it appears to have been
+ * designed for the low entropy case.
+ *
+ * I added this in hopes it would be faster, and easier to analyze
+ * in the high-entropy case. Also, using two different mixers gives
+ * insurance if either has some unknown weakness.
+ *******************************************************************/
+
+/*
+ * Mix a 128-bit buffer into a pool, changing 8 32-bit pool words
+ * All buffer2*() routines zero the input data after using it
+ *
+ * This does not reset r->count; only buffer2array() does that
+ * Nor does it change r->entropy_count
+ *
+ * Eventually this stirs the entire pool, making every pool word
+ * depend both on every other pool word and on many external inputs.
+ * This is the only stirring the output pools get, except during
+ * initialisation.
+ */
+static void buffer2pool( struct entropy_store *r, u32 *buff)
+{
+	u32 *a, *b ;
+	unsigned long flags ;
+
+	/* normal case, real pool */
+	if( r->pool != NULL )	{
+		spin_lock_irqsave( &r->lock, flags ) ;
+		a = r->p ;
+		b = r->q ;
+		/* mix a[] and add new data */
+		a[0] = ROTL( a[0], 5 ) ;
+		xor128( a, buff ) ;
+		pht128( a ) ;
+		/* mix b[] */
+		aria_mix( (u8 *) b ) ;
+		/* PHTs between rows */
+		add128( a, b ) ;
+		add128( b, a ) ;
+		/* update pointers */
+		r->p += 4 ;
+		if( r->p >= r->end )
+			r->p = r->pool ;
+		r->q += 4 ;
+		if( r->q >= r->end )
+			r->q = r->pool ;		
+		spin_unlock_irqrestore( &r->lock, flags ) ;
+		zero128( buff ) ;
+	}
+	/*
+	 * if called for dummy pool, which should not happen
+	 * there is no pool to mix to
+	 * so mix to constants instead
+	 */
+	else	{
+		buffer2array( r, buff ) ;
+		pr_warn("random: buffer2pool() called for dummy pool\n" ) ;
+	}
+}
+
+/*********************************************************
+ * initialise counter & output pools
+ *
+ * This should not be done until there is enough (256 bits?)
+ * entropy in the input pool.
+ *
+ * This code does not deal with that problem!
+ * FIX BEFORE USING
+ ********************************************************/
+
+/* how many 128-bit chunks to mix into a pool */
+#define HOW_MANY	4
+
+static void init_random(void)
+{
+	u32 temp[4], *x, *y ;
+	int j ;
+	struct entropy_store *i, *b, *n, *d ;
+	ktime_t now ;
+
+	i = &input_pool ;
+	b = &blocking_pool ;
+	n = &nonblocking_pool ;
+	d = &dummy_pool ;
+
+	/* counter_lock and constants_lock are statically initialised */
+
+	/*
+	 * fill input pool from hardware rng if possible
+	 * if that works, mix hw data into constants as well
+	 */
+	if( load_input() )
+		(void) load_constants() ;
+
+	/*
+	 * ADD CODE HERE?
+	 *
+	 * If data from kernel command line is available,
+	 * mix it into counter[] or input pool before doing
+	 * anything else. Either way, it will then affect
+	 * all future operations
+	 *
+	 * Simplest: XOR 256 bits into 8 words of counter[]
+	 * or with exactly 128, call buffer2counter()
+	 */
+ 
+	mix_first( i, temp ) ;
+
+	/*
+	 * Existing code to get data for the input pool uses timer
+	 * information. So do programs like my maxwell(8), Stephan
+	 * Mueller's jitter driver or Havege. Most of my code here
+	 * therefore does not use timings since that entropy is
+	 * already accounted for. There are two exceptions:
+	 *
+	 * buffer2counter() mixes in jiffies
+	 *
+	 * Here timer info is added so initialisation is a bit
+	 * different each time. Nowhere near enough entropy
+	 * to make things secure by itself, but better than
+	 * nothing.
+	 */
+	now = ktime_get_real() ;
+	mix_in( (u8 *) &now, sizeof(now), (u8 *) i->B, temp) ;
+
+	mix_in( (u8 *) utsname(), sizeof(*(utsname())), (u8 *) i->B, temp) ;
+
+	/*
+	 * ADD CODE HERE
+	 *
+	 * Mix static info into temp[]
+	 * things that can act as salt
+	 * 
+	 * These need not be unpredictable
+	 * just different on different systems
+	 * e.g. ethernet MAC, other hardware info.
+	 *
+	 * Existing code uses utsname(). That, and more
+	 * if possible, should be added here.
+	 */
+
+	mix_last( i, temp ) ;
+
+	/*
+	 * Use that first result to re-initialise the counter
+	 * This will affect all future outputs from any pool
+	 *
+	 * Provided enough entropy is present before this,
+	 * from any of:
+	 *	data in random_init.h
+	 *	kernel command line
+	 *	input to pool before this runs
+	 * this makes the counter unknowable to an enemy
+	 *
+	 * All future outputs, including the ones that
+	 * rekey pools below, depend on the counter
+	 */
+	buffer2counter( temp ) ;
+
+	/*
+	 * mix data into the output pools
+	 * try to get from input pool first
+	 * else from dummy pool which never blocks
+	 *
+	 * don't use get_any() yet; its only advantage
+	 * over just using dummy pool is that it might
+	 * get from output pools, but that is much more
+	 * expensive and output pools are not fully
+	 * initialised yet 
+	 */
+	for( j = 0, x=n->pool, y=b->pool ; j < HOW_MANY ; j++, x+=4, y+=4 )	{
+		if( !get_or_fail(i, temp) )
+			get128( d, temp) ;
+		spin_lock( &n->lock) ;
+		xor128( x, temp ) ;
+		spin_unlock( &n->lock) ;
+		spin_lock( &b->lock) ;
+		add128( y, temp ) ;
+		spin_unlock( &b->lock) ;
+	}
+	/* now get_any() and constants_any() can safely be used */
+
+	/*
+	 * refill input pool from hardware rng if possible
+	 * if that works, mix hw data into constants as well
+	 */
+	if( load_input() )	{
+		(void) load_constants() ;
+	}
+	else	{
+		/*
+		 * update counter[] and constants for dummy pool
+		 * before using them
+		 */
+		top_mix() ;
+		/*
+		 * mix pseudorandom bits into input pool
+		 * use cheap non-blocking source, dummy pool
+		 */
+		for( j = 0, x=i->pool ; j < HOW_MANY ; j++, x+=4 )	{
+			get128( d, temp ) ;
+			add128( x, temp ) ;
+		}
+		/*
+		 * mix random data into constants[]
+		 * use best available data
+		 */
+		(void) get_any( temp ) ;
+		buffer2array( i, temp );
+		(void) get_any( temp ) ;
+		buffer2array( n, temp );
+		(void) get_any( temp ) ;
+		buffer2array( b, temp );
+		(void) get_any( temp ) ;
+		buffer2array( d, temp );
+	}
+	/* Mix constants[] and both output pools */
+	big_mix() ;
+
+	/* output should use a different counter[] value */
+	counter_any() ;	
+}
+
+/*****************************************************************
+ * 128-bit counter to mix in when hashing
+ *
+ * There is only one counter[] and three functions to update it,
+ * count() to iterate it, buffer2counter() or counter_any()
+ * to re-initialise it with a new starting value
+ * 
+ * mix_first() uses counter[] and calls count(), so the count both
+ * affects and is affected by all output operations on any pool.
+ *
+ * Operations on this counter do not affect the per-pool counts
+ * for any pool, neither the entropy count nor the r->count
+ * iteration counter.
+ *
+ * One reason for including the counter is that it allows fast
+ * initialisation. The very first output from the input pool is
+ * used to update the counter. Once that is done, even if the
+ * pools were all worthless, every output operation would still
+ * have at least the strength of hash(constants, counter) which
+ * is very roughly equivalent to a counter mode block cipher
+ * encrypt(key,counter).
+ *
+ * mix_first() mixes in the counter so it affects all output from
+ * any pool and all feedback into any pool. Every operation on any
+ * pool changes the counter, so it automatically influences all the
+ * other pools, albeit in an indirect and quite limited way.
+ *
+ * This can contribute to recovery after an rng state compromise.
+ * Even knowing the counter value at one time an enemy cannot infer
+ * the future effects unless he can predict the order of future
+ * output operations, which depends on data requests from all sources.
+ * Nor can he work backwards to get previous outputs unless he knows
+ * the order of previous operations.
+ *
+ * This may provide almost no protection on a simple embedded system
+ * or over a very short time span, since in those cases an enemy
+ * might guess the sequence of operations or search through some
+ * moderate number of possibilities. However it should be quite
+ * effective for more complex systems and longer time spans. 
+ ****************************************************************/
+
+static u32 iter_count = 0 ;
+static u32 loop_count = 0 ;
+
+/*
+ * 41 times 251 iterations per loop
+ * gives about 10,000 outputs before auto-rekey
+ */
+#define MAX_LOOPS 41
+
+/* constant from SHA-1 */
+#define COUNTER_DELTA 0x67452301
+
+/*
+ * Code is based on my own work in the Enchilada cipher:
+ * https://aezoo.compute.dtu.dk/doku.php?id=enchilada
+ * That implements a 128-bit counter in 4 32-bit words
+ *
+ * Here counter[] is declared as 8 words; the others
+ * are used only during updates, in buffer2counter()
+ *
+ * Add a constant instead of just incrementing, and include some
+ * other operations, so Hamming weight changes more than for a
+ * simple counter. Mix +, XOR and rotation so it is nonlinear.
+ *
+ * This may not be strictly necessary, but a simple counter can
+ * be considered safe only if you trust the crypto completely.
+ * Low Hamming weight differences in inputs do allow some attacks
+ * on block ciphers or hashes and the high bits of a large counter
+ * that is only incremented do not change for aeons.
+ *
+ * The extra code here is cheap insurance.
+ *
+ * For discussion, see mailing list thread starting at:
+ * http://www.metzdowd.com/pipermail/cryptography/2014-May/021345.html
+ */
+
+static void count(void)
+{
+	int reseed ;
+	unsigned long flags ;
+
+	/*
+	 * There should be enough other rekeying that
+	 * this is quite rare. This is just here for
+	 * safety, much as IPsec rekeys after 2^32
+	 * blocks if no other rekeying is done.
+	 */
+	spin_lock_irqsave( &counter_lock, flags ) ;
+	reseed = (loop_count >= MAX_LOOPS) ;
+	spin_unlock_irqrestore( &counter_lock, flags ) ;
+	if( reseed )
+		counter_any() ;
+
+	spin_lock_irqsave( &counter_lock, flags ) ;
+
+	/*
+	 * Limit the switch to < 256 cases
+	 * should work with any CPU & compiler
+	 *
+	 * Five constants used, all primes
+	 * roughly evenly spaced, around 50, 100, 150, 200, 250
+	 */
+	switch( iter_count )	{
+		/*
+		 * mix three array elements
+		 * each element is used twice
+		 * once on left, once on right
+		 * pattern is circular
+		 * order chosen for fast mixing
+		 */
+		case 47:
+			counter[1] += counter[3] ;
+			break ;
+		case 101:
+			counter[2] += counter[1] ;
+			break ;
+		case 197:
+			counter[3] += counter[2] ;
+			break ;
+		/*
+		 * inject counter[0] into that loop
+		 * the loop and counter[0] use +=
+		 * so use ^= here
+		 *
+		 * inject into counter[2]
+		 * so case 197 starts spreading the effect
+		 */
+		case 149:
+			counter[2] ^= counter[0] ;
+			break ;
+		/*
+		 * restart loop
+		 * throw in rotations for nonlinearity
+		 */
+		case 251:
+			counter[1] = ROTL( counter[1], 3) ;
+			counter[2] = ROTL( counter[2], 7) ;
+			counter[3] = ROTL( counter[3], 13) ;
+			iter_count = 0 ;
+			loop_count++ ;
+			break ;
+		/*
+		 * for 247 out of every 252 iterations
+		 * the switch does nothing
+		 */ 
+		default:
+			break ;
+	}
+	/*
+	 * counter[0] is purely a counter
+	 * nothing above affects it
+	 * uses += instead of ++ to change Hamming weight more
+	 *
+	 * would repeat after 2^32 iterations
+	 * not a problem since the rest of counter[] changes too
+	 * and 2^32 will not be reached
+	 */
+	counter[0] += COUNTER_DELTA ;
+	iter_count++ ;
+
+	spin_unlock_irqrestore( &counter_lock, flags ) ;
+}
+
+/*
+ * code to set a new counter value
+ *
+ * All buffer2*() routines
+ *    expect 128 bits of input
+ *    zero the input data after using it
+ */
+static void buffer2counter( u32 *data )
+{
+	unsigned long flags ;
+
+	spin_lock_irqsave( &counter_lock, flags ) ;
+	/*
+	 * timing data is used elsewhere in driver
+	 * and we do not want an expensive operation
+	 * here, so use simplest thing that makes
+	 * every call different
+	 */
+	counter[0] ^= jiffies ;
+	/*
+	 * mix all 8 words in counter[] array
+	 * this and top_mix() are the only things
+	 * that change the high 4 words
+	 */
+	pht256( counter ) ;
+	/*
+	 * input data mixed into low 4 words of counter[]
+	 * which are the actual 128-bit counter
+	 *
+	 * high 4 words are multiplier in GCM mixing
+	 * this is the only place they are used
+	 */
+	addmul( (u8 *) counter, (u8 *) data, 16, (u8 *) (counter+4) ) ;
+	/*
+	 * make the mixing non-invertible
+	 * see reference to Preneel et al. in comment for mix_last()
+	 */
+	xor128( counter, data ) ;
+
+	loop_count = 0 ;
+	iter_count = 0 ;
+
+	spin_unlock_irqrestore( &counter_lock, flags ) ;
+
+	zero128( data ) ;
+	clear_addmul() ;
+}
+
+static void counter_any(void)
+{
+	u32 temp[4] ;
+	(void) get_any( temp ) ;
+	buffer2counter( temp ) ;
+}
+
+/****************************************************************
+ * Code to deal with hardware RNG, if any
+ *
+ * get_hw_random() just puts 128 bits from hw rng in a buffer
+ *
+ * load_input() makes sure that, if we have a hardware rng, then the
+ * input pool is well supplied with data
+ *
+ * Absent an rng instruction, these functions would be the logical
+ * place to add data from something else, such as a hardware rng
+ * accessed via a driver rather than an instruction (Turbid, or an
+ * on-board or plug-in device) or something using timing data such
+ * as Havege or Stephan Mueller's jitter. There is no code for that
+ * here yet.
+ *
+ * Both get_hw_random() and load_input() set got_hw_rng and return
+ * a value for success/failure. If all arch_get_random_long() calls
+ * succeed, both got_hw_rng and the return are 1; if any call fails
+ * both are 0
+ *
+ * Code calling those functions can either check got_hw_rng and
+ * avoid the call if it is 0 or just make the call unconditionally
+ * and let the function set got_hw_rng. 
+ ***********************************************************************/
+
+/*
+ * How much do we trust the hardware?
+ * 0-32 for entropy credit per 32-bit word
+ *
+ * arbitrary number here for testing
+ * NEEDS TO BE SET MORE CAREFULLY
+ * may need #ifdef for architecture-specific value
+ */
+#define	TRUST32		25
+
+/*
+ * check for out-of-bounds values, allowing only values 1-31
+ * a value of 0 would be senseless
+ * 32 is too trusting for any real device
+ */
+#if (TRUST32 < 1) || (TRUST32 > 31)
+#error Out-of-bounds setting for TRUST32
+#endif
+
+/*
+ * fill a 128-bit buffer with hw rand data
+ * only used by routines in this section
+ * other code calls those, not this, since
+ * the higher-level routines do more
+ */
+static inline int hw2buff( u32 *out )
+{
+	int i ;
+	unsigned long *p ;
+
+	/*
+	 * fill exactly 16 bytes: two longs on 64-bit
+	 * systems, four on 32-bit
+	 */
+	for( i = 0, p = (unsigned long *) out ; i < (int) (16/sizeof(unsigned long)) ; i++, p++ )
+		if( !arch_get_random_long( p ) )
+			return 0 ;
+	return 1 ;
+}
+
+/* put 128 bits into a buffer, set got_hw_rng */
+static int get_hw_random( u32 *out )
+{
+	int ret ;
+	ret = hw2buff( out ) ;
+	got_hw_rng = ret ;
+	return ret ;
+}
+
+/* (approximately) fill the input pool with hw rng data */
+
+static u32 *next_word = pools ;
+
+static int load_input(void)
+{
+	struct entropy_store *r ;
+	u32 temp[4], *end_buffer ;
+	int i, n, ret, limit, e_count ;
+	unsigned long x, flags ;
+
+	r = &input_pool ;
+
+	/*
+	 * deliberately somewhat imprecise calculation
+	 * we need not exactly fill the pool
+	 *
+	 * no lock here; we are just reading values
+	 * and an error will not do real harm
+	 */
+	n = (r->poolinfo->poolbits - ENTROPY_BITS(r)) / 128 ;
+
+	/*
+	 * if pool is not full
+	 * loop to put data into the pool itself
+	 * this does need the lock
+	 */
+	if( n > 0 )			{
+		limit = n*4 ;
+		end_buffer = r->pool + INPUT_POOL_WORDS ;
+		spin_lock_irqsave( &r->lock, flags ) ;
+		for( i = e_count = 0, ret = 1 ; ret && (i<limit) ; i++, next_word++ )	{
+			if( next_word >= end_buffer )
+				next_word = r->pool ;
+			if( (ret = arch_get_random_long( &x )) )	{
+				*next_word ^= x ;
+				e_count += TRUST32 ;
+			}
+		}
+		credit_entropy_bits( r, e_count ) ;
+		spin_unlock_irqrestore( &r->lock, flags ) ;
+	}			
+	/*
+	 * if pool is near full, change its constants
+	 * no loop, just do 128 bits
+	 */
+	else if( (ret = hw2buff(temp)) )	{
+		buffer2array( r, temp ) ;
+	}
+	got_hw_rng = ret ;
+	return ret ;
+}
+
+/* update all constants with data from hw rng if possible */
+static int load_constants(void)
+{
+	int i, ret ;
+	u32 *p ;
+	unsigned long x, flags ;
+
+	spin_lock_irqsave( &constants_lock, flags ) ;
+	for( i = 0, p = constants, ret = 1 ; ret && (i < ARRAY_WORDS) ; i++, p++ )	{
+		if( (ret = arch_get_random_long( &x )) )
+			*p ^= x ;
+	}
+	spin_unlock_irqrestore( &constants_lock, flags ) ;
+	got_hw_rng = ret ;
+	return ret ;
+}
-- 
2.5.0
