Date:	Tue, 10 Jun 2014 12:14:01 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Jörn Engel <joern@...fs.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] random: mix all saved registers into entropy pool

On Mon, May 19, 2014 at 05:17:19PM -0400, Jörn Engel wrote:
> +/*
> + * Ratelimit to a steady state of about once per jiffy.  A naïve approach
> + * would be to return 1 every time jiffies changes.  But we want to avoid
> + * being closely coupled to the timer interrupt.  So instead we increment
> + * a counter on every call and shift it right every time jiffies changes.
> + * If the counter is a power of two, return false.
> + *
> + * Effect is that some time after a jiffies change and cutting the counter
> + * in half we reach another power of two and return false.  But the
> + * likelihood of this happening is about the same at any time within a
> + * jiffies interval.
> + */
> +static inline int ratelimited(struct fast_pool *p)
> +{
> +	int ret = !(p->regs_count == 0 || is_power_of_2(p->regs_count));
> +
> +	p->regs_count++;
> +	if (p->last_shift != (u32)jiffies) {
> +		p->regs_count >>= min(31u, (u32)jiffies - p->last_shift);
> +		p->last_shift = (u32)jiffies;
> +	}
> +	return ret;
> +}

I wasn't convinced this would do the right thing, so I wrote a quick
test program where the main loop was basically this:

	for (i=0; i < 1024; i++) {
		jiffies = i >> 2;
		r = ratelimited(jiffies);
		printf("%5u %5u %5u %d\n", i, jiffies, regs_count, r);
	}

... which basically simulated a very simple scheme where there were
four interrupts for each clock tick.  In the steady state, ratelimited()
returns true 75% of the time.  If this were working as documented, we
would expect it to return true 25% of the time.  So I don't think this
is working quite right:

   20     5     3 1
   21     5     4 1
   22     5     5 0
   23     5     6 1
   24     6     3 1
   25     6     4 1
   26     6     5 0
   27     6     6 1
   28     7     3 1
   29     7     4 1
   30     7     5 0
   31     7     6 1
   32     8     3 1
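
For completeness, here is a self-contained version of that harness
(reconstructed, so the details are an assumption: the fast_pool fields
are hoisted into globals and jiffies is passed in as an argument, to
match the loop above):

	#include <stdio.h>

	typedef unsigned int u32;

	/* fast_pool state, hoisted into globals for the standalone test */
	static u32 regs_count;
	static u32 last_shift;

	static int is_power_of_2(u32 n)
	{
		return n != 0 && (n & (n - 1)) == 0;
	}

	static int ratelimited(u32 jiffies)
	{
		int ret = !(regs_count == 0 || is_power_of_2(regs_count));

		regs_count++;
		if (last_shift != jiffies) {
			u32 delta = jiffies - last_shift;

			regs_count >>= (delta > 31 ? 31 : delta);
			last_shift = jiffies;
		}
		return ret;
	}

	int main(void)
	{
		u32 i, jiffies;
		int r;

		for (i = 0; i < 1024; i++) {
			jiffies = i >> 2;	/* four interrupts per tick */
			r = ratelimited(jiffies);
			printf("%5u %5u %5u %d\n", i, jiffies, regs_count, r);
		}
		return 0;
	}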


> +static void mix_regs(struct pt_regs *regs, struct fast_pool *fast_pool)
> +{
> +	struct entropy_store	*r;
> +	/* Two variables avoid decrementing by two without using atomics */
> +	static int boot_count = BOOT_IRQS;
> +	int in_boot = boot_count;
> +
> +	if (in_boot) {
> +		boot_count = in_boot - 1;
> +	} else if (ratelimited(fast_pool))
> +		return;
> +
> +	/* During bootup we alternately feed both pools */
> +	r = (in_boot & 1) ? &nonblocking_pool : &input_pool;
> +	__mix_pool_bytes(r, regs, sizeof(*regs), NULL);
> +}

I'm also concerned about how much overhead this might eat up.  I've
already had someone benchmarking a high performance storage array
notice the increased interrupt latency even before we add something
like this, and kvetch to me about it.  The pt_regs structure is going
to be larger than the fast_pool (which is only 16 bytes), and we're
going to be doing this once every jiffy, which means between 100 and
1000 times per second.
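
(For a sense of scale: on x86-64, for instance, sizeof(struct pt_regs)
is 168 bytes, so at HZ=1000 that is on the order of 168KB/sec pushed
through __mix_pool_bytes(), compared to 16 bytes per mix for the
fast_pool.)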

If we do this only during boot up, I'd be much less concerned, but
doing it all the time is making me a bit nervous about the overhead.

If we knew which parts of the pt_regs were actually the "best" in
terms of entropy, we could xor that into the input[] array in
add_interrupt_randomness(), perhaps?  
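
Something along these lines, perhaps (arch_regs_entropy() is purely
hypothetical -- no such per-architecture helper exists, and the
fallback below is only a placeholder):

	/*
	 * Hypothetical per-arch helper: return the word of pt_regs that
	 * this architecture considers hardest to predict.
	 */
	static inline __u32 arch_regs_entropy(struct pt_regs *regs)
	{
		if (!regs)
			return 0;
		return (__u32)instruction_pointer(regs);	/* placeholder */
	}

and then in add_interrupt_randomness():

	input[3] = (ip >> 32) ^ arch_regs_entropy(regs);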


> -	input[0] = cycles ^ j_high ^ irq;
> -	input[1] = now ^ c_high;
> +	input[0] ^= cycles ^ j_high ^ irq;
> +	input[1] ^= now ^ c_high;

This is an unrelated change, so if you want to do this, let's discuss
it as a separate commit.  We used to have something like this, but it
caused static code checkers to complain about our using stack garbage,
and people argued convincingly that it didn't really add much
unpredictability, especially when correlated with the ip field.

(And yes, I know that instruction_pointer(regs) doesn't work on all
platforms, and _RET_IP_ becomes largely static --- but this is why I'd
much rather have each architecture tell me which parts of regs were
actually the best, and use that as the initialized portions of the
input[] array.)

Cheers,

					- Ted