netdev - Re: Revising prandom

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <0ce9983d-f257-4808-bd84-4385587e1b41@www.fastmail.com>
Date:   Wed, 27 Mar 2019 14:32:55 -0400
From:   "Hannes Frederic Sowa" <hannes@...essinduktion.org>
To:     "George Spelvin" <lkml@....org>,
        "Daniel Borkmann" <daniel@...earbox.net>
Cc:     netdev@...r.kernel.org
Subject: Re: Revising prandom_32 generator

On Tue, Mar 26, 2019, at 20:07, George Spelvin wrote:
> On Tue, 26 Mar 2019 at 14:03:55 -0400, Hannes Frederic Sowa wrote:
> > On Tue, Mar 26, 2019, at 12:10, George Spelvin wrote:
> > The conversation definitely makes sense.
> 
> I also fixed the seeding of prandom; it now uses the
> random_ready_callback system for seeding.
> 
> > Are you trying to fix the modulo biases? I think that prandom_u32_max
> > also has bias, would that be worth fixing as well?
> 
> I could if you like.  I submitted a patch (appended) to do that
> for /dev/urandom (using a very clever new algorithm that does it
> all with a single multiply in the common case!), but didn't think
> the prandom users were critical enough to care.

I am not sure. A quick look shows that it might get used for selecting timeouts in the networking stack. Should not matter here at all. If used to select an index in pre-allocated arrays, then effects might be visible.
Additionally, by adding all the branches it might make sense to downgrade the function from 'static inline' to regular function.

All in all, I don't think it is required and I would just add it if you think it is worth it as well. The additional comments you proposed are definitely worth to add.

> > I think tausworthe is not _trivially_ to predict, what about your
> > proposed algorithms? I think it is a nice to have safety-net in
> > case too much random numbers accidentally leaks (despite reseeding).
> 
> lfsr113 is indeed trivial to predict.  It's a 113-bit LFSR defined
> by a degree-113 polynomial.  (The implementation as four separate
> polynomials of degree 31, 29, 28 and 25 doesn't change this.)  Given
> any 113 bits of its output (not necessarily consecutive), that's
> 113 boolean linear equations in 113 unknowns to find the internal
> state.
> 
> I don't have PoC code, but Gaussian elimination is not exactly
> rocket science.

Fair enough; the attack vector I had in mind was solely that some random output could be observed from the kernel via prandom_u32() and the security bar I wanted to beat was a PRNGs which outputs its entire state or a considerable amount of them. As you wrote, your proposed RNGs exhibit more non-linearity, so it is time to switch. :) I wasn't sure if the non-consecutive extraction of output would blow up the Gaussian Elimination at some point (as in taking a few hours to reduce), which I would not consider trivial anymore.

Anyway, using those functions in those situations would already be a bug, as Stephen correctly said, but some safety net would be useful (e.g. observing timeouts from a box right after boot up to determine the initial state of the fragmentation IDs would be bad - but early reseeding etc. makes that already difficult).

> All of the proposed replacements are considerably less linear and
> harder to invert.  The xoshiro family are the most linear,
> with an LFSR core and only a very simple non-linear state masking.

Then all of the proposed prngs are an improvement? Definitely go for it.

> Here's the patch mentioned above for unbiased range reduction.
> 
> I just thought of a great improvement!  The slow computation
> is "-range % range".  I could use __buitin_constant_p(range)
> to decide between the following implementation and one that
> has -range % range pre-computed and passed in.  I.e.

Heh, if you already made the work and maybe benchmarked it, it might make sense to kill the bias in prandom_u32_max anyway? :) The proposal below looks great!

> static inline u32 get_random_max32(u32 range)
> {
> 	if (__builtin_constant_p(range))
> 		return get_random_max32_const(range, -range % range);
> 	else
> 		return get_random_max32_var(range);
> }
>
> [...]
>
> u32 get_random_max32_var(u32 range)
> {
> 	u64 prod = mul_u32_u32(get_random_u32(), range);
> 
> 	if ((u32)prod < range - 1) {

Maybe 'unlikely' could help here?

> 		u32 lim = -range % range;	/* = (1<<32) % range */
> 
> 		while ((u32)prod < lim)
> 			prod = mul_u32_u32(get_random_u32(), range);
> 	}
> 	return prod >> 32;
> }

Bye,
Hannes