Message-Id: <1444708222.900.0@smtp.gmail.com>
Date: Mon, 12 Oct 2015 20:50:22 -0700
From: Raymond Jennings <shentino@...il.com>
To: Theodore Ts'o <tytso@....edu>, George Spelvin <linux@...izon.com>,
ahferroin7@...il.com, andi@...stfloor.org, jepler@...ythonic.net,
linux-kernel@...r.kernel.org, linux@...musvillemoes.dk
Subject: Re: Updated scalable urandom patchkit
On Mon, Oct 12, 2015 at 7:46 PM, Theodore Ts'o <tytso@....edu> wrote:
> On Mon, Oct 12, 2015 at 04:30:59PM -0400, George Spelvin wrote:
>> > Segregating abusers solves both problems. If we do this then we
>> > don't need to drop the locks from the nonblocking pool, which
>> > solves the security problem.
>>
>> Er, sort of. I still think my points were valid, but they're
>> about a particular optimization suggestion you had. By avoiding
>> the need for the optimization, the entire issue is mooted.
>
> Sure, I'm not in love with anyone's particular optimization, whether
> it's mine, yours, or Andi's. I'm just trying to solve the scalability
> problem while also trying to keep the code maintainable and easy to
> understand (and over the years we've actually made things worse, to
> the extent that having a single mixing for the input and output pools
> is starting to be more of a problem than a feature, since we're coding
> in a bunch of exceptions when it's the output pool, etc.).
>
> So if we can solve a problem by routing around it, that's fine in my
> book.
>
>> You have to copy the state *anyway* because you don't want it
>> overwritten by the ChaCha output, so there's really no point storing
>> the constants.
>> (Also, ChaCha has a simpler input block structure than Salsa20; the
>> constants are all adjacent.)
>
> We're really getting into low-level implementations here, and I think
> it's best to worry about these sorts of things when we have a patch to
> review.....
>
>> (Note: one problem with ChaCha specifically is that it needs 16x32
>> bits of registers, and Arm32 doesn't quite have enough. We may want
>> to provide an arch CPRNG hook so people can plug in other algorithms
>> with good platform support, like x86 AES instructions.)
>
> So while a ChaCha20-based CRNG should be faster than a SHA-1 based
> CRNG, and I consider this a good thing, for me speed is **not** more
> important than keeping the underlying code maintainable and simple.
> This is one of the reasons why I looked at, and then discarded, using
> x86 accelerated AES as the basis for a CRNG. Setting up AES so that
> it can be used easily with or without hardware acceleration looks very
> complicated to do in a cross-architectural way, and I don't want to
> drag in all of the crypto layer for /dev/random.
>
>> The same variables can be used (with different parameters) to decide
>> if we want to get out of mitigation mode. The one thing to watch out
>> for is that "cat </dev/urandom >/dev/sdX" may have some huge pauses
>> once the buffer cache fills. We don't want to forgive after too
>> small a fixed interval.
>
> At least initially, once we go into mitigation mode for a particular
> process, it's probably safer to simply not exit it.
>
>> Finally, we have the issue of where to attach this rate-limiting
>> structure and crypto context. My idea was to use the struct file.
>> But now that we have getrandom(2), it's harder. mm, task_struct,
>> signal_struct, what?
>
> I'm personally more inclined to keep it with the task struct, so that
> different threads will use different crypto contexts, just from a
> simplicity point of view since we won't need to worry about locking.
>
> Since many processes don't use /dev/urandom or getrandom(2) at all,
> the first time they do, we'd allocate a structure and hang it off the
> task_struct. When the process exits, we would explicitly memzero it
> and then release the memory.
>
>> (Post-finally, do we want this feature to be configurable under
>> CONFIG_EMBEDDED? I know keeping the /dev/random code size small is
>> a specific design goal, and abuse mitigation is optional.)
>
> Once we code it up and can see how many bytes this takes, we can have
> this discussion. I'll note that ChaCha20 is much more compact than
> SHA1:
>
>    text    data     bss     dec     hex filename
>    4230       0       0    4230    1086 /build/ext4-64/lib/sha1.o
>    1152     304       0    1456     5b0 /build/ext4-64/crypto/chacha20_generic.o
>
> ... and I've thought about this as being the first step towards
> potentially replacing SHA1 with something ChaCha20 based, in light of
> the SHAppening attack. Unfortunately, BLAKE2s is similar to ChaCha
> only from a design perspective, not an implementation perspective.
> Still, I suspect that just looking at the crypto primitives, even if we
> need to include two independent copies of the ChaCha20 core crypto and
> the Blake2s core crypto, it still should be about half the size of the
> SHA-1 crypto primitive.
>
> And from the non-plumbing side of things, Andi's patchset increases
> the size of /dev/random by a bit over 6%, or 974 bytes from a starting
> base of 15719 bytes. It ought to be possible to implement a ChaCha20
> based CRNG (ignoring the crypto primitives) in less than 974 bytes of
> x86_64 assembly. :-)
>
> - Ted
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

This might be stupid, but could something asynchronous work? Perhaps
have the entropy sources dump their entropy into a central pool via
a circular buffer, and have a background kthread manage the per-cpu or
per-process entropy pools?
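
To make the idea a bit more concrete, here is a rough userspace sketch
in C of what I have in mind. It is not a patch: every name in it
(entropy_ring, cpu_pool, ring_push, worker_drain, RING_SLOTS, NR_POOLS)
is made up for illustration, the mixing step is a toy rotate-and-XOR
rather than anything cryptographic, and it is single-threaded, so it
leaves out the atomics and memory barriers a real producer/consumer
split would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 64u   /* hypothetical size; must be a power of two */
#define NR_POOLS   4u    /* stand-in for the number of CPUs */

/* Fixed-size circular buffer of raw entropy samples. */
struct entropy_ring {
	uint32_t slot[RING_SLOTS];
	unsigned int head;   /* next slot to write (producer side) */
	unsigned int tail;   /* next slot to read (consumer side) */
};

/* Stand-in for a per-CPU output pool. */
struct cpu_pool {
	uint32_t state;
	unsigned int samples_mixed;
};

/* Producer: called by an entropy source. Drops the sample if full. */
static bool ring_push(struct entropy_ring *r, uint32_t sample)
{
	if (r->head - r->tail == RING_SLOTS)
		return false;
	r->slot[r->head++ & (RING_SLOTS - 1)] = sample;
	return true;
}

/* Consumer: returns false when the ring is empty. */
static bool ring_pop(struct entropy_ring *r, uint32_t *sample)
{
	if (r->head == r->tail)
		return false;
	*sample = r->slot[r->tail++ & (RING_SLOTS - 1)];
	return true;
}

/* Toy mixing step (rotate left by 5, then XOR); NOT cryptographic. */
static void pool_mix(struct cpu_pool *p, uint32_t sample)
{
	p->state = ((p->state << 5) | (p->state >> 27)) ^ sample;
	p->samples_mixed++;
}

/*
 * One pass of the background worker: drain the ring and hand the
 * samples to the pools round-robin. Returns the number consumed.
 */
static unsigned int worker_drain(struct entropy_ring *r,
				 struct cpu_pool *pools, unsigned int npools)
{
	unsigned int n = 0;
	uint32_t sample;

	assert(npools > 0);
	while (ring_pop(r, &sample))
		pool_mix(&pools[n++ % npools], sample);
	return n;
}
```

The point of the split is that ring_push() is the only thing an entropy
source ever touches, so the sources never take a pool lock; all of the
pool updates happen in worker_drain(), which in the kernel would be the
body of the kthread. Whether dropping samples when the ring is full is
acceptable is an open question I have not thought through.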