linux-ext4 - Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()

Message-ID: <CAHk-=wiCqDiU7SE3FLn2W26MS_voUAuqj5XFa1V_tiGTrrW-zQ@mail.gmail.com>
Date:   Wed, 18 Sep 2019 16:57:58 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     "Ahmed S. Darwish" <darwish.07@...il.com>
Cc:     Lennart Poettering <mzxreary@...inter.de>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        "Alexander E. Patrakov" <patrakov@...il.com>,
        Michael Kerrisk <mtk.manpages@...il.com>,
        lkml <linux-kernel@...r.kernel.org>, linux-ext4@...r.kernel.org,
        linux-man@...r.kernel.org
Subject: Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and
 introduce getrandom2()

On Wed, Sep 18, 2019 at 2:17 PM Ahmed S. Darwish <darwish.07@...il.com> wrote:
>
> Since Linux v3.17, getrandom(2) has been created as a new and more
> secure interface for pseudorandom data requests.  It attempted to
> solve three problems, as compared to /dev/urandom:

I don't think your patch is really _wrong_, but I think it's silly to
introduce a new system call, when we have 30 bits left in the flags of
the old one, and the old system call checked them.

So it's much simpler and more straightforward to just introduce a
single new bit #2 that says "I actually know what I'm doing, and I'm
explicitly asking for secure/insecure random data".

And then say that the existing bit #1 just means "I want to wait for entropy".

So then you end up with this:

    /*
     * Flags for getrandom(2)
     *
     * GRND_NONBLOCK        Don't block and return EAGAIN instead
     * GRND_WAIT_ENTROPY    Explicitly wait for entropy
     * GRND_EXPLICIT        Make it clear you know what you are doing
     */
    #define GRND_NONBLOCK       0x0001
    #define GRND_WAIT_ENTROPY   0x0002
    #define GRND_EXPLICIT       0x0004

    #define GRND_SECURE         (GRND_EXPLICIT | GRND_WAIT_ENTROPY)
    #define GRND_INSECURE       (GRND_EXPLICIT | GRND_NONBLOCK)

    /* Nobody wants /dev/random behavior, nobody should use it */
    #define GRND_RANDOM         0x0002

which is actually fairly easy to understand. So now we have three
bits, and the values are:

 000 - ambiguous "secure or just lazy/ignorant"
 001 - -EAGAIN or secure
 010 - blocking /dev/random DO NOT USE
 011 - nonblocking /dev/random DO NOT USE
 100 - nonsense, returns -EINVAL
 101 - /dev/urandom without warnings
 110 - blocking secure
 111 - -EAGAIN or secure

and people would be encouraged to use one of these three:

 - GRND_INSECURE
 - GRND_SECURE
 - GRND_SECURE | GRND_NONBLOCK

all of which actually make sense, and none of which have any
ambiguity. And while "GRND_INSECURE | GRND_NONBLOCK" works, it's
exactly the same as just plain GRND_INSECURE - the point is that it
doesn't block for entropy anyway, so non-blocking makes no difference.
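
For readers who want the eight-row table in executable form, here is a
minimal sketch. It is not the code from the attached patch: the helper
name grnd_describe() is hypothetical, and the flag values are the ones
proposed in this mail, which may not match any released kernel header.

```c
/* Flag values as proposed in this mail (assumption: not taken from a released header). */
#define GRND_NONBLOCK       0x0001
#define GRND_WAIT_ENTROPY   0x0002
#define GRND_EXPLICIT       0x0004
#define GRND_RANDOM         0x0002  /* same bit as GRND_WAIT_ENTROPY */

#define GRND_SECURE         (GRND_EXPLICIT | GRND_WAIT_ENTROPY)
#define GRND_INSECURE       (GRND_EXPLICIT | GRND_NONBLOCK)

/*
 * Hypothetical helper: map a flags value to the meaning listed in the
 * table above. Any bit outside the three defined ones is treated as
 * invalid, since the existing system call rejects unknown flag bits.
 */
static const char *grnd_describe(unsigned int flags)
{
    if (flags & ~(unsigned int)(GRND_NONBLOCK | GRND_WAIT_ENTROPY | GRND_EXPLICIT))
        return "unknown bits, -EINVAL";

    switch (flags) {
    case 0:                             /* 000 */
        return "ambiguous: secure or just lazy/ignorant";
    case GRND_NONBLOCK:                 /* 001 */
        return "-EAGAIN or secure";
    case GRND_RANDOM:                   /* 010 */
        return "blocking /dev/random, do not use";
    case GRND_RANDOM | GRND_NONBLOCK:   /* 011 */
        return "nonblocking /dev/random, do not use";
    case GRND_EXPLICIT:                 /* 100 */
        return "nonsense, -EINVAL";
    case GRND_INSECURE:                 /* 101 */
        return "/dev/urandom without warnings";
    case GRND_SECURE:                   /* 110 */
        return "blocking secure";
    case GRND_SECURE | GRND_NONBLOCK:   /* 111 */
        return "-EAGAIN or secure";
    }
    return "unreachable";
}
```

Note that GRND_INSECURE | GRND_NONBLOCK has the same value as plain
GRND_INSECURE (0x5), so the sketch returns the same description for both.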

NOTE! This patch looks bigger than it really is. I've changed the
if-statement in getrandom() to a switch-statement, and I did this:

-       if (count > INT_MAX)
-               count = INT_MAX;
+       count = min_t(size_t, count, INT_MAX >> (ENTROPY_SHIFT + 3));

to match what "urandom_read()" already did. That changes the semantics
a bit, but only for the /dev/random case, and only for insanity (the
limit we truncate to is now 32MB read, rather than 2GB - and we
already had that limit for urandom).
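
The 32MB figure can be checked with a small userspace sketch. It assumes
ENTROPY_SHIFT is 3, as in drivers/char/random.c at the time of this
thread, and a 32-bit int; clamp_count() is a hypothetical stand-in for
the kernel's min_t() expression, not the patch itself.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: value of ENTROPY_SHIFT in drivers/char/random.c at the time. */
#define ENTROPY_SHIFT 3

/* Hypothetical userspace equivalent of min_t(size_t, count, INT_MAX >> (ENTROPY_SHIFT + 3)). */
static size_t clamp_count(size_t count)
{
    const size_t limit = (size_t)(INT_MAX >> (ENTROPY_SHIFT + 3));

    return count < limit ? count : limit;
}
```

With those assumptions the limit is INT_MAX >> 6 = 33554431 bytes, one
byte short of 32MB, compared with the old clamp of INT_MAX (just under
2GB).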

There is *one* other small semantic change: The old code did
urandom_read() which added warnings, but each warning also _reset_ the
crng_init_cnt. Until it decides not to warn any more, at which point
it also stops that resetting of crng_init_cnt.

And that reset of crng_init_cnt, btw, is some cray cray.

It's basically a "we used up entropy" thing, which is very
questionable to begin with as the whole discussion has shown, but
since it stops doing it after 10 cases, it's not even good security
assuming the "use up entropy" case makes sense in the first place.

So I didn't copy that insanity either. And I'm wondering if removing
it from /dev/urandom might also end up helping Ahmed's case of getting
entropy earlier, when we don't reset the counter.

But other than those two details, none of the existing semantics
changed, we just added the three actually _sane_ cases without any
ambiguity.

In particular, this still leaves the semantics of that nasty
"getrandom(0)" as the same "blocking urandom" that it currently is.
But now it's a separate case, and we can make that perhaps do the
timeout, or at least the warning.

And the new cases are defined to *not* warn. In particular,
GRND_INSECURE very much does *not* warn about early urandom access
when crng isn't ready. Because the whole point of that new mode is
that the user knows it isn't secure.

So that should make getrandom(GRND_INSECURE) palatable to the systemd
kind of use that wanted to avoid the pointless kernel warning.

And we could mark this for stable and try to get it backported so that
it will have better coverage, and encourage people to use the new sane
_explicit_ waiting (or not) for entropy.

Comments? Full patch as attachment.

                  Linus

View attachment "patch.diff" of type "text/x-patch" (2727 bytes)