[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjuVT+2oj_U2V94MBVaJdWsbo1RWzy0qXQSMAUnSaQzxw@mail.gmail.com>
Date: Sat, 14 Sep 2019 09:30:19 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Ahmed S. Darwish" <darwish.07@...il.com>
Cc: "Theodore Y. Ts'o" <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Jan Kara <jack@...e.cz>, Ray Strode <rstrode@...hat.com>,
William Jon McCann <mccann@....edu>,
"Alexander E. Patrakov" <patrakov@...il.com>,
zhangjs <zachary@...shancloud.com>, linux-ext4@...r.kernel.org,
Lennart Poettering <lennart@...ttering.net>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish <darwish.07@...il.com> wrote:
>
> On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote:
> >
> > An alternative might be to make getrandom() just return an error
> > instead of waiting. Sure, fill the buffer with "as random as we can"
> > stuff, but then return -EINVAL because you called us too early.
>
> ACK, that's probably _the_ most sensible approach. Only caveat is
> the slight change in user-space API semantics though...
>
> For example, this breaks the just released systemd-random-seed(8)
> as it _explicitly_ requests blocking behvior from getrandom() here:
>
Actually, I would argue that the "don't ever block, instead fill
buffer and return error instead" fixes this broken case.
> => src/random-seed/random-seed.c:
> /*
> * Let's make this whole job asynchronous, i.e. let's make
> * ourselves a barrier for proper initialization of the
> * random pool.
> */
> k = getrandom(buf, buf_size, GRND_NONBLOCK);
> if (k < 0 && errno == EAGAIN && synchronous) {
> log_notice("Kernel entropy pool is not initialized yet, "
> "waiting until it is.");
>
> k = getrandom(buf, buf_size, 0); /* retry synchronously */
> }
Yeah, the above is yet another example of completely broken garbage.
You can't just wait and block at boot. That is simply 100%
unacceptable, and always has been, exactly because that may
potentially mean waiting forever since you didn't do anything that
actually is likely to add any entropy.
> if (k < 0) {
> log_debug_errno(errno, "Failed to read random data with "
> "getrandom(), falling back to "
> "/dev/urandom: %m");
At least it gets a log message.
So I think the right thing to do is to just make getrandom() return
-EINVAL, and refuse to block.
As mentioned, this has already historically been a huge issue on
embedded devices, and with disks turnign not just to NVMe but to
actual polling nvdimm/xpoint/flash, the amount of true "entropy"
randomness we can give at boot is very questionable.
We can (and will) continue to do a best-effort thing (including very
much using rdread and friends), but the whole "wait for entropy"
simply *must* stop.
> I've sent an RFC patch at [1].
>
> [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc
Looks reasonable to me. Except I'd just make it simpler and make it a
big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice().
Make it clear that it is a *bug* if user space thinks it should wait
at boot time.
Also, we might even want to just fill the buffer and return 0 at that
point, to make sure that even more broken user space doesn't then try
to sleep manually and turn it into a "I'll wait myself" loop.
Linus
Powered by blists - more mailing lists