Message-ID: <20190915063323.GA20811@1wt.eu>
Date:   Sun, 15 Sep 2019 08:33:23 +0200
From:   Willy Tarreau <w@....eu>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Ahmed S. Darwish" <darwish.07@...il.com>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jan Kara <jack@...e.cz>, Ray Strode <rstrode@...hat.com>,
        William Jon McCann <mccann@....edu>,
        "Alexander E. Patrakov" <patrakov@...il.com>,
        zhangjs <zachary@...shancloud.com>, linux-ext4@...r.kernel.org,
        Lennart Poettering <lennart@...ttering.net>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 10:05:21PM -0400, Theodore Y. Ts'o wrote:
> I'd be willing to let it take at least 2 minutes, since that's slow
> enough to be annoying.
It's an eternity, and it prevents a backup system from being turned on in
time to replace a dead system. In fact the main problem with this is
that it destroys uptime on already configured systems for the sake of
making sure a private SSH key is produced correctly. If we instead tell
this tool that the random data produced is not strong, then the only tool
that requires good entropy will be able to ask the user to type something
to add real entropy. But making the system wait forever will not bring
any extra entropy: because the services cannot start, the system will not
even receive network traffic and so cannot collect entropy from it.
Sorry Ted, but I've been hit by this already. It's a real problem to see
a system fail to finish booting after a crash when you know your systems
are allowed only 5 minutes of total downtime per year (five nines). And
when the SSH keys, like the rest of the config, were supposed to be either
synchronized from the network or pre-populated in a system image, nobody
finds this a valid justification for an extended downtime.
> Except the developer could (and *has) just ignored the warning, which
> is what happened with /dev/urandom when it was accessed too early.
That's why it's nice to have getrandom() return the error: for once it
will let the developer of each program decide how much to care, depending
on the program. Those choosing which pieces to present in Tetris will not
care; those trying to generate an SSH key will care and will have solid,
well-known fallbacks. And the rare ones who need good random numbers and
ignore the error will be the ones *responsible* for this; it will no
longer be the kernel handing out bad random numbers.
BTW I was thinking that EAGAIN was semantically better than EINVAL to
indicate that the same call should be retried in blocking mode.
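
To make the caller-side handling concrete, here is a minimal sketch in C.
It is an assumption-laden illustration, not the change proposed in this
thread: it relies on the already existing GRND_NONBLOCK flag, with which
getrandom() fails with EAGAIN while the entropy pool is not yet
initialized. The names rand_status and fill_random_nonblock are
hypothetical, made up for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(), GRND_NONBLOCK (glibc >= 2.25) */
#include <sys/types.h>    /* ssize_t */

/* Hypothetical status codes, used only in this sketch. */
enum rand_status {
    RAND_OK        = 0,   /* buffer fully filled with random bytes */
    RAND_NOT_READY = 1,   /* entropy pool not initialized yet (EAGAIN) */
    RAND_FAILED    = -1   /* any other error, see errno */
};

/*
 * Try to fill buf with len random bytes without ever blocking.
 * The caller, not the kernel, decides what to do on RAND_NOT_READY:
 * a game may fall back to a weak seed, a key generator may ask the
 * user for input or retry later in blocking mode (flags = 0).
 */
static enum rand_status fill_random_nonblock(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = getrandom(p, len, GRND_NONBLOCK);

        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            if (errno == EAGAIN)
                return RAND_NOT_READY;  /* would have blocked at early boot */
            return RAND_FAILED;
        }
        p   += n;                       /* short read: keep going */
        len -= (size_t)n;
    }
    return RAND_OK;
}
```

A program generating an SSH key would treat RAND_NOT_READY as a reason to
warn the user or use one of its fallbacks, while a program that only
shuffles Tetris pieces could ignore it and seed from something weaker.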
Just my two cents,
Willy