linux-ext4 - Re: Linux 5.3-rc8

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190917121156.GC6762@mit.edu>
Date:   Tue, 17 Sep 2019 08:11:56 -0400
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Martin Steigerwald <martin@...htvoll.de>
Cc:     Willy Tarreau <w@....eu>, Matthew Garrett <mjg59@...f.ucam.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Ahmed S. Darwish" <darwish.07@...il.com>,
        Vito Caputo <vcaputo@...garu.com>,
        Lennart Poettering <mzxreary@...inter.de>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jan Kara <jack@...e.cz>, Ray Strode <rstrode@...hat.com>,
        William Jon McCann <mccann@....edu>,
        "Alexander E. Patrakov" <patrakov@...il.com>,
        zhangjs <zachary@...shancloud.com>, linux-ext4@...r.kernel.org,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: Linux 5.3-rc8

On Tue, Sep 17, 2019 at 09:33:40AM +0200, Martin Steigerwald wrote:
> Willy Tarreau - 17.09.19, 07:24:38 CEST:
> > On Mon, Sep 16, 2019 at 06:46:07PM -0700, Matthew Garrett wrote:
> > > >Well, the patch actually made getrandom() return en error too, but
> > > >you seem more interested in the hypotheticals than in arguing
> > > >actualities.> 
> > > If you want to be safe, terminate the process.
> > 
> > This is an interesting approach. At least it will cause bug reports in
> > application using getrandom() in an unreliable way and they will
> > check for other options. Because one of the issues with systems that
> > do not finish to boot is that usually the user doesn't know what
> > process is hanging.
> 

I would be happy with a change which changes getrandom(0) to send a
kill -9 to the process if it is called too early, with a new flag,
getrandom(GRND_BLOCK) which blocks until entropy is available.  That
leaves it up to the application developer to decide what behavior they
want.

Userspace applications which want to do something more sophisticated
could set a timer which will cause getrandom(GRND_BLOCK) to return
with EINTR (or the signal handler could use longjmp; whatever) to
abort and do something else, like calling random_r if it's for some
pathetic use of random numbers like MIT-MAGIC-COOKIE.

> A userspace process could just poll on the kernel by forking a process 
> to use getrandom() and waiting until it does not get terminated anymore. 
> And then it would still hang.

So.... I'm not too worried about that, because if a process is
determined to do something stupid, they can always do something
stupid.

This could potentially be a problem, as would GRND_BLOCK, in that if
an application author decides to use to do something to wait for real
randomness, because in the good judgement of the application author,
it d*mned needs real security because otherwise an attacker could,
say, force a launch of nuclear weapons and cause world war III, and
then some small 3rd-tier distro decides to repurpose that application
for some other use, and puts it in early boot, it's possible that a
user will report it as a "regression", and we'll be back to the
question of whether we revert a performance optimization patch.

There are only two ways out of this mess.  The first option is we take
functionality away from a userspace author who Really Wants A Secure
Random Number Generator.  And there are an awful lot of programs who
really want secure crypto, becuase this is not a hypothetical.  The
result in "Mining your P's and Q's" did happen before.  If we forget
the history, we are doomed to repeat it.

The only other way is that we need to try to get the CRNG initialized
securely in early boot, before we let userspace start.  If we do it
early enough, we can also make the kernel facilities like KASLR and
Stack Canaries more secure.  And this is *doable*, at least for most
common platforms.  We can leverage UEFI; we cn try to use the TPM's
random number generator, etc.  It won't help so much for certain
brain-dead architectures, like MIPS and ARM, but if they are used for
embedded use cases, it will be caught before the product is released
for consumer use.  And this is where blocking is *way* better than a
big fat warning, or sleeping for 15 seconds, both of which can easily
get missed in the embedded case.  If we can fix this for traditional
servers/desktops/laptops, then users won't be complaining to Linus,
and I think we can all be happy.

Regards,

					- Ted