linux-kernel - [PATCH v23] getrandom() in vDSO (still flawed)


Message-ID: <23483431.vTCxPXJkl2@wukong>
Date: Tue, 22 Oct 2024 22:49:54 +0200
From: Bernd Paysan <bernd@...2o.de>
To: linux-kernel@...r.kernel.org
Cc: "Jason A. Donenfeld" <Jason@...c4.com>,
 David Hildenbrand <david@...hat.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>, Yu Zhao <yu.zhao@...el.com>,
 Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>
Subject: [PATCH v23] getrandom() in vDSO (still flawed)

This recently came up for me (I'm not a Linux kernel developer, but I work on
a secure network protocol (net2o) and also built a hardware CPRNG for
cryptech.is) after the author of dietlibc had a look at getrandom() in the
vDSO.  I helped him review it.  We concluded that he should NOT implement it
at the moment.  In particular, this is not the usual easy case where you just
link to the exact same symbol with the same signature in the vDSO lib instead
of making a kernel call; it puts an additional (and fragile!) burden on the
libc maintainer, who has to manage an opaque state.

Random numbers are too important, so I subscribed to LKML… and jump in with my 
flame-proof suit on.

CC to everybody who replied to [Patch v22] of this proposal, and the person 
who wrote the glibc patch.

Linus Torvalds wrote:
> In other words, I want to see actual *users* piping up and saying "this is a
> problem, here's my real load that spends 10% of time on getrandom(), and
> this fixes it".

# My usage #

In my net2o crypto stack, I call getrandom() once at process start and roll
my own CPRNG.  Even if Linux kernels had a superfast getrandom(), staying
backward compatible and portable across operating systems requires me to roll
my own CPRNG for at least the next decade.  I still check whether getrandom()
is actually available, and read /dev/urandom if it is not.  I feel comfortable
generating randomness myself, but I understand that others don't and
shouldn't.

# When do you need lots and lots of randomness? #

For connection-oriented protocols, you can and should generate the nonces on
both sides starting from a shared secret, and not transmit them over the
network at all.  Then there is no per-packet kernel getrandom() call involved,
but at most a per-connection call.  Connection-oriented protocols that require
a CPRNG nonce per packet are ill-conceived, but they do exist.  Encrypting
1GB/s on a single core is no problem, so calling getrandom() per packet would
be a clearly noticeable slowdown.  sendmmsg/recvmmsg exist for sending and
receiving multiple packets with one kernel call, and they are necessary.

That leaves connectionless protocols, where you can't avoid generating some
random material per packet; usually ephemeral keys (where you don't send the
raw random numbers over the network).  Those protocols have a higher overhead,
as they need a per-packet key exchange.  Still, >100k/s of such DHEs can be
done per core with the recent speed improvements of x/ed25519, so the
getrandom() overhead is measurable.

So while I don't need this, I'm not saying it is completely useless.

# There are however some flaws that need to be fixed first #

1. Burden on the libc implementer to allocate the state, re-initialize it on
fork() and whatnot.  This is asking for disaster: one maintainer or another
is certain to get something wrong, and it's not just glibc, it's also musl,
bionic, eglibc, dietlibc, and programs that prefer to talk to the kernel
directly to avoid broken libcs.  It really has to be getrandom(buf, size,
flags).  That's how the other vDSO entries work, and that's how far I trust
libc maintainers.  I don't trust them to maintain the state by themselves.

2. There's no documentation.  This would only be ok if it was a 1:1 
replacement of the syscall that doesn't need documentation.

3. The ChaCha20 (good choice) based implementation requires more state than 
necessary.

So, how much state do you need for a multi-threaded CPRNG, based on ChaCha20?

Answer:
1 *shared* secret key (256 or 512 bits) *per process* (yes, you can share that 
secret)
1 64-bit counter and
1 64-bit nonce (essentially pthread_self) *per thread*

The per-thread 128 bits don't have to be secret.  The counter can be shared or 
per-thread.  This means handing over the burden of allocating state to the 
library maintainer is completely unnecessary.
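
To make the state budget concrete, here is a rough sketch of that layout
using the original ChaCha20 block function (64-bit counter, 64-bit nonce).
The names rng_shared, rng_thread and rng_block are hypothetical, the key is
stored as host-order 32-bit words for brevity, and this only illustrates the
state split; it is not the vDSO implementation and not a hardened RNG.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct rng_shared { uint32_t key[8]; };         /* 256-bit secret, per process */
struct rng_thread { uint64_t counter, nonce; }; /* public, per thread */

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d) do {            \
    a += b; d ^= a; d = ROTL32(d, 16); \
    c += d; b ^= c; b = ROTL32(b, 12); \
    a += b; d ^= a; d = ROTL32(d, 8);  \
    c += d; b ^= c; b = ROTL32(b, 7);  \
} while (0)

/* Produce one 64-byte block and advance the per-thread counter. */
static void rng_block(const struct rng_shared *s, struct rng_thread *t,
                      uint8_t out[64])
{
    assert(s != NULL && t != NULL && out != NULL);
    uint32_t in[16], x[16];

    /* "expand 32-byte k" constants, then key, counter, nonce. */
    in[0] = 0x61707865; in[1] = 0x3320646e;
    in[2] = 0x79622d32; in[3] = 0x6b206574;
    memcpy(&in[4], s->key, sizeof s->key);
    in[12] = (uint32_t)t->counter; in[13] = (uint32_t)(t->counter >> 32);
    in[14] = (uint32_t)t->nonce;   in[15] = (uint32_t)(t->nonce >> 32);

    memcpy(x, in, sizeof x);
    for (int i = 0; i < 10; i++) {      /* 20 rounds = 10 double rounds */
        QR(x[0], x[4], x[8],  x[12]);   /* column rounds */
        QR(x[1], x[5], x[9],  x[13]);
        QR(x[2], x[6], x[10], x[14]);
        QR(x[3], x[7], x[11], x[15]);
        QR(x[0], x[5], x[10], x[15]);   /* diagonal rounds */
        QR(x[1], x[6], x[11], x[12]);
        QR(x[2], x[7], x[8],  x[13]);
        QR(x[3], x[4], x[9],  x[14]);
    }
    for (int i = 0; i < 16; i++) {      /* feed-forward, little-endian out */
        uint32_t w = x[i] + in[i];
        out[4 * i]     = (uint8_t)w;
        out[4 * i + 1] = (uint8_t)(w >> 8);
        out[4 * i + 2] = (uint8_t)(w >> 16);
        out[4 * i + 3] = (uint8_t)(w >> 24);
    }
    t->counter++;
}
```

Each thread would keep its own struct rng_thread (for example in
thread-local storage) with a distinct nonce, while all threads read the same
struct rng_shared.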

# In summary: This can be fixed #

But ATM, C library maintainers should not touch it.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o id: kQusJzA;7*?t=uy@...GWr!+0qqp_Cn176t4(dQ*
https://net2o.de/
