linux-kernel - Re: deconflicting new syscall numbers for 6.11

Message-ID: <Zobf3fZOuvOJOGPN@zx2c4.com>
Date: Thu, 4 Jul 2024 19:46:05 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: jolsa@...nel.org, mhiramat@...nel.org, cgzones@...glemail.com,
	brauner@...nel.org, linux-kernel@...r.kernel.org, arnd@...db.de
Subject: Re: deconflicting new syscall numbers for 6.11

Hi Linus,

On Thu, Jul 04, 2024 at 10:21:34AM -0700, Linus Torvalds wrote:
> On Thu, 4 Jul 2024 at 10:10, Jason A. Donenfeld <Jason@...c4.com> wrote:
> >
> > The three of us all have new syscalls planned for 6.11. Arnd suggested
> > that we coordinate to deconflict, to make the merge easier.
> 
> Nobody has explained to me what has changed since your last vdso
> getrandom, and I'm not planning on pulling it unless that fundamental
> flaw is fixed.

Oh. That's an unpleasant surprise. I've been hard at work on bringing
everything up to snuff. That's pretty much been my sole focus.

Changes since the last time I worked on this are explained at length at
the top of this:

https://lore.kernel.org/lkml/20240703183115.1075219-1-Jason@zx2c4.com/

The big issue before was that the mm additions were too insane. I've
pared those down and made them really minimal. Then the mm people piped
up and it became even more minimal. Now I think it's pretty alright.

But I think that, evidently with you as the exception, the use case for
this and the need for it are well understood and widely appreciated by
now. So, to answer that,

> Why is this _so_ critical that it needs a vdso?
> 
> Why isn't user space just doing it itself?
> 
> What's so magical about this all?
> 
> This all seems entirely pointless to me still, because it's optimizing
> something that nobody seems to care about
>
> IOW, I want to see actual *users* piping up and saying "this is a
> problem, here's my real load that spends 10% of time on getrandom(),
> and this fixes it".
>
> I'm not AT ALL interested in microbenchmarks or theoretical "if users
> need high-performance random numbers".
>
> I need a real actual live user that says "I can't just use rdrand and
> my own chacha mixing on top" and explains why having a SSE2 chachacha
> in kernel code exposed as a vdso is so critical, and a magical buffer
> maintained by the kernel.

As far as speed goes, there are many legitimate applications that cannot
make a syscall every time. TLS nonces and keys come to mind as a huge
one. "Make getrandom() fast enough that the TLS library can use it" is
something that's come up over and over. There's now also arc4random() in
glibc, whose addition is what sparked this whole patchset two years ago.
That's not a microbenchmark thing either. I too don't really care for
microbenchmarks with the random driver. But I do want it to be actually
usable, so that people use it, because it is the best facility for the
task.

With regard to why vDSO, the cover letter lays that out in detail.
Userspace does not have timely access to the information that the kernel
has, and the particulars of the kernel's accounting are bound to change,
especially as all this matures with VMs. The RNG in the vDSO needs to be
tightly coupled with the RNG in the kernel; these are part of the same
thing.
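
To make the cost concrete, here is a rough sketch of what a library has
to do today when it wants fresh bytes for something like a nonce. It
uses the existing getrandom(2) wrapper from glibc's <sys/random.h>; the
helper name fill_random() is made up for illustration and is not part of
the patchset or of any library. Every call to it enters the kernel at
least once, which is the overhead being discussed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(), available in glibc >= 2.25 */
#include <sys/types.h>    /* ssize_t */

/*
 * Hypothetical helper: fill buf with len random bytes from the kernel.
 * Each loop iteration is one getrandom() syscall, so a caller that asks
 * for a 12-byte nonce per record pays a kernel entry per record.
 * Returns 0 on success, -1 on error (errno set by getrandom()).
 */
static int fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -1;
		}
		p += n;			/* short reads are possible, keep going */
		len -= (size_t)n;
	}
	return 0;
}
```

A caller would do something like fill_random(nonce, sizeof(nonce)) once
per message. The usual userspace workaround is to batch or cache output
in a private buffer, and that is exactly the part that lacks the
kernel's view of when the cached state must be thrown away.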

Anyway, those actual users exist, and the partial solutions and hacks
required to work around this shortcoming are kind of grotesque and, in
one way or another, bad. This isn't theoretical. I'm not working on this
for "fun".

Jason