linux-kernel - Re: deconflicting new syscall numbers for 6.11


Message-ID: <ZoiaWz9mG9rb0QND@localhost.localdomain>
Date: Fri, 5 Jul 2024 21:14:03 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Jason A. Donenfeld" <Jason@...c4.com>, jolsa@...nel.org,
	mhiramat@...nel.org, cgzones@...glemail.com, brauner@...nel.org,
	linux-kernel@...r.kernel.org, arnd@...db.de,
	Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>,
	Zack Weinberg <zack@...folio.org>,
	Cristian Rodríguez <cristian@...riguez.im>,
	Florian Weimer <fweimer@...hat.com>,
	Wilco Dijkstra <Wilco.Dijkstra@....com>
Subject: Re: deconflicting new syscall numbers for 6.11

On 04-Jul-2024 10:21:34 AM, Linus Torvalds wrote:
> On Thu, 4 Jul 2024 at 10:10, Jason A. Donenfeld <Jason@...c4.com> wrote:
> >
> > The three of us all have new syscalls planned for 6.11. Arnd suggested
> > that we coordinate to deconflict, to make the merge easier.
> 
> Nobody has explained to me what has changed since your last vdso
> getrandom, and I'm not planning on pulling it unless that fundamental
> flaw is fixed.
> 
> Why is this _so_ critical that it needs a vdso?
> 
> Why isn't user space just doing it itself?
> 
> What's so magical about this all?
> 
> This all seems entirely pointless to me still, because it's optimizing
> something that nobody seems to care about, adding new VM
> infrastructure, new magic system calls, yadda yadda.
> 
> I was very sceptical last time, and absolutely _nothing_ has changed.
> Not a peep on why it's now suddenly so hugely important again.
> 
> We don't add stuff "just because we can". We need to have a damn good
> reason for it. And I still don't see the reason, and I haven't seen
> anybody even trying to explain the reason.

[ Note: as I was writing this email, I noticed that you are heading
  towards the same conclusions I'm reaching on other sub-threads of this
  discussion. I'm still providing this feedback because it adds relevant
  information based on earlier discussions with libc developers. ]

Earlier this year, in March, I jumped into the discussion on the
libc-alpha mailing list to better understand the userspace RNG seeding
requirements. The interesting bits that explain how the kernel can play
an important role start here:

https://sourceware.org/pipermail/libc-alpha/2024-March/155534.html

From an absolutely-not-security-expert perspective, here is how I see
the desiderata breakdown:

- There appears to be a need to make sure the random seed is not exposed
  across fork, core dump and other similar scenarios. This can be
  achieved by simply letting userspace apply the appropriate madvise(2)
  advice values to a memory mapping created through mmap(2). I don't see
  why there would be any need to create an RNG-centric ABI for this. If
  new madvise(2) advice values are needed, they can simply be added
  there.

- There appears to be interest in having an RNG faster than a system
  call, for various reasons I'm not familiar with. A vDSO appears to be
  one way to do this. Another way would be to let userspace implement it
  all, which raises the following question: what is the minimal state
  known only by the kernel and currently unknown to userspace? This
  brings up the following point.

- Based on the libc-alpha discussion, I understand that the main thing
  the kernel knows about which is unknown to userspace is a sort-of
  generation counter, which tracks for instance the fact that the kernel
  was migrated to a different VM, or suspended and then resumed, and
  hence the current seed should be discarded and re-seeded entirely.
  I suspect that is the _key_ information that is currently missing from
  a purely userspace RNG perspective today. I hinted at extending the
  rseq(2) ABI for that purpose: exposing a generation counter for the
  RNG in a thread area shared between kernel and user-space. The
  per-thread area is already there and the hard work of integrating it
  with libc is mostly complete. Another alternative would be, as you
  hint elsewhere in this thread
  (https://lore.kernel.org/lkml/CAHk-=wgqD9h0Eb-n94ZEuK9SugnkczXvX497X=OdACVEhsw5xQ@mail.gmail.com/),
  to create a vDSO to expose exactly this kind of generation counter.
  Given this is not a thread-specific thing, it might be a better
  approach than the rseq per-thread area.
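
[ Illustration of the first point above: a minimal sketch, assuming a
  Linux kernel and libc recent enough to provide MADV_WIPEONFORK and
  MADV_DONTDUMP. The helper names seed_area_alloc() and
  child_sees_wiped() are made up for this example and are not part of
  any existing library. The sketch only covers fork and core dumps; it
  says nothing about VM snapshots or suspend/resume. ]

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate a private anonymous mapping for RNG seed
 * material that is zeroed in children after fork (MADV_WIPEONFORK) and
 * left out of core dumps (MADV_DONTDUMP).
 * Returns NULL with errno set if the mapping or either advice fails.
 */
void *seed_area_alloc(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	if (madvise(p, len, MADV_WIPEONFORK) != 0 ||
	    madvise(p, len, MADV_DONTDUMP) != 0) {
		int saved = errno;	/* keep the madvise error for the caller */

		munmap(p, len);
		errno = saved;
		return NULL;
	}
	return p;
}

/*
 * Hypothetical check: fork and report whether the child sees the first
 * byte of the area wiped to zero.
 * Returns 1 if wiped, 0 if the child still sees the parent's value,
 * -1 on error.
 */
int child_sees_wiped(const unsigned char *p)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(p[0] == 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) == 0;
}
```

  A caller would store its seed in the returned area and, after a fork,
  treat an all-zero area in the child as "must reseed before use".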

So either I'm missing something important (please enlighten me), or we
could achieve all those end-goals with a small fraction of the ABI
complexity introduced by the vDSO as it is initially proposed.
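
[ Illustration of the generation-counter idea: a rough sketch, assuming
  the kernel exported a counter that changes on VM migration or resume.
  No such interface is defined in this email, so the counter is
  simulated here by the plain variable fake_kernel_rng_generation, and
  struct user_rng, user_rng_reseed() and user_rng_refresh() are
  hypothetical names. Only getrandom(2) is a real interface. The actual
  output generation of a userspace RNG is left out on purpose. ]

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>

/* HYPOTHETICAL stand-in for a counter the kernel would expose
 * (through rseq or a vDSO data page); here just a process global. */
_Atomic uint64_t fake_kernel_rng_generation;

struct user_rng {
	uint64_t seen_generation;	/* counter value when seed was taken */
	unsigned char seed[32];		/* seed for a userspace generator */
	unsigned long reseed_count;	/* bookkeeping for this example */
};

/* Fetch a fresh seed with getrandom(2). The counter is sampled before
 * and after so that a change during seeding forces another attempt.
 * Returns 0 on success, -1 on error. */
int user_rng_reseed(struct user_rng *rng, _Atomic uint64_t *gen)
{
	uint64_t before, after;

	do {
		size_t filled = 0;

		before = atomic_load_explicit(gen, memory_order_acquire);
		while (filled < sizeof(rng->seed)) {
			ssize_t n = getrandom(rng->seed + filled,
					      sizeof(rng->seed) - filled, 0);
			if (n < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			filled += (size_t)n;
		}
		after = atomic_load_explicit(gen, memory_order_acquire);
	} while (before != after);

	rng->seen_generation = after;
	rng->reseed_count++;
	return 0;
}

/* To be called before using the seed: reseed only if the counter moved.
 * Returns 1 if a reseed happened, 0 if the seed is still current,
 * -1 on error. */
int user_rng_refresh(struct user_rng *rng, _Atomic uint64_t *gen)
{
	if (atomic_load_explicit(gen, memory_order_acquire) ==
	    rng->seen_generation)
		return 0;
	return user_rng_reseed(rng, gen) < 0 ? -1 : 1;
}
```

  In this sketch the fast path is a single load and compare, and the
  system call is only paid when the counter has changed.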

I don't think that just because there happen to be bad userspace RNG
implementations out there we should give up on userspace and maintain
all this complexity in the kernel. This is just working around userspace
ecosystem issues by moving the implementation and maintenance burden
into the kernel.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com