Message-ID: <ZoiaWz9mG9rb0QND@localhost.localdomain>
Date: Fri, 5 Jul 2024 21:14:03 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Jason A. Donenfeld" <Jason@...c4.com>, jolsa@...nel.org,
mhiramat@...nel.org, cgzones@...glemail.com, brauner@...nel.org,
linux-kernel@...r.kernel.org, arnd@...db.de,
Adhemerval Zanella Netto <adhemerval.zanella@...aro.org>,
Zack Weinberg <zack@...folio.org>,
Cristian Rodríguez <cristian@...riguez.im>,
Florian Weimer <fweimer@...hat.com>,
Wilco Dijkstra <Wilco.Dijkstra@....com>
Subject: Re: deconflicting new syscall numbers for 6.11
On 04-Jul-2024 10:21:34 AM, Linus Torvalds wrote:
> On Thu, 4 Jul 2024 at 10:10, Jason A. Donenfeld <Jason@...c4.com> wrote:
> >
> > The three of us all have new syscalls planned for 6.11. Arnd suggested
> > that we coordinate to deconflict, to make the merge easier.
>
> Nobody has explained to me what has changed since your last vdso
> getrandom, and I'm not planning on pulling it unless that fundamental
> flaw is fixed.
>
> Why is this _so_ critical that it needs a vdso?
>
> Why isn't user space just doing it itself?
>
> What's so magical about this all?
>
> This all seems entirely pointless to me still, because it's optimizing
> something that nobody seems to care about, adding new VM
> infrastructure, new magic system calls, yadda yadda.
>
> I was very sceptical last time, and absolutely _nothing_ has changed.
> Not a peep on why it's now suddenly so hugely important again.
>
> We don't add stuff "just because we can". We need to have a damn good
> reason for it. And I still don't see the reason, and I haven't seen
> anybody even trying to explain the reason.
[ Note: as I was writing this email, I noticed that you are heading
towards the same conclusions I'm reaching on other sub-threads of this
discussion. But I'm providing this feedback because it adds relevant
information based on earlier discussions with libc developers. ]
Earlier this year, in March, I jumped into the discussion on the
libc-alpha mailing list to understand the userspace RNG seeding
requirements better. The interesting bits that explain how the kernel
can play an important role start here:
https://sourceware.org/pipermail/libc-alpha/2024-March/155534.html
From an absolutely-not-security-expert perspective, here is how I see
the desiderata breakdown:
- There appears to be a need to make sure the random seed is not exposed
across fork, core dump and other similar scenarios. This can be
achieved by simply letting userspace use the appropriate madvise(2)
advice values on a memory mapping created through mmap(2). I don't see
why there would be any need to create any RNG-centric ABI for this. If
new madvise(2) advice values are needed, they can simply be added there.
- There appears to be interest in having a RNG faster than a system call
for various reasons I'm not familiar with. A vDSO appears to be one
way to do this. Another way would be to let userspace implement it
all, which raises the following question: what is the minimal state
known only to the kernel and currently unavailable to userspace? This
brings up the following point.
- Based on the libc-alpha discussion, I understand that the main thing
the kernel knows that userspace does not is a sort-of
generation counter, which tracks for instance the fact that the kernel
was migrated to a different VM, or suspended and then resumed, and
hence the current seed should be discarded and re-seeded entirely.
I suspect that is the _key_ information that is currently missing from
a purely userspace RNG perspective today. I hinted at extending the
rseq(2) ABI for that purpose: exposing a generation counter for the
RNG in a thread area shared between kernel and user-space. The
per-thread area is already there and the hard work of integrating it
with libc is mostly complete. Another alternative would be, as you
hint elsewhere in this thread
(https://lore.kernel.org/lkml/CAHk-=wgqD9h0Eb-n94ZEuK9SugnkczXvX497X=OdACVEhsw5xQ@mail.gmail.com/)
to create a vDSO to expose exactly this kind of generation counter.
Given this is not a thread-specific thing, it might be a better
approach than the rseq per-thread area.
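To make the first point in the list above concrete, here is a minimal
sketch of keeping seed state out of forked children and core dumps with
plain mmap(2) and madvise(2). It assumes the existing MADV_WIPEONFORK and
MADV_DONTDUMP advice values are the relevant ones (the discussion does not
name specific values, so that choice is an assumption), and
seed_region_alloc() is a hypothetical helper name, not an existing API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing API): allocate an anonymous
 * mapping meant to hold userspace RNG seed state.
 *
 * Assumption: MADV_WIPEONFORK (the child sees zero-filled memory after
 * fork) and MADV_DONTDUMP (the region is excluded from core dumps) are
 * the advice values wanted here.
 *
 * len should be a multiple of the page size. Returns NULL on failure.
 */
static void *seed_region_alloc(size_t len)
{
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* Treat a kernel that lacks either advice value as a hard failure. */
	if (madvise(p, len, MADV_WIPEONFORK) != 0 ||
	    madvise(p, len, MADV_DONTDUMP) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}
```

A caller would request at least one page, e.g.
seed_region_alloc(sysconf(_SC_PAGESIZE)), and fall back to another
strategy if NULL is returned.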
So either I'm missing something important (please enlighten me), or we
could achieve all those end-goals with a small fraction of the ABI
complexity introduced by the vDSO as it is initially proposed.
I don't think that just because there happen to be bad userspace RNG
implementations out there we should give up on userspace and maintain
all this complexity in the kernel. This is just working around userspace
ecosystem issues by moving the implementation and maintenance burden
into the kernel.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com