Message-ID: <20251110114550-a3f2afa8-4f86-4048-be5b-2dc4f4ef340d@linutronix.de>
Date: Mon, 10 Nov 2025 12:24:13 +0100
From: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
To: "Jason A. Donenfeld" <Jason@...c4.com>
Cc: Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Vincenzo Frascino <vincenzo.frascino@....com>,
Arnd Bergmann <arnd@...db.de>, "David S. Miller" <davem@...emloft.net>,
Andreas Larsson <andreas@...sler.com>, Nick Alcock <nick.alcock@...cle.com>,
John Stultz <jstultz@...gle.com>, Stephen Boyd <sboyd@...nel.org>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>, Shuah Khan <shuah@...nel.org>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>, Theodore Ts'o <tytso@....edu>,
Russell King <linux@...linux.org.uk>, Madhavan Srinivasan <maddy@...ux.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
Christophe Leroy <christophe.leroy@...roup.eu>, Huacai Chen <chenhuacai@...nel.org>,
WANG Xuerui <kernel@...0n.name>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Shannon Nelson <sln@...main.com>, linux-kernel@...r.kernel.org,
sparclinux@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org, loongarch@...ts.linux.dev,
linux-mips@...r.kernel.org, linux-s390@...r.kernel.org
Subject: Re: [PATCH v5 19/34] random: vDSO: only access vDSO datapage after
random_init()
On Mon, Nov 10, 2025 at 11:37:07AM +0100, Jason A. Donenfeld wrote:
> On Mon, Nov 10, 2025 at 10:04:17AM +0100, Thomas Weißschuh wrote:
> > On Sat, Nov 08, 2025 at 12:46:05AM +0100, Jason A. Donenfeld wrote:
> > > I'm not a huge fan of this change:
> > >
> > > On Thu, Nov 06, 2025 at 11:02:12AM +0100, Thomas Weißschuh wrote:
> > > > +static DEFINE_STATIC_KEY_FALSE(random_vdso_is_ready);
> > > >
> > > > /* Control how we warn userspace. */
> > > > static struct ratelimit_state urandom_warning =
> > > > @@ -252,6 +253,9 @@ static void random_vdso_update_generation(unsigned long next_gen)
> > > > if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
> > > > return;
> > > >
> > > > + if (!static_branch_likely(&random_vdso_is_ready))
> > > > + return;
> > > > +
> > > > /* base_crng.generation's invalid value is ULONG_MAX, while
> > > > * vdso_k_rng_data->generation's invalid value is 0, so add one to the
> > > > * former to arrive at the latter. Use smp_store_release so that this
> > > > @@ -274,6 +278,9 @@ static void random_vdso_set_ready(void)
> > > > if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
> > > > return;
> > > >
> > > > + if (!static_branch_likely(&random_vdso_is_ready))
> > > > + return;
> > > > +
> > > > WRITE_ONCE(vdso_k_rng_data->is_ready, true);
> > > > }
> > > >
> > > > @@ -925,6 +932,9 @@ void __init random_init(void)
> > > > _mix_pool_bytes(&entropy, sizeof(entropy));
> > > > add_latent_entropy();
> > > >
> > > > + if (IS_ENABLED(CONFIG_VDSO_GETRANDOM))
> > > > + static_branch_enable(&random_vdso_is_ready);
> > > > +
> > > > /*
> > > > * If we were initialized by the cpu or bootloader before jump labels
> > > > * or workqueues are initialized, then we should enable the static
> > > > @@ -934,8 +944,10 @@ void __init random_init(void)
> > > > crng_set_ready(NULL);
> > > >
> > > > /* Reseed if already seeded by earlier phases. */
> > > > - if (crng_ready())
> > > > + if (crng_ready()) {
> > > > crng_reseed(NULL);
> > > > + random_vdso_set_ready();
> > > > + }
> > >
> > > The fact that the vdso datapage is set up by the time random_init() is
> > > called seems incredibly contingent on init details. Why not, instead,
> > > make this a necessary part of the structure of vdso setup code, which
> > > can actually know about what happens when?
> >
> > The whole early init is "carefully" ordered in any case. I would have been
> > happy to allocate the data pages before the random initialization, but the
> > allocator is not yet usable by then.
> > We could also make the ordering more visible by having the vDSO datastore call
> > into a dedicated function to allow the random core to touch the data pages:
> > random_vdso_enable_datapages().
> >
> > > For example, one clean way of
> > > doing that would be to make vdso_k_rng_data always valid by having it
> > > initially point to __initdata memory, and then when it's time to
> > > initialize the real datapage, memcpy() the __initdata memory to the new
> > > specially allocated memory. Then we don't need the complex state
> > > tracking that this commit and the prior one introduce.
> >
> > Wouldn't that require synchronization between the update path and the memcpy()
> > path? Also if the pointer is going to change at some point we'll probably need
> > to use READ_ONCE()/WRITE_ONCE(). In general I would be happy about a cleaner
> > solution for this but didn't find a great one.
>
> This is still before userspace has started, and interrupts are disabled,
> so I don't think so?
Interrupts being disabled is a good point. But we are still leaking
implementation details from the random code into the vdso datastore.
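To make that boundary explicit, the dedicated hook I mentioned above could look
roughly like this (just a sketch; apart from random_vdso_enable_datapages() and
the symbols already used in the patch, the names and the call site are made up):

/* drivers/char/random.c -- sketch only, not the actual patch */
void random_vdso_enable_datapages(void)
{
	if (!IS_ENABLED(CONFIG_VDSO_GETRANDOM))
		return;

	/* The data pages exist now, allow the update paths to touch them. */
	static_branch_enable(&random_vdso_is_ready);

	/* Publish state that accumulated before the pages were available. */
	/* (a generation update would also be needed here) */
	if (crng_ready())
		random_vdso_set_ready();
}

/* vDSO datastore -- illustrative call site, after the pages are allocated */
static int __init vdso_datastore_init(void)
{
	/* ... allocate and map the vDSO data pages ... */
	random_vdso_enable_datapages();
	return 0;
}

That way the datastore decides when the pages may be touched and the random
core does not need to know anything about the allocation.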
> Also, you only care about being after
> mm_core_init(), right? So move your thing before sched_init() and then
> you'll really have nothing to worry about.
The call chains random_init_early() -> crng_reseed()/_credit_init_bits() could
still touch the datapage before it is allocated.
Adding conditionals to prevent exactly those accesses is essentially what my
patch does.
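For illustration, the flow I mean is roughly (heavily simplified, not the
actual code):

void __init random_init_early(const char *command_line)
{
	/* ... mix in arch/bootloader entropy ... */

	/*
	 * If enough entropy was credited at this point, crng_reseed() ends up
	 * in random_vdso_update_generation(), i.e. it would write to
	 * vdso_k_rng_data -- but mm_core_init() has not run yet, so the data
	 * pages do not exist.
	 */
	if (crng_ready())
		crng_reseed(NULL);
}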
> But I think globally I agree with Andy/Arnd -- this is kind of ugly and
> not worth it. Disable vDSO for these old CPUs with cache aliasing
> issues.
(I obviously still need to properly respond to Andy and Arnd)
The dynamic allocation does more than that:
1) It is preparation for mlockall() support for the datapages [0].
2) On the SPARC T4 Niagara on which this was tested, the MMU would raise an
exception whenever userspace accessed the mapped kernel .data page. Even
compensating for dcache aliasing did not prevent this, so aliasing is not the
cause. It is not clear why it happens; at some point I gave up investigating,
as point 1) would still hold true anyway.
[0] https://lore.kernel.org/lkml/20250901-vdso-mlockall-v2-0-68f5a6f03345@linutronix.de/
(This revision used 'struct page' with .data memory, but that doesn't
actually work everywhere)
Thomas