[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHmME9oonMxxfEq7sjSSYc7XPwzjW4e45JTbBCJ2hFEbL-tnyw@mail.gmail.com>
Date: Tue, 7 Dec 2021 18:22:26 +0100
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Dominik Brodowski <linux@...inikbrodowski.net>
Cc: Hsin-Yi Wang <hsinyi@...omium.org>,
"Theodore Ts'o" <tytso@....edu>,
"Ivan T. Ivanov" <iivanov@...e.de>,
Ard Biesheuvel <ardb@...nel.org>, linux-efi@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5] random: fix crash on multiple early calls to add_bootloader_randomness()
Hi Dominik,
Thanks for the followup patch. Some comments below:
On Mon, Dec 6, 2021 at 9:58 PM Dominik Brodowski
<linux@...inikbrodowski.net> wrote:
> @@ -722,7 +722,8 @@ static void credit_entropy_bits(struct entropy_store *r, int nbits)
> if (r == &input_pool) {
> int entropy_bits = entropy_count >> ENTROPY_SHIFT;
>
> - if (crng_init < 2 && entropy_bits >= 128)
> + if (crng_init < 2 && entropy_bits >= 128 &&
> + crng_global_init_time > 0)
crng_global_init_time is being used because it's set in
rand_initialize(), and rand_initialize() is called after workqueues
have been set up. Fair enough. But:
a) It's still set to `jiffies - 1` afterwards at runtime, via
ioctl(RNDRESEEDCRNG). I'm not sure if we want to mess around with
having a 1:2**32 chance of getting this at the unlucky 0 wraparound.
It seems like that kind of strategy generally works if unlikely
failure is tolerable, but it seems like that's not a game to play
here. There are a few other options:
b) Use another proxy for the state, like
rand_initialize()->primary_crng.state (ugly) or something better.
c) Add another static global state variable or static branch to random.c.
d) Actually conditionalize this on whether or not workqueues have been
init'd, which is *actually* the thing you care about, not whether
rand_initialize() has fired.
I think that of these, (d) seems the most correct. The thing you're
*actually* trying to prevent is that
`schedule_work(&numa_crng_init_work)` call being triggered before
workqueues are up. It looks like system_wq is made non-NULL by
workqueue_init_early(); perhaps you could get away with
conditionalizing it on that instead?
This also seems like a distinct state machine derp from the rest of
the patch, so I wonder if it could be separated out into a more
trivial patch, in case it does prove problematic somehow.
> +#ifdef CONFIG_RANDOM_TRUST_BOOTLOADER
Can you use IF_ENABLED() like the code that this replaces? Easier to
read and ensures syntax doesn't regress on less-used configurations.
> void add_bootloader_randomness(const void *buf, unsigned int size)
> + unsigned long time = random_get_entropy() ^ jiffies;
> + unsigned long flags;
> +
> + if (!crng_ready() && size) {
> +#ifdef CONFIG_RANDOM_TRUST_BOOTLOADER
> + crng_fast_load(buf, size);
> +#else
> + crng_slow_load(buf, size);
> +#endif /* CONFIG_RANDOM_TRUST_BOOTLOADER */
> + }
> +
> + spin_lock_irqsave(&input_pool.lock, flags);
> + _mix_pool_bytes(&input_pool, buf, size);
> + _mix_pool_bytes(&input_pool, &time, sizeof(time));
> + spin_unlock_irqrestore(&input_pool.lock, flags);
> +
> +#ifdef CONFIG_RANDOM_TRUST_BOOTLOADER
> + credit_entropy_bits(&input_pool, size * 8);
> +#endif
> }
> EXPORT_SYMBOL_GPL(add_bootloader_randomness);
Trying to understand this and compare it to what was here before. Two
cases: trustbootloader and !trustbootloader.
trustbootloader looks like this:
unsigned long time = random_get_entropy() ^ jiffies;
unsigned long flags;
if (!crng_ready() && size)
crng_fast_load(buf, size);
spin_lock_irqsave(&input_pool.lock, flags);
_mix_pool_bytes(&input_pool, buf, size);
_mix_pool_bytes(&input_pool, &time, sizeof(time));
spin_unlock_irqrestore(&input_pool.lock, flags);
credit_entropy_bits(&input_pool, size * 8);
!trustbooloader looks like this:
unsigned long time = random_get_entropy() ^ jiffies;
unsigned long flags;
if (!crng_ready() && size)
crng_slow_load(buf, size);
spin_lock_irqsave(&input_pool.lock, flags);
_mix_pool_bytes(&input_pool, buf, size);
_mix_pool_bytes(&input_pool, &time, sizeof(time));
spin_unlock_irqrestore(&input_pool.lock, flags);
So this means !trustbootloader is the same as add_device_randomness
(with the call to trace_add_device_randomness removed). It'd probably
make sense to continue just branching to add_device_randomness on an
IS_ENABLED(), then, so it's more clear what's up, rather than open
coding the distinct paths and copying code around.
But trustbootloader is a different case. Here the differences appear to be:
1) crng_fast_load is now called for crng_init==1||crng_init==0 [via
crng_ready()] instead of for crng_init==0;
2) The function now continues onward after calling crng_fast_load
instead of returning;
3) The useless call to wait_event_interruptible is now skipped;
4) _mix_pool_bytes(random_get_entropy() ^ jiffies) is now called in
addition to _mix_pool_bytes(buf).
I'd now like to map (1), (2), (3), and (4) back to the rationale in
your commit message.
(2) seems to be related to:
> On the first call to add_hwgenerator_randomness(), crng_fast_load() is
> executed, and if the seed is long enough, crng_init will be set to 1.
> However, no entropy is currently credited for that, even though the
> name and description of CONFIG_RANDOM_TRUST_BOOTLOADER states otherwise.
But it seems like rather than going from:
if (!ready) {
crng_fast_load(buf, size);
return;
}
to:
if (!ready)
crng_fast_load(buf, size);
the actually corresponding fix would be to go instead to:
if (!ready)
crng_fast_load(buf, size);
if (!ready)
return;
(3) seems to be related to:
> wait_event_interruptible() (which makes no sense for the init process)
which is fine.
(1) and (4) I don't see justification for. Perhaps it's a mistake?
Regards,
Jason
Powered by blists - more mailing lists