[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202402202226.A6817927@keescook>
Date: Tue, 20 Feb 2024 22:33:39 -0800
From: Kees Cook <keescook@...omium.org>
To: Jeremy Linton <jeremy.linton@....com>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: linux-arm-kernel@...ts.infradead.org, catalin.marinas@....com,
will@...nel.org, gustavoars@...nel.org, mark.rutland@....com,
rostedt@...dmis.org, arnd@...db.de, broonie@...nel.org,
guohui@...ontech.com, Manoj.Iyer@....com,
linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
James Yang <james.yang@....com>,
Shiyou Huang <shiyou.huang@....com>
Subject: Re: [RFC] arm64: syscall: Direct PRNG kstack randomization
On Tue, Feb 20, 2024 at 08:02:58PM -0600, Jeremy Linton wrote:
> The existing arm64 stack randomization uses the kernel rng to acquire
> 5 bits of address space randomization. This is problematic because it
> creates non determinism in the syscall path when the rng needs to be
> generated or reseeded. This shows up as large tail latencies in some
> benchmarks and directly affects the minimum RT latencies as seen by
> cyclictest.
Some questions:
- for benchmarks, why not disable kstack randomization?
- if the existing pRNG reseeding is a problem here, why isn't it a
problem in the many other places it's used?
- I though the pRNG already did out-of-line reseeding?
> Other architectures are using timers/cycle counters for this function,
> which is sketchy from a randomization perspective because it should be
> possible to estimate this value from knowledge of the syscall return
> time, and from reading the current value of the timer/counters.
The expectation is that it would be, at best, unstable.
> So, a poor rng should be better than the cycle counter if it is hard
> to extract the stack offsets sufficiently to be able to detect the
> PRNG's period.
>
> So, we can potentially choose a 'better' or larger PRNG, going as far
> as using one of the CSPRNGs already in the kernel, but the overhead
> increases appropriately. Further, there are a few options for
> reseeding, possibly out of the syscall path, but is it even useful in
> this case?
I'd love to find a way to avoid an pRNG that could be reconstructed
given enough samples. (But perhaps this xorshift RNG resists that?)
-Kees
> Reported-by: James Yang <james.yang@....com>
> Reported-by: Shiyou Huang <shiyou.huang@....com>
> Signed-off-by: Jeremy Linton <jeremy.linton@....com>
> ---
> arch/arm64/kernel/syscall.c | 55 ++++++++++++++++++++++++++++++++++++-
> 1 file changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a70d9746b66..70143cb8c7be 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -37,6 +37,59 @@ static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
> return syscall_fn(regs);
> }
>
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> +DEFINE_PER_CPU(u32, kstackrng);
> +static u32 xorshift32(u32 state)
> +{
> + /*
> + * From top of page 4 of Marsaglia, "Xorshift RNGs"
> + * This algorithm is intended to have a period 2^32 -1
> + * And should not be used anywhere else outside of this
> + * code path.
> + */
> + state ^= state << 13;
> + state ^= state >> 17;
> + state ^= state << 5;
> + return state;
> +}
> +
> +static u16 kstack_rng(void)
> +{
> + u32 rng = raw_cpu_read(kstackrng);
> +
> + rng = xorshift32(rng);
> + raw_cpu_write(kstackrng, rng);
> + return rng & 0x1ff;
> +}
> +
> +/* Should we reseed? */
> +static int kstack_rng_setup(unsigned int cpu)
> +{
> + u32 rng_seed;
> +
> + do {
> + rng_seed = get_random_u32();
> + } while (!rng_seed);
> + raw_cpu_write(kstackrng, rng_seed);
> + return 0;
> +}
> +
> +static int kstack_init(void)
> +{
> + int ret;
> +
> + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
> + kstack_rng_setup, NULL);
> + if (ret < 0)
> + pr_err("kstack: failed to register rng callbacks.\n");
> + return 0;
> +}
> +
> +arch_initcall(kstack_init);
> +#else
> +static u16 kstack_rng(void) { return 0; }
> +#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
> +
> static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
> unsigned int sc_nr,
> const syscall_fn_t syscall_table[])
> @@ -66,7 +119,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
> *
> * The resulting 5 bits of entropy is seen in SP[8:4].
> */
> - choose_random_kstack_offset(get_random_u16() & 0x1FF);
> + choose_random_kstack_offset(kstack_rng());
> }
>
> static inline bool has_syscall_work(unsigned long flags)
> --
> 2.43.0
>
--
Kees Cook
Powered by blists - more mailing lists