Message-ID: <202402202226.A6817927@keescook>
Date: Tue, 20 Feb 2024 22:33:39 -0800
From: Kees Cook <keescook@...omium.org>
To: Jeremy Linton <jeremy.linton@....com>,
	"Jason A. Donenfeld" <Jason@...c4.com>
Cc: linux-arm-kernel@...ts.infradead.org, catalin.marinas@....com,
	will@...nel.org, gustavoars@...nel.org, mark.rutland@....com,
	rostedt@...dmis.org, arnd@...db.de, broonie@...nel.org,
	guohui@...ontech.com, Manoj.Iyer@....com,
	linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
	James Yang <james.yang@....com>,
	Shiyou Huang <shiyou.huang@....com>
Subject: Re: [RFC] arm64: syscall: Direct PRNG kstack randomization

On Tue, Feb 20, 2024 at 08:02:58PM -0600, Jeremy Linton wrote:
> The existing arm64 stack randomization uses the kernel rng to acquire
> 5 bits of address space randomization. This is problematic because it
> creates non-determinism in the syscall path when the rng needs to be
> generated or reseeded. This shows up as large tail latencies in some
> benchmarks and directly affects the minimum RT latencies as seen by
> cyclictest.

Some questions:

- for benchmarks, why not disable kstack randomization? (see the
  boot-parameter note below)
- if the existing pRNG reseeding is a problem here, why isn't it a
  problem in the many other places it's used?
- I thought the pRNG already did out-of-line reseeding?
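
On the first question: for pure benchmarking runs it should be possible
to take this out of the picture entirely at boot time. If I'm
remembering the parameter name right, something like:

	# kernel command line, assuming CONFIG_RANDOMIZE_KSTACK_OFFSET=y
	randomize_kstack_offset=off

would remove the rng from the syscall path altogether, which would at
least tell us how much of the tail latency is really coming from
reseeding.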

> Other architectures are using timers/cycle counters for this function,
> which is sketchy from a randomization perspective because it should be
> possible to estimate this value from knowledge of the syscall return
> time, and from reading the current value of the timer/counters.

The expectation is that any such estimate would be, at best, unstable.
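
For comparison, IIRC the counter-based approach on other architectures
is just a one-liner feeding the cycle counter into the existing hook; an
arm64 equivalent would presumably look something like this (sketch only,
not what this patch proposes):

	/* hypothetical counter-based variant of the arm64 hook */
	choose_random_kstack_offset(read_sysreg(cntvct_el0));

i.e. cheap and stateless, but with exactly the predictability concern
described above.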

> So, even a poor rng should be better than the cycle counter, provided
> it is hard enough to extract stack offsets to detect the PRNG's
> period.
> 
> So, we can potentially choose a 'better' or larger PRNG, going as far
> as using one of the CSPRNGs already in the kernel, but the overhead
> increases accordingly. Further, there are a few options for
> reseeding, possibly out of the syscall path, but is it even useful in
> this case?

I'd love to find a way to avoid a pRNG that could be reconstructed
given enough samples. (But perhaps this xorshift RNG resists that?)
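
For what it's worth, the xorshift32 step itself is trivially invertible
(it's a linear map over GF(2)), so the real question is how much
protection the truncation to a few output bits buys. A quick userspace
sketch of the inverse, just to illustrate the point (not proposing this
go anywhere near the kernel):

	#include <stdint.h>

	/* Invert y = x ^ (x << s): x = y ^ (y << s) ^ (y << 2s) ^ ... */
	static uint32_t undo_shl_xor(uint32_t y, int s)
	{
		uint32_t x = y;

		for (int k = s; k < 32; k += s)
			x ^= y << k;
		return x;
	}

	/* Invert y = x ^ (x >> s) the same way. */
	static uint32_t undo_shr_xor(uint32_t y, int s)
	{
		uint32_t x = y;

		for (int k = s; k < 32; k += s)
			x ^= y >> k;
		return x;
	}

	/* Step the proposed generator backwards: undo the shifts in
	 * reverse order of the forward function. */
	static uint32_t xorshift32_prev(uint32_t state)
	{
		state = undo_shl_xor(state, 5);
		state = undo_shr_xor(state, 17);
		state = undo_shl_xor(state, 13);
		return state;
	}

So anyone who ever sees a full 32-bit state value can walk the sequence
in both directions. With only the masked offset bits visible per syscall
it becomes a matter of collecting enough samples and solving the
resulting linear system over GF(2) -- harder, but not obviously hard
enough.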

-Kees

> Reported-by: James Yang <james.yang@....com>
> Reported-by: Shiyou Huang <shiyou.huang@....com>
> Signed-off-by: Jeremy Linton <jeremy.linton@....com>
> ---
>  arch/arm64/kernel/syscall.c | 55 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a70d9746b66..70143cb8c7be 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -37,6 +37,59 @@ static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
>  	return syscall_fn(regs);
>  }
>  
> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
> +DEFINE_PER_CPU(u32, kstackrng);
> +static u32 xorshift32(u32 state)
> +{
> +	/*
> +	 * From top of page 4 of Marsaglia, "Xorshift RNGs"
> +	 * This algorithm is intended to have a period of 2^32 - 1
> +	 * and should not be used anywhere outside of this
> +	 * code path.
> +	 */
> +	state ^= state << 13;
> +	state ^= state >> 17;
> +	state ^= state << 5;
> +	return state;
> +}
> +
> +static u16 kstack_rng(void)
> +{
> +	u32 rng = raw_cpu_read(kstackrng);
> +
> +	rng = xorshift32(rng);
> +	raw_cpu_write(kstackrng, rng);
> +	return rng & 0x1ff;
> +}
> +
> +/* Should we reseed? */
> +static int kstack_rng_setup(unsigned int cpu)
> +{
> +	u32 rng_seed;
> +
> +	do {
> +		rng_seed = get_random_u32();
> +	} while (!rng_seed);
> +	raw_cpu_write(kstackrng, rng_seed);
> +	return 0;
> +}
> +
> +static int kstack_init(void)
> +{
> +	int ret;
> +
> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
> +				kstack_rng_setup, NULL);
> +	if (ret < 0)
> +		pr_err("kstack: failed to register rng callbacks.\n");
> +	return 0;
> +}
> +
> +arch_initcall(kstack_init);
> +#else
> +static u16 kstack_rng(void) { return 0; }
> +#endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
> +
>  static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>  			   unsigned int sc_nr,
>  			   const syscall_fn_t syscall_table[])
> @@ -66,7 +119,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
>  	 *
>  	 * The resulting 5 bits of entropy is seen in SP[8:4].
>  	 */
> -	choose_random_kstack_offset(get_random_u16() & 0x1FF);
> +	choose_random_kstack_offset(kstack_rng());
>  }
>  
>  static inline bool has_syscall_work(unsigned long flags)
> -- 
> 2.43.0
> 

-- 
Kees Cook
