linux-kernel - Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

Message-ID: <b8ce64d0-ce51-4355-8c76-34df75617136@intel.com>
Date: Wed, 14 Feb 2024 12:14:46 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>,
 "Reshetova, Elena" <elena.reshetova@...el.com>
Cc: Theodore Ts'o <tytso@....edu>, Dave Hansen <dave.hansen@...ux.intel.com>,
 "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>,
 "x86@...nel.org" <x86@...nel.org>,
 Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>,
 "Nakajima, Jun" <jun.nakajima@...el.com>,
 Tom Lendacky <thomas.lendacky@....com>, "Kalra, Ashish"
 <ashish.kalra@....com>, Sean Christopherson <seanjc@...gle.com>,
 "linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] x86/random: Retry on RDSEED failure

On 2/14/24 09:21, Jason A. Donenfeld wrote:
> One clarifying question in all of this: what is the point of the "try 10
> times" advice? Is the "faster than the bus" statement actually "faster
> than the bus if you try 10 times"? Or is the "10 times" advice just old
> and not relevant.
> 
> In other words, is the following a reasonable patch?
> 
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..2d5bf5aa9774 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,22 +13,16 @@
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
>  
> -#define RDRAND_RETRY_LOOPS	10
> -
>  /* Unconditional execution of RDRAND and RDSEED */
>  
>  static inline bool __must_check rdrand_long(unsigned long *v)
>  {
>  	bool ok;
> -	unsigned int retry = RDRAND_RETRY_LOOPS;
> -	do {
> -		asm volatile("rdrand %[out]"
> -			     CC_SET(c)
> -			     : CC_OUT(c) (ok), [out] "=r" (*v));
> -		if (ok)
> -			return true;
> -	} while (--retry);
> -	return false;
> +	asm volatile("rdrand %[out]"
> +		     CC_SET(c)
> +		     : CC_OUT(c) (ok), [out] "=r" (*v));
> +	WARN_ON(!ok);
> +	return ok;
>  }

The key question here is whether RDRAND can ever fail on perfectly good hardware.

I think it's theoretically possible for the entropy source health checks
to fail on perfectly good hardware for an arbitrarily long time.  But
the odds of this happening to the point of it affecting RDRAND are
rather small.

There's a reason that the guidance says: "the odds of ten failures in a
row are astronomically small" _instead_ of claiming the same about a
single RDRAND.
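
To make the difference between the two claims concrete, here is a rough back-of-the-envelope sketch. It assumes each RDRAND attempt fails independently with the same probability p, which is a simplification (real failures may well be correlated), and the values of p and N are purely illustrative, not measured:

```latex
% Assumption: independent attempts, each failing with probability p.
\[
  P(\text{one failure}) = p,
  \qquad
  P(\text{ten failures in a row}) = p^{10}
\]
% Expected number of events over N calls:
\[
  E[\text{single failures}] = N p,
  \qquad
  E[\text{ten-in-a-row failures}] = N p^{10}
\]
% Illustrative numbers only: p = 10^{-3}, N = 10^{12}
\[
  N p = 10^{9},
  \qquad
  N p^{10} = 10^{12} \cdot 10^{-30} = 10^{-18}
\]
```

Under these assumed numbers, a single failure is something a large fleet would see constantly, while ten in a row would essentially never occur.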

Given the scale that the kernel operates at, I think we should leave the
retry loop in place.
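
For illustration, the bounded-retry pattern being kept can be sketched in user-space C as below. This is not the kernel code: `rand_retry`, `rand_source_fn`, `flaky_source`, and `flaky_next` are hypothetical names, and the random source is a stand-in callback rather than the real RDRAND instruction, so that the loop can be exercised without specific hardware. Only the retry count of 10 is taken from the quoted diff.

```c
#include <stdbool.h>
#include <stddef.h>

#define RETRY_LOOPS 10	/* mirrors RDRAND_RETRY_LOOPS in the quoted diff */

/* Hypothetical callback type: returns true and fills *v on success. */
typedef bool (*rand_source_fn)(void *ctx, unsigned long *v);

/*
 * Bounded retry: call the source up to `loops` times and stop at the
 * first success. Returns false only if every attempt failed.
 */
static bool rand_retry(rand_source_fn src, void *ctx, unsigned long *v,
		       unsigned int loops)
{
	while (loops--) {
		if (src(ctx, v))
			return true;
	}
	return false;
}

/* Stand-in source for testing: fails a fixed number of times, then succeeds. */
struct flaky_source {
	unsigned int failures_left;
	unsigned int calls;
};

static bool flaky_next(void *ctx, unsigned long *v)
{
	struct flaky_source *s = ctx;

	s->calls++;
	if (s->failures_left) {
		s->failures_left--;
		return false;
	}
	*v = 0x5eedUL;	/* fixed placeholder value, not real randomness */
	return true;
}
```

With a source that fails three times, `rand_retry(flaky_next, &src, &v, RETRY_LOOPS)` succeeds on the fourth call; with one that fails ten times in a row, it gives up and returns false, which is the case the guidance describes as astronomically unlikely on good hardware.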