Message-ID: <52A2842D-AC23-4321-B06B-CDA082183862@zytor.com>
Date: Thu, 01 Feb 2024 11:02:32 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Dave Hansen <dave.hansen@...el.com>,
        "Jason A. Donenfeld" <Jason@...c4.com>,
        "Theodore Ts'o" <tytso@....edu>,
        "Reshetova, Elena" <elena.reshetova@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>
CC: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>,
        Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Nakajima, Jun" <jun.nakajima@...el.com>,
        Tom Lendacky <thomas.lendacky@....com>,
        "Kalra, Ashish" <ashish.kalra@....com>,
        Sean Christopherson <seanjc@...gle.com>,
        "linux-coco@...ts.linux.dev" <linux-coco@...ts.linux.dev>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
On February 1, 2024 10:46:06 AM PST, Dave Hansen <dave.hansen@...el.com> wrote:
>On 2/1/24 10:09, Jason A. Donenfeld wrote:
>> Question ii) Just how DoS-able is RDRAND? From host to guest, where
>> the host controls scheduling, that seems easier, but how much so, and
>> what's the granularity of these operations, and could retries still
>> help, or not at all? What about from guest to guest, where the
>> scheduling is out of control; in that case is there a value of N for
>> which N retries makes it actually impossible to DoS? What about from
>> userspace to kernelspace; good value of N?
>
>So far, in practice, I haven't seen a single failure of RDRAND.  It's
>been limited to RDSEED.  In a perfect world, I'd change the architecture
>docs to say, "RDRAND only fails when the hardware breaks" and leave
>RDSEED defined to be the one that fails easily.
>
>Dealing with a fragile RDSEED seems like a much easier problem than
>dealing with a fragile RDRAND since RDSEED is used _much_ more sparingly
>in the kernel today.
>
>But I'm not sure if the hardware implementations fit into this perfect
>world I've conjured up.  We're going to wrangle up the folks at Intel
>who can hopefully tell me if I'm totally deluded.
>
>Has anyone seen RDRAND failures in practice?  Or just RDSEED?
>
>> Question iii) How likely is Intel to actually fix this in a
>> satisfactory way (see "specifying this is an interesting question" in
>> [1])? And if they would, what would the timeline even be?
>
>If the fix is pure documentation, it's on the order of months.  I'm
>holding out hope that some kind of anti-DoS claims like you mentioned:
>
>> Specifying this is an interesting question. What exactly might our
>> requirements be for a "non-broken" RDRAND? It seems like we have two
>> basic ones:
>> 
>> - One VMX (or host) context can't DoS another one.
>> - Ring 3 can't DoS ring 0.
>
>are still possible on existing hardware, at least for RDRAND.
The real question is: what do we actually need?

During startup, we can afford a *lot* of looping to collect enough entropy before giving up. After that, even if RDSEED fails 99% of the time, it will still produce far more entropy than a typical external randomness source. Obviously we don't want to loop that long once we are up and running (*); instead we can try periodically and let the entropy accumulate.

(*) We *could* of course choose to loop aggressively in task context if the task would otherwise block on /dev/random.
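
A minimal sketch of that two-phase policy, purely for illustration and not taken from the kernel: loop hard at boot, then make one cheap attempt per periodic tick and let successes accumulate. Every name here (seed_fn, seed_pool, collect_seed, seed_at_boot, seed_periodic, flaky_seed) and both retry limits are hypothetical. The seed source is a callback so the sketch builds on any machine; on x86 it could wrap RDSEED. The "pool" is a toy rotate-and-XOR mixer, where a real one would use a cryptographic hash.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical seed source: returns true and stores a sample on success. */
typedef bool (*seed_fn)(uint64_t *out, void *ctx);

/* Illustrative limits, chosen arbitrarily for this sketch. */
#define BOOT_MAX_TRIES    100000u  /* startup: we can afford a lot of looping */
#define RUNTIME_MAX_TRIES 1u       /* afterwards: one attempt per tick */

struct seed_pool {
    uint64_t mix;    /* toy accumulator, not a real entropy pool */
    size_t   words;  /* number of samples mixed in so far */
};

/* Try up to max_tries times to mix in `want` samples; return how many we got. */
static size_t collect_seed(struct seed_pool *pool, seed_fn fn, void *ctx,
                           size_t want, unsigned max_tries)
{
    size_t got = 0;

    for (unsigned i = 0; i < max_tries && got < want; i++) {
        uint64_t v;

        if (fn(&v, ctx)) {
            /* Rotate left by 7, then fold in the new sample. */
            pool->mix = ((pool->mix << 7) | (pool->mix >> 57)) ^ v;
            pool->words++;
            got++;
        }
    }
    return got;
}

/* Boot phase: loop aggressively, give up only after BOOT_MAX_TRIES. */
static bool seed_at_boot(struct seed_pool *pool, seed_fn fn, void *ctx,
                         size_t want)
{
    return collect_seed(pool, fn, ctx, want, BOOT_MAX_TRIES) == want;
}

/* Steady state: called from a periodic tick; a failure is simply skipped. */
static size_t seed_periodic(struct seed_pool *pool, seed_fn fn, void *ctx)
{
    return collect_seed(pool, fn, ctx, 1, RUNTIME_MAX_TRIES);
}

/* Stand-in source that succeeds once every `period` calls; period = 100
 * models a source that fails 99% of the time. */
struct flaky_src {
    unsigned calls;
    unsigned period;
};

static bool flaky_seed(uint64_t *out, void *ctx)
{
    struct flaky_src *s = ctx;

    s->calls++;
    if (s->calls % s->period != 0)
        return false;
    *out = 0x9e3779b97f4a7c15ull * s->calls;  /* placeholder value, not random */
    return true;
}
```

With period = 100, seed_at_boot() needs 400 attempts to gather four samples, and 1000 subsequent ticks of seed_periodic() still add ten more, which is the "try periodically and let it accumulate" behaviour described above. The aggressive task-context loop from the footnote would amount to calling collect_seed() with a larger max_tries.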