linux-kernel - Re: is "premature next" a real world rng concern, or just an academic exercise?

Message-ID: <7EB51D84-90A4-4C97-9A81-14A8C32990F7@cornell.edu>
Date:   Wed, 11 May 2022 20:26:08 +0000
From:   Thomas Ristenpart <ristenpart@...nell.edu>
To:     Yevgeniy Dodis <dodis@...nyu.edu>
CC:     "Jason A. Donenfeld" <Jason@...c4.com>, tytso <tytso@....edu>,
        Nadia Heninger <nadiah@...ucsd.edu>,
        Noah Stephens-Dawidowitz <noahsd@...il.com>,
        Stefano Tessaro <tessaro@...washington.edu>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
        "D. J. Bernstein" <djb@...yp.to>,
        "jeanphilippe.aumasson@...il.com" <jeanphilippe.aumasson@...il.com>,
        "jann@...jh.net" <jann@...jh.net>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        Peter Schwabe <peter@...ptojedi.org>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: is "premature next" a real world rng concern, or just an academic
 exercise?

Hi all,
Thanks to Jason for CC’ing me on this fascinating thread. I was chatting with Jason and Nadia at RWC, and was also advocating for simplifying the design of /dev/urandom. My two cents at a high level:

1) I haven’t seen any compelling evidence that premature next attacks are a realistic threat.
2) Countermeasures against premature next complicate designs, slow down useful entropy accumulation, and need to be worked around to avoid exacerbating more pressing issues.



On vulnerabilities:

My and others’ prior work on VM reset threat models [1] showed pretty convincingly via experiments that the /dev/(u)random pooled design leads to problems in settings where a VM is replayed multiple times from the same (full-memory) snapshot. In large part this is due to the pooling and entropy estimation mechanisms in the older design. The vulnerabilities were also demonstrated in the FreeBSD and Windows 7 RNGs, however, presumably due to subtleties in their reseed scheduling. The vulnerabilities described in the academic papers can be exploited in lab settings, but whether they are a pressing concern in practice remains unclear to me. At least, I haven’t heard any evidence of VM resets being exploited by threat actors. Nevertheless, I think they should be fixed. Ted graciously chatted with us about it back then, but he concluded our particular suggested design was too slow (a fair assessment).

VM resets also affect userland processes (in fact possibly even more so than the kernel), but that is perhaps off-topic. 

Boot time entropy holes that lead to factorable RSA public keys, as per Nadia’s and her collaborators’ prior work, do seem to be of pretty immediate practical risk!

Unlike the above, vulnerability to premature next attacks has no supporting experimental evidence that I’m aware of. (We refer to these as “tracking” attacks in [1].) As Jason pointed out and Yevgeniy added to, these require at least a compromise of RNG state (A in his labeling), and continued access to RNG outputs after losing access to ongoing RNG state (B in his labeling). But I would like to emphasize a slightly more granular set of requirements for the kinds of premature next attacks that Fortuna and other multi-pool mechanisms protect against, building off Jason and others’ observations:

A) RNG state compromise
A’) Attacker silently loses access to the RNG state, not due to a reboot or VM resumption
B) Attacker maintains continuous tap on RNG outputs 

I really think step A’ is a critical point here. By silently I mean the adversary somehow loses access without administrators knowing about the compromise, and not due to an event that anyway reinitializes the RNG (reboot or VM resumption). This is critical for the threat model to make sense, since otherwise, if the admin knows about the compromise, cleanup needs to occur, which should include a reboot. Or if the attacker loses access but we’re reinitializing the RNG anyway, then we’re also fine. I don’t know of any credible situations where A + A’ + B arise.
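
For concreteness, the mechanics of a tracking attack once A and B both hold can be sketched in a toy model. This is only an illustration under stated assumptions: the hash-based `mix` and `output` functions, the 8-bit samples, and all names below are hypothetical and do not correspond to any kernel code.

```python
import hashlib
import secrets


def mix(state: bytes, sample: bytes) -> bytes:
    """Toy reseed: fold one entropy sample into the state."""
    return hashlib.blake2s(state + sample).digest()


def output(state: bytes) -> bytes:
    """Toy output function: derive a block without exposing the state."""
    return hashlib.blake2s(b"out" + state).digest()


def track(known_state: bytes, observed: bytes, sample_bits: int):
    """Attacker step: enumerate every possible sample and keep the
    candidate state whose output matches what was observed."""
    for guess in range(2 ** sample_bits):
        candidate = mix(known_state, guess.to_bytes(2, "big"))
        if output(candidate) == observed:
            return candidate
    return None


def simulate(rounds: int = 20, sample_bits: int = 8) -> bool:
    """Return True if the attacker still knows the state after `rounds`
    low-entropy reseeds (sample_bits must be at most 16 here)."""
    state = secrets.token_bytes(32)
    attacker_view = state  # (A): one-time state compromise
    for _ in range(rounds):
        sample = secrets.randbelow(2 ** sample_bits).to_bytes(2, "big")
        state = mix(state, sample)  # reseed with only a few fresh bits
        # (B): the attacker sees the next output and re-synchronizes
        attacker_view = track(attacker_view, output(state), sample_bits)
        if attacker_view is None:
            return False
    return attacker_view == state
```

Each reseed costs the attacker at most 2^sample_bits guesses, so many small injections never shake them off. If the same total entropy were folded in at once (say 128 bits or more), the enumeration in `track` would become infeasible.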



On designs:

Fortuna, Whirlwind (from [1]), Yarrow, and other multi-pool designs defend against premature next in a setting where: (1) you don’t know when a compromise starts and ends; and (2) you can’t generate entropy on demand. I don’t think either assumption is really reflective of practice: see the discussion above for why (1) doesn’t seem to come up in threat models, and for (2), CPU jitter dances seem to work pretty well (though performance is a question).

Note that since we round-robin entropy measurements to different pools, these multi-pool designs can exacerbate the problem of quickly (re-)initializing during boot or VM resumption. I actually do not know how the choice and parameterization of Fortuna or Yarrow affects the window of vulnerability in VM reset settings; we never measured this explicitly. But in any case I think we need an explicit, fast reinitialization step to be safe here (as discussed in the Whirlwind design and Jason’s emails), and so a redesign should allow for fast entropy gathering loops, and those loops should immediately dump all their entropy into the RNG state for use. Thus you would need to have a code path that avoids round-robining to slow pools in a (re)initialization routine.
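
The dilution caused by round-robining can be made concrete with a small counting sketch. The function names and the threshold of 64 samples are hypothetical, and the reseed schedule (pool i contributes to reseed r only when 2^i divides r) is Fortuna's published schedule as I understand it, not something measured on any real system.

```python
def samples_until_first_reseed(n_pools: int, threshold: int) -> int:
    """Count the samples consumed before pool 0 holds `threshold` samples
    when incoming samples are dealt round-robin across `n_pools` pools."""
    pools = [0] * n_pools
    consumed = 0
    while pools[0] < threshold:
        pools[consumed % n_pools] += 1
        consumed += 1
    return consumed


def pools_in_reseed(r: int, n_pools: int = 32) -> list:
    """Fortuna-style schedule: pool i is drained on reseed number r
    (counting from 1) only if 2**i divides r."""
    return [i for i in range(n_pools) if r % (2 ** i) == 0]


# A single pool needs 64 samples; 32 round-robin pools need 2017.
single = samples_until_first_reseed(1, 64)
multi = samples_until_first_reseed(32, 64)
```

Under these assumptions the first reseed after boot or VM resumption waits roughly 32 times longer, and it then draws on pool 0 alone, which is why a (re)initialization path that bypasses the round-robin seems necessary.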

To me, the high-level design features that seem to check all the boxes, including, importantly, simplicity, are:

1) A single pool where opportunistic entropy measurements (interrupt timings, etc.) are folded in and that is used to generate outputs.
2) An explicit “generate entropy” routine that attempts to quickly generate a large amount of entropy. Use this to (re)initialize the state upon system events like boot and VM resumption. The CPU jitter dance type mechanisms are a good bet, though someone should probably check that these work on low-end systems.

Also I would advocate always folding in other sources of entropy (e.g., RDRAND) when available, performance allowing, in both 1 and 2. Given the above discussion, I don’t think it’s very important, but an extension of the above to provide some limitation of premature next concerns would be:

3) Periodically call 2. For example, when a CPU is otherwise idle. This would have the same effect as Fortuna-style approaches without adding new buffers, etc.
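
A rough sketch of how features 1 through 3 might fit together is below. Everything here is an assumption for illustration: the class and method names are hypothetical, BLAKE2s stands in for whatever extraction function a real design would use, and the timestamp loop is only a placeholder for a CPU jitter dance, not a vetted entropy source.

```python
import hashlib
import time


class SinglePoolRng:
    """Toy single-pool RNG with an explicit (re)initialization routine."""

    def __init__(self):
        self.pool = hashlib.blake2s(b"toy-rng-init").digest()
        self.counter = 0
        self.generate_entropy()  # boot-time initialization

    def mix(self, data: bytes) -> None:
        """(1) Fold an opportunistic measurement straight into the pool."""
        self.pool = hashlib.blake2s(self.pool + data).digest()

    def generate_entropy(self, samples: int = 256) -> None:
        """(2) Fast gathering loop; every sample goes into the pool
        immediately (placeholder for a jitter-based source)."""
        for _ in range(samples):
            self.mix(time.perf_counter_ns().to_bytes(8, "little"))

    def on_vm_resume(self, generation_id: bytes) -> None:
        """Reinitialize on resumption; generation_id is a hypothetical
        stand-in for a hypervisor-provided unique value."""
        self.mix(generation_id)
        self.generate_entropy()

    def on_idle(self) -> None:
        """(3) Periodic call to (2), e.g. when a CPU is otherwise idle."""
        self.generate_entropy()

    def read(self, n: int) -> bytes:
        """Produce n output bytes, then ratchet the pool forward."""
        out = b""
        while len(out) < n:
            self.counter += 1
            block = self.pool + self.counter.to_bytes(8, "little")
            out += hashlib.blake2s(block, person=b"out").digest()
        self.pool = hashlib.blake2s(self.pool, person=b"ratchet").digest()
        return out[:n]
```

The point of the sketch is the control flow rather than the primitives: there is one pool, opportunistic samples and the explicit routine both feed it directly, and system events trigger the explicit routine without waiting on any slower accumulation path.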

Details would need to be worked out, of course. Hope this was helpful and apologies that it got long,

Cheers,
  -Tom


[1] https://rist.tech.cornell.edu/papers/vmrng.pdf

> On May 9, 2022, at 11:55 AM, Yevgeniy Dodis <dodis@...nyu.edu> wrote:
> 
> resending in plain text... (hope got it right)
> 
> On Mon, May 9, 2022 at 11:15 AM Yevgeniy Dodis <dodis@...nyu.edu> wrote:
>> 
>> Hi Jason and all.
>> 
>> Thank you for starting this fascinating discussion. I generally agree with everything Jason said. In particular, I am not
>> 100% convinced that the extra cost of the premature next defense is justified. (Although Windows and MacOS are adamant it is
>> worth it :).)
>> 
>> But let me give some meta points to at least convince you this is not as obvious as Jason makes it sound.
>> 
>> 1) Attacking RNGs in any model is really hard. Heck, everybody knew for years that /dev/random is a mess
>> (and we published it formally in 2013, although this was folklore knowledge),  but in all these years nobody
>> (even Nadia's group :)) managed to find a practical attack. So just because the attack seems far-fetched, I do not think we should
>> lower our standards and do ugly stuff. Otherwise, just leave /dev/random the way it was before Jason started his awesome work.
>> 
>> 2) As Jason says, there are two distinct attack vectors needed to make the premature next attack.
>> A) compromising the state
>> B) (nearly) continuously observing RNG outputs
>> 
>> I agree with Jason's point that finding places where
>> -- A)+B) is possible, but
>> --- A)+A) is not possible,
>> is tricky. Although Nadia kind of indicated a place like that. VM1 and VM2 start with the same RNG state (for whatever
>> reason). VM1 is insecure, so can leak the state via A). VM2 is more secure, but obviously allows for B) through system
>> interface. This does not seem so hypothetical for me, especially in light of my mega-point 1) above -- almost any real-world
>> RNG attack is hard.
>> 
>> But I want to look at it from a different angle here. Let's ask if RNGs should be secure against A) or B) individually.
>> 
>> I think everybody agrees protection from B) is a must. This is the most basic definition of RNG! So let's just take it as
>> an axiom.
>> 
>> Protection against A) is trickier. But my read of Jason's email is that all his criticism comes exactly from this point.
>> If your system allows for state compromise, you have bigger problems than the premature next, etc. But let's ask ourselves
>> the question. Are we ready to design RNGs without recovery from state compromise? I believe nobody on this list would
>> be comfortable saying "yes". Because this would mean we don't need to accumulate entropy beyond system start-up.
>> Once we reach the point of good initial state, and state compromise is not an issue, just use straight ChaCha or whatever other
>> stream cipher.
>> 
>> The point is, despite all arguments Jason puts, we all would feel extremely uncomfortable/uneasy to let continuous
>> entropy accumulation go, right?
>> 
>> This means we all hopefully agree that we need protection against A) and B) individually.
>> 
>> 3) Now comes the question. If we want to design a sound RNG using tools of modern cryptography, and we allow
>> the attacker an individual capability to enforce A) or B) individually, are we comfortable with the design where we:
>> * offer protection against A)
>> * offer protection against B)
>> * do NOT offer protection against A)+B), because we think it's too expensive given A)+B) is so rare?
>> 
>> I do not have a convincing answer to this question, but it is at least not obvious to me. On a good note, one worry
>> we might have is how to even have a definition protecting A), protecting B), but not protecting A)+B).
>> Fortunately, our papers resolve this question (although there are still theoretical annoyances which I do not
>> want to get into in this email). So, at least from this perspective, we are good. We have a definition with
>> exactly these (suboptimal) properties.
>> 
>> Anyway, these are my 2c.
>> Thoughts?
>> 
>> Yevgeniy
>> 
>> On Sun, May 1, 2022 at 7:17 AM Jason A. Donenfeld <Jason@...c4.com> wrote:
>>> 
>>> Hi Ted,
>>> 
>>> That's a useful analysis; thanks for that.
>>> 
>>> On Sat, Apr 30, 2022 at 05:49:55PM -0700, tytso wrote:
>>>> On Wed, Apr 27, 2022 at 03:58:51PM +0200, Jason A. Donenfeld wrote:
>>>>> 
>>>>> 3) More broadly speaking, what kernel infoleak is actually acceptable to
>>>>>   the degree that anybody would feel okay in the first place about the
>>>>>   system continuing to run after it's been compromised?
>>>> 
>>>> A one-time kernel infoleak where this might seem most likely is one
>>>> where memory is read while the system is suspended/hibernated, or if
>>>> you have a VM which is frozen and then replicated.  A related version
>>>> is one where a VM is getting migrated from one host to another, and
>>>> the attacker is able to grab the system memory from the source "host"
>>>> after the VM is migrated to the destination "host".
>>> 
>>> You've identified ~two places where compromises happen, but it's not an
>>> attack that can just be repeated simply by re-running `./sploit > state`.
>>> 
>>> 1) Virtual machines:
>>> 
>>> It seems like after a VM state compromise during migration, or during
>>> snapshotting, the name of the game is getting entropy into the RNG in a
>>> usable way _as soon as possible_, and not delaying that. This is
>>> Nadia's point. There's some inherent tension between waiting some amount
>>> of time to use all available entropy -- the premature next requirement
>>> -- and using everything you can as fast as you can because your output
>>> stream is compromised/duplicated and that's very bad and should be
>>> mitigated ASAP at any expense.
>>> 
>>> [I'm also CC'ing Tom Ristenpart, who's been following this thread, as he
>>> did some work regarding VM snapshots and compromise, and what RNG
>>> recovery in that context looks like, and arrived at pretty similar
>>> points.]
>>> 
>>> You mentioned virtio-rng as a mitigation for this. That works, but only
>>> if the data read from it are actually used rather quickly. So probably
>>> /waiting/ to use that is suboptimal.
>>> 
>>> One of the things added for 5.18 is this new "vmgenid" driver, which
>>> responds to fork/snapshot notifications from hypervisors, so that VMs
>>> can do something _immediately_ upon resumption/migration/etc. That's
>>> probably the best general solution to that problem.
>>> 
>>> Though vmgenid is supported by QEMU, VMware, Hyper-V, and hopefully soon
>>> Firecracker, there'll still be people that don't have it for one reason
>>> or another (and it has to be enabled manually in QEMU with `-device
>>> vmgenid,guid=auto`; perhaps I should send a patch adding that to some
>>> default machine types). Maybe that's their problem, but I take as your
>>> point that we can still try to be less bad than otherwise by using more
>>> entropy more often, and not delaying as the premature next model
>>> requirements would have us do.
>>> 
>>> 2) Suspend / hibernation:
>>> 
>>> This is kind of the same situation as virtual machines, but the
>>> particulars are a little bit different:
>>> 
>>>  - There's no hypervisor giving us new seed material on resumption like
>>>    we have with VM snapshots and vmgenid; but
>>> 
>>>  - We also always know when it happens, because it's not transparent to
>>>    the OS, so at least we can attempt to do something immediately like
>>>    we do with the vmgenid driver.
>>> 
>>> Fortunately, most systems that are doing suspend or hibernation these
>>> days also have a RDRAND-like thing. It seems like it'd be a good idea
>>> for me to add a PM notifier, mix into the pool both
>>> ktime_get_boottime_ns() and ktime_get(), in addition to whatever type
>>> info I get from the notifier block (suspend vs hibernate vs whatever
>>> else) to account for the amount of time in the sleeping state, and then
>>> immediately reseed the crng, which will pull in a bunch of
>>> RDSEED/RDRAND/RDTSC values. This way on resumption, the system is always
>>> in a good place.
>>> 
>>> I did this years ago in WireGuard -- clearing key material before
>>> suspend -- and there are some details around autosuspend (see
>>> wg_pm_notification() in drivers/net/wireguard/device.c), but it's not
>>> that hard to get right, so I'll give it a stab and send a patch.
>>> 
>>>> But if the attacker can actually obtain internal state from one
>>>> reconstituted VM, and use that to attack another reconstituted VM, and
>>>> the attacker also knows what the nonce or time seed that was used so
>>>> that different reconstituted VMs will have unique CRNG streams, this
>>>> might be a place where the "premature next" attack might come into
>>>> play.
>>> 
>>> This is the place where it matters, I guess. It's also where the
>>> tradeoffs from Nadia's argument come into play. System state gets
>>> compromised during VM migration / hibernation. It comes back online and
>>> starts doling out compromised random numbers. Worst case scenario is
>>> there's no RDRAND or vmgenid or virtio-rng, and we've just got the good
>>> old interrupt handler mangling cycle counters. Choices: A) recover from
>>> the compromise /slowly/ in order to mitigate premature next, or B)
>>> recover from the compromise /quickly/ in order to prevent things like
>>> nonce reuse.
>>> 
>>> What is more likely? That an attacker who compromised this state at one
>>> point in time doesn't have the means to do it again elsewhere in the
>>> pipeline, will use a high bandwidth /dev/urandom output stream to mount
>>> a premature next attack, and is going after a high value target that
>>> inexplicably doesn't have RDRAND/vmgenid/virtio-rng enabled? Or that
>>> Nadia's group (or that large building in Utah) will get an Internet tap
>>> and simply start looking for repeated nonces to break?
>>> 
>>> Jason