Message-ID: <20140828140331.GA570@openwall.com>
Date: Thu, 28 Aug 2014 18:03:31 +0400
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] What Microsoft Would like from the PHC - Passwords14 presentation
On Thu, Aug 28, 2014 at 01:59:08PM +1000, Rade Vuckovac wrote:
> One recommendation to remedy some of the timing leaks is to keep
> everything in cache, which in practice is not a trivial task. The guess
> is that the mentioned remedy can be similarly obtained if everything is
> kept outside of the cache. That option is far from optimal in the AES
> case. However, in the password hashing realm (where optimisation is not
> imperative) the non-cacheable memory option may prove effective:
>
> 1st, it is relatively easy to allocate non-cacheable memory (compared
> with keeping everything in cache).
"Relatively" easy, maybe. But not easy at all, except as a
proof-of-concept running on a specific machine, under a specific OS, and
with specific privileges.
> It is not portable but it seems that every major OS supports that option.
This also involves hardware support, which might or might not be present
and usable for our purpose.
Non-cacheable memory is generally for things such as DMA, so it should
guarantee non-cacheable semantics, but not necessarily timings. There
might still be some caching as long as it's transparent semantics-wise
with respect to DMA to/from off-CPU components on a given platform.
Then, what about TLBs? There may still be TLB hits/misses for the pages
even if the data on those pages is not cacheable. "TLB timing attacks",
anyone? This could be worked around by using exactly one "huge page",
which can be up to 1 GB on x86-64, but again this is hardware platform
specific and it requires OS support, configuration, and privileges.
> 2nd, from both the attacker's and the defender's point of view, for
> memory that is accessed randomly and is significantly larger than the
> available caches (per the scrypt recommendation, for example),
> non-cacheable memory allocation might be more optimal than using the
> cache and frequently swapping cache contents with RAM.
I'd expect this to help only in terms of avoiding cache thrashing, which
is relevant if you have other frequently accessed data in the caches.
Normally, this is what you'd use "non-temporal" loads and stores for,
instead of going for non-cacheable memory. To clarify: no, use of
non-temporal loads and stores doesn't guarantee cache timing attack
resistance, as it bypasses only some rather than all levels of caches,
and also as it's merely a hint, which CPUs are free to honor or ignore.
(yescrypt uses non-temporal loads for the "ROM", to help keep portions
of the "RAM" cached. In my testing, this provides a speedup on AMD, but
not on Intel CPUs. For the "RAM", there's no speedup from non-temporal
loads given yescrypt's access pattern, on the CPUs I tested.)
Alexander