lists.openwall.net | Open Source and information security mailing list archives
Date: Sat, 18 Apr 2015 23:42:06 +0300
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] "Attack on the iterative compression function"

On Fri, Apr 17, 2015 at 10:05:47AM -0700, Bill Cox wrote:
> Here's the outrageous claim they make against Yescrypt:
>
> for 1/4 memory, Yescrypt has a 1/2 "Time-memory product"
>
> In their previous table, they say that at a 1/4 memory attack, the
> attacker must do 1135 times more computation.  The time*memory defense
> as used by the whole world other than the Argon team is therefore
> 283.75.  This paper is off by a factor of 567.5!
>
> I do not consider this a weakness of Yescrypt.  I wish the Argon team
> would start using proper terminology.

As I'm sure you realize, it's the difference between computation time
and best possible real time (assuming unlimited parallelism and no
latency overhead on putting that parallelism to use).  I think both
metrics make sense and are useful for us to consider.  I appreciate that
analyses by the Argon team list C(q) and D(q) separately.

Dmitry's "Tradeoff for Yescrypt" short paper does acknowledge that for
their factor of 6 memory reduction "in practice the computational
overhead will be probably too large".  Per the first table, it'll be
2^19 for a factor of 5, and it is not even listed for a factor of 6.

Thus, I am more concerned about the attack at lower TMTO ratios, 1/2 to
1/4 inclusive, where the required number of parallel cores isn't
necessarily prohibitive per se.  The latency overhead on putting those
cores to use (including the extra latency coming from the increased
memory bandwidth usage on needing to fetch, if I am correct, roughly
twice as much data at the 1/4 ratio) may be more of a problem, and may
defeat the attack in some cases.

BTW, I think it's 1/4 + 3/64 memory (the paper mentions this detail,
just not in the table), but that's close: 25% vs. 29.7%.

Another detail is how the average (rather than peak) memory usage is
reduced with TMTO.
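For concreteness, the arithmetic behind the figures above can be checked
in a few lines; a sketch using only the numbers quoted from the paper
and the discussion (the 1135 recomputation factor is the paper's table
entry for the 1/4-memory attack):

```python
# Time-memory product check for the 1/4-memory attack on yescrypt.
memory_fraction = 1 / 4       # nominal memory reduction from the paper
computation_factor = 1135     # recomputation penalty per the paper's table

# Computation * memory product relative to the no-TMTO baseline:
comp_mem_product = computation_factor * memory_fraction
print(comp_mem_product)       # 283.75, not the paper's claimed 1/2

# The paper's "time" is best-case parallel (depth) time, not total
# computation; the ratio between the two readings is the factor Bill
# calls being "off by":
print(comp_mem_product / 0.5)  # 567.5

# The actual memory fraction per the paper's text is 1/4 + 3/64:
print(1 / 4 + 3 / 64)          # 0.296875, i.e. ~29.7% vs. the nominal 25%
```

The discrepancy is thus one of terminology (depth vs. computation), not
of either side's arithmetic.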
The recomputation depth grows the longer the algorithm runs, so the
average (across the running time) is skewed towards a higher portion of
the TMTO'd memory size.  For yescrypt at t=0 without TMTO, the memory
fills up at 3/4 of the running time, and the average usage is 5/8.  If
the 1/4 + 3/64 fills up e.g. at 1/2 of the running time (and does so
linearly, even though this is certainly not the case), the average is
3/4 of that, which is a higher fraction of peak than the non-TMTO 5/8.
So we'd get (1/4 + 3/64)*3/4 / (5/8) = 35.6% average memory usage under
TMTO, which isn't as bad as the 25% where we started.  That's just an
example.  We might want to figure out the actual number.

I do consider yescrypt's susceptibility to beneficial (at least under a
simplified theoretical model) tradeoff attacks on its BlockMix a
drawback of the current yescrypt design.  I am thinking of addressing
this through documentation, recommended parameter values (for r and t),
and possibly a tweak (if permitted in PHC, or beyond PHC if yescrypt
isn't selected as a winner).

For example, it can be said (in revised documentation) that t=0 is
recommended when only computation*memory tradeoffs need to be defeated,
and t=1 or higher when latency*memory tradeoffs also need to be defeated
under the assumption that the attacker has abundant parallel computing
resources (but limited memory).

As to a possible tweak, I'd appreciate attacks on scrypt's shuffling and
on the reverse-order sub-blocks (as I suggested in another message).
While it can be said that these are attacked by storing additional
intermediate sub-blocks, that general statement isn't actionable enough
for me to choose the more effective mitigation, nor to document exactly
how effective or not it is.
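The averaging model used for the 35.6% figure can be reproduced
directly; a minimal sketch under the stated simplifying assumption of a
linear fill (the avg_linear_fill helper is purely illustrative, not
anything in yescrypt itself):

```python
def avg_linear_fill(fill_fraction, peak=1.0):
    """Average memory usage over the run, assuming usage grows linearly
    from 0 to `peak` during the first `fill_fraction` of the running
    time and then stays at `peak` for the remainder."""
    return fill_fraction * peak / 2 + (1 - fill_fraction) * peak

# yescrypt t=0, no TMTO: memory fills up at 3/4 of the running time.
baseline_avg = avg_linear_fill(3 / 4)
print(baseline_avg)                 # 0.625, i.e. the 5/8 above

# Hypothetical TMTO case from the text: peak of 1/4 + 3/64 of the
# original, assumed (for illustration only) to fill linearly by 1/2
# of the running time, giving an average of 3/4 of that peak.
tmto_avg = avg_linear_fill(1 / 2, 1 / 4 + 3 / 64)
print(tmto_avg / baseline_avg)      # 0.35625, i.e. ~35.6% average usage
```

As the text notes, the linear-fill assumption is certainly not exact, so
this only bounds the shape of the argument, not the actual number.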
Another option is to adopt Argon2's compression function and introduce
tunable parallelism (optional data dependencies) and pwxform rounds into
it, although I doubt I'd reach bcrypt-like frequency of S-box lookups
there (as I'd need to keep the BLAKE2b rounds as well, for full
diffusion, or otherwise this would be pointless).  If so, I'd also
replace the uses of Salsa20 with BLAKE2b, but even then the overall
complexity of a full implementation would increase (as Salsa20 would
also be needed when scrypt compatibility is compiled in).  So it's not
an option I like.  Something like this may be more appropriate for a
revision of Argon2 than for a revision of yescrypt.

Alexander