phc-discussions - Re: [PHC] Another PHC candidates "mechanical" tests (ROUND2)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <55144882.4030501@larc.usp.br>
Date: Thu, 26 Mar 2015 14:57:22 -0300
From: Marcos Simplicio <mjunior@...c.usp.br>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Another PHC candidates "mechanical" tests (ROUND2)

On 26-Mar-15 13:30, Bill Cox wrote:
> On Thu, Mar 26, 2015 at 8:27 AM, Marcos Simplicio <mjunior@...c.usp.br>
> wrote:
> 
>>
>>>> Or more generic question: what about embedded world?
>>>>
>>>> PHC seems to be mainly about online services, where you have some big
>>>> Intel server hosting behind it but I would like to not forget about
>> embedded
>>>> systems.
>>>
>>> You simply scale down the cost settings accordingly.  yescrypt is
>>> designed to achieve decent attack resistance even at low settings (of
>>> course, I mean decent for those settings), including below 1 MB.
>>>
>>> Alternatively, there are PHC finalists that are well-suited specifically
>>> for such uses - that's Pufferfish and maybe POMELO - but yescrypt has
>>> the advantage of being a single scheme that is well-suited across the
>>> range from KBs to TBs (and beyond, when relevant).
>>>
>>> Lyra2 is less suitable for low sizes like this.
>>>
>>
>> Just for the sake of clarity: why exactly? One can use Lyra2 with
>> small-state sponges (e.g., Blake2s or even something smaller), for
>> example, and the Lyra2 wrapper around this sponge would provide similar
>> protection against attacks involving TMTO (if that is relevant) or
>> parallel attacks (e.g., with bandwidth usage).
>>
>> I mean, without further arguments, I'm not sure I can provide
>> counter-arguments... :)
>>
>> Anyhow we do have some research initiatives in our department in which
>> we are planning to use Lyra2 on embedded systems and see interesting
>> parameters/underlying sponges.
>>
>>
>> Marcos.
>>
> 
> I need to go carefully read the latest version, but IIRC, Lyra2 still does
> no small random reads in it's inner loop, right?  Once you get down to L1
> cache sized hashing, GPUs will dominate over CPUs, unless we do something
> GPUs are not good at.  While this may not always be true, currently GPUs
> are very slow at doing rapid small unpredictable reads. 

Well, we did include in Lyra2's core the extension that would read
rapidly and unpredictably from rows prev^0 and prev^1 (that are likely
in in L1 cache), but sincerely we could not see much of a difference in
modern GPUs coming specifically from this reading pattern (even though
we did observe slowdowns in an older GPU). We decided to keep the tweak
anyway, however, because it makes pipelining between rows more
complicated to achieve, and also considering older GPUs. We did not test
this issue extensively, though, since we decided to keep the extension
anyway (maybe we should revisit that).

Anyhow, I believe that more tests with (1) lower memory usages and (2)
different GPU-oriented techniques may help clarifying this point,
though, because so far we have no experimental data showing when Lyra2
starts being faster in GPUs than in CPUs.

> The 64-byte block
> size of Blake2b is as small a read as Lyra2 can do, isn't it?  

Not really: if one uses Blake2s's internal permutation, then the block
becomes 32 bytes (if I remember well, it is a 512-bit permutation); if
one uses an AES-like permutation, it will be a 16-byte internal state;
if one prefers smaller permutations, so be it. In summary: we have not
fixed this parameter at any point: it depends on the sponge you like the
most for your scenario (it could even be Keccak if you are using an ASIC
co-processor!)

> Also, there
> are repeated accesses in a row to the same address which GPUs do well, if
> I'm not mistaken.  

> This is why I rated Yescrypt stronger at GPU defense in
> general, but this is especially true at L1-cache sized hashes.  For larger
> hashes, GPUs loose anyway.  For Scrypt, the parity point is around 4MiB.
> For Yescrypt, I suspect it's closer to 16KiB.  Where is the parity point
> for Lyra2?

That is something we need to find out. Again, different sponges will
have to be considered, as a smaller permutation may be more adequate for
scenarios with lower memory sizes.

> 
> By the way, I am a fan of Lyra2.  It's excellent work, and as I said
> before, I will not be upset at all if Lyra2 is selected as the winner.
> However, Yescrypt does gain improved defense for it's additional complexity.
> 

To be completely honest: I have nothing against any candidate and will
be glad with any choice as winner(s), whether or not Lyra2 is among them
(from a pet project to a PHC finalist that aided in the discussion of
using a "reduced round primitive" is already something to be proud of).

What I'm trying to do is exactly to answer the questions that, as is the
main idea of this second phase, may lead to Lyra2's downfall or not. No
hard feelings either way: it is just how research in the area of
security is done :)