lists.openwall.net lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC Open Source and information security mailing list archives
Date: Mon, 30 Mar 2015 17:05:52 +0200
From: Dmitry Khovratovich <khovratovich@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Re: [PHC] Tradeoff cryptanalysis of Catena, Lyra2, and generic
memory-hard functions

HI Alexander,

I doubt that you really need to store <depth> of indices. Of course,
this depends on an implementation of TMTO, but in a naive one that I
have in mind needs only one stored index per missing block. Suppose
that you store M[0],M[1], M[3],M[4], but M[2] and M[5] are missing
needed to recompute M[5], i.e. \phi(5) = 2 (in this example). You look
up for M[2], it is missing too, so you go for M[0] and retrieve it.

So how many blocks that many indices (I ignore the possibility to
retrieve some of the indices from stored blocks).

Thus I do not understand why the block size is different before and

Dmitry

On Mon, Mar 30, 2015 at 4:16 PM, Solar Designer <solar@...nwall.com> wrote:
> Hi Dmitry,
>
> Thank you for commenting on this so promptly.
>
> On Mon, Mar 30, 2015 at 02:58:45PM +0200, Dmitry Khovratovich wrote:
>> you're certainly right, the stored indices must be taken into account
>> when counting memory for tradeoffs. For the default parameters of
>> Catena and Lyra2 that we considered, these indices occupy less than
>> 1/1000 of total memory, and we just (explicitly or implicitly) ignored
>> this small addend in the computation.
>
> OK, this feels realistic, and I hope your estimate is correct.
>
>> The same happened to other
>> "service" information that we use in the attacks (such as storing the
>> access complexities, which accounted to 1/6000 of memory, cf. page 9
>> of the report).
>
> Thank you for mentioning this detail in the report.
>
>> Indeed, smaller blocks would increase the proportion of indices in the
>> memory. To have them occupy, say, 1/10 of memory, let us consider 1 GB
>> memory size. 100 MB of 3-byte indices would imply 2^{25} blocks, i.e.
>
> (Ignoring the 24 vs. 25 minor "rounding" discrepancy for the purpose of
> this discussion.)
>
> This would imply we could refer to a total of 2^{25} blocks and other
> data structures (we'd need such indirect references for depth > 1
> attacks), including possible duplicates (same indices), in whatever data
> structure we'd use to mount the TMTO attack.
>
> That's not the total number of blocks there is.
>
> A question is then what TMTO ratio we could achieve while not needing
> more than 2^{25} indices.
>
>> 32-byte blocks.
>
> I think not, for the reasons above.
>
> We wouldn't be storing each block's index exactly once.  I guess we'd
> need to store roughly depth (e.g., 26.6 in the example I referred to
> before) indices per (pre-TMTO or post-TMTO?) block (including many
> duplicate index values, but this doesn't help us much if at all).
>
> Suppose we have an original memory size of 1 GB, a post-TMTO
> without-indices size of 100 MB (the 1/10 ratio corresponding to depth
> 26.6), and we spend another 100 MB on indices (bringing the actual TMTO
> ratio to only 1/5, so a significantly worse attack than we had estimated
> without considering the indices).  What block size would achieve this?
> We can fit 100 MB / (26.6*3) = ~1.25 million sets of 26.6 indices in our
> 100 MB allotted to them.  If the 26.6*3 = ~80 bytes are needed per
> pre-TMTO block, then a block size of 1 GB / 1.25 million = ~800 bytes
> achieves this.
>
> I made many guesses and assumptions here.  These might be about right or
> they might be way off.  I think this needs to be researched for real.
>
>> 32-byte blocks  would increase latency dramatically (my estimate would
>> be 3-5x at least), and make the scheme much slower and thus weaker.
>
> I agree there's a tradeoff here.
>
> My point is that for some of the larger TMTO ratios in your report, we
> might not have to go to a block size as low as this in order to
> significantly affect the effective TMTO ratio.
>
>> On the architectures where the memory latency is only  a few CPU
>> cycles (AVR?), smaller blocks can be useful.
>
> Right.  This would apply in cases where TMTO attacks would otherwise be
> relevant to most frequently accessed portions of data kept in a device's
> local memory even in an overall large-memory scheme (using external
> memory for most of its data), to fit more instances per device.
>
> Thanks again,
>
> Alexander

--
Best regards,
Dmitry Khovratovich