phc-discussions - Re: [PHC] my pre (if) submit proposal

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <43617150.1003222.1390212856205.open-xchange@email.1and1.com>
Date: Mon, 20 Jan 2014 04:14:16 -0600 (CST)
From: Steve Thomas <steve@...tu.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] my pre (if) submit proposal

> On January 20, 2014 at 2:13 AM Krisztián Pintér <pinterkr@...il.com> wrote:
>
> Steve Thomas (at Monday, January 20, 2014, 12:37:58 AM):
>
> >> https://docs.google.com/document/d/18R-qEAmL9WWh5zhGeBlvI7C6ikBAz6TF7MEtfPJK7m0
>
> > This is broken:
> > * Mem is only read once. So once read you can discard it.
>
> > * In the paper you state f=-1 is a perfect choice. Except from i=0 to t/2-1
> > you
> > are just reading zeros. So you don't even need the second half of Mem.
>
>
> this is not broken and also not true. the number of zero-reads is m/2,
> independent of t. the number of unnecessary writes is also m/2. it is
> explained in the document. it is not a break, because it does not
> affect the minimum memory needed, and omitting reads and writes does
> not matter when the algorithm is cpu-bound.
>
> i considered some tuning to address this, but discarded for its
> awkwardness and negative impact on simplicity.

Wow how did I miss that with m and t. I'm sorry that's embarrassing... ahhhh
"Mem[i*r + j]" needs to be "Mem[(i*r + j) % m]". I just assumed that the loop
didn't buffer overflow. I should of known this because of "Mem[(i*r + j) * f]".
Oh
wait you defined your syntax that "% m" was implied.


> > * With (C * (t/2) ** (1/2)) ram it will take 2 times longer. C is the size
> > of the
> > context of your sponge function. In general this is (C * size ** (1/N)) ram
> > and
> > N times more computations. Max N is ln(size). I keep hearing about "parallel
> > cores" I believe it's the same thing or at least similar.
>
> it is not a break, and i'm not sure if true.
>
> not a break, because sequential memory hardness is not claimed.
>
> i don't understand your algorithm. by context, i assume you mean
> state. if you want to calculate a missing state, you need to evaluate
> a whole bunch of other states since it relies on r other states, and
> those also rely on other states, and so on.
>
> i don't have the math, but i believe with the maximum m, the cost is
> indeed n^1.5, but if you go higher with the t/m ratio, it should rise,
> hopefully toward n^2 as limit.
>
> it is questionable whether sequential memory hard functions exist
> without indexing by secret.

State S is b bits
R is r bits and C is c = b - r bits

I'm assuming f = -1

For m = t*r/2

mem = (3/2*(r+c)+2*r)*t**(1/2)
time = 2.5x

i values         normal cost  extra cost
i=0...m/2-1:     m/2          +0
i=m/2...m-1:     m/2          +m/2
i=m...3m/2-1:    m/2          +m
i=3m/2-1...2m-1: m/2          +3m/2


For m = t*r/4

mem = (7/2*(r+c)+2*r)*t**(1/2)
time = 4.5x

i values         normal cost  extra cost
i=0...m/2-1      m/2          +0
i=m/2...m-1      m/2          +m/2
i=m...3m/2-1     m/2          +m
i=3m/2-1...2m-1  m/2          +3m/2
i=2m...5m/2-1    m/2          +2m
i=5m/2...3m-1    m/2          +5m/2
i=3m...7m/2-1    m/2          +3m
i=7m/2-1...4m-1  m/2          +7m/2


How I got this

Always store every t**(1/2) states except the last m/2 since those
are never read. For m=t*r/2 that is 3/2*t**(1/2) states (S). For
regeneration you need 2*r*t**(1/2) squeezed outs (R).

For 1st m/2 use 0. This adds no extra work.

For 2nd m/2 regenerate blocks of t**(1/2) R's when needed. This adds m/2
extra work.

For 3rd m/2 regenerate blocks of t**(1/2) R's when needed then regenerate
blocks of t**(1/2) R's from the other R's. This adds m extra work.

For 4th m/2 regenerate blocks of t**(1/2) R's when needed then regenerate
blocks of t**(1/2) R's from the other R's regenerate blocks of t**(1/2) R's from
that xor-ing to the first block of R's. This adds 3m/2 extra work.

For 5th m/2 4x regenerate blocks of t**(1/2) R's. This adds 2m extra work.

For 6th m/2 5x regenerate blocks of t**(1/2) R's. This adds 5m/2 extra work.
...

You should note that you can do t**(1/3) and this increases the extra cost by 2x
and memory by 2x for the constant in front of (r+c).

m = t*r/2:
mem = (3*(r+c)+2*r)*t**(1/3)
time = 4x

m = t*r/4:
mem = (7*(r+c)+2*r)*t**(1/3)
time = 8x

-------

You should figure out a good value for f. I know it is not -1. My guess is the
next
smallest negative number that is GCD(m, f (mod m)) = 1.
Content of type "text/html" skipped