[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <715937984.107579.1390946276422.open-xchange@email.1and1.com>
Date: Tue, 28 Jan 2014 15:57:56 -0600 (CST)
From: Steve Thomas <steve@...tu.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] my pre (if) submit proposal
Before I start I should fix the formulas. I messed up m vs m/r and had t**(1/2)
vs (m/r)**(1/2). Sorry about that I had a metal lapse after a brain fart.
For m = t*r/2
mem = (3/2*(r+c)+2*r)*(m/r)**(1/2)
time = 2.5x
i values normal cost extra cost
i=0...m/r/2-1 m/r/2 +0
i=m/r/2...m/r-1 m/r/2 +m/r/2
i=m/r...3m/r/2-1 m/r/2 +m/r
i=3m/r/2-1...2m/r-1 m/r/2 +3m/r/2
For m = t*r/4
mem = (7/2*(r+c)+2*r)*(m/r)**(1/2)
time = 4.5x
i values normal cost extra cost
i=0...m/r/2-1 m/r/2 +0
i=m/r/2...m/r-1 m/r/2 +m/r/2
i=m/r...3m/r/2-1 m/r/2 +m/r
i=3m/r/2-1...2m/r-1 m/r/2 +3m/r/2
i=2m/r...5m/r/2-1 m/r/2 +2m/r
i=5m/r/2...3m/r-1 m/r/2 +5m/r/2
i=3m/r...7m/r/2-1 m/r/2 +3m/r
i=7m/r/2-1...4m/r-1 m/r/2 +7m/r/2
In general
x = | 1 t/(m/r) ≤ 1.5
| 2 otherwise
sizeOfMem = ((t/(m/r) - 1/2) * (r+c) + x * r) * (m/r)**(1/2)
timeFactor = t/(m/r) + 1/2
And because I can't keep track of c, m, r, t (I'm so c-m-r-t
<https://www.youtube.com/watch?v=tcGQpjCztgA> [close enough :)]):
x = | 1 numberOfTimesMemoryIsRead ≤ 1.5
| 2 otherwise
sizeOfMem = ((numberOfTimesMemoryIsRead - 1/2) * sizeOfState + x * sizeOfOutput)
* (blocksOfMemory)**(1/2)
timeFactor = numberOfTimesMemoryIsRead + 1/2
*** Note ***
This is for r divides m. Which you specify should not be the case in the paper,
but when r does not divide m this only adds a small amount of extra time-memory.
> On January 20, 2014 at 6:35 PM Krisztián Pintér <pinterkr@...il.com> wrote:
>
> Steve Thomas (at Monday, January 20, 2014, 11:14:16 AM):
>
> > How I got this
> >
> > Always store every t**(1/2) states except the last m/2 since those
> > are never read.
>
> i might spend some more time on this later, but for now, i don't
> understand some things. one is with this quote above. it seems that
> the notion of state and "slot" in the mem array is mixed. the last m/2
> memory locations will not be read. this is the last m/r/2 states, as a
> state stores r words (which might be even 21 for keccak[c=256] and 64
> bit).
Yeah I worded that poorly I should have said "Store every (m/r)**(1/2)th
state except states for the last m/r/2 since those are never read."
That's odd I knew this but just had a mental lapse. Pretty much replace
m with m/r. Oh also it's not "t**(1/2)" it's "(m/r)**(1/2)".
> also note that this is true only for f=-1 and m = t*r/2. which is not
> the case i intend to use in practice. for other cases, it is a little
> bit more complicated.
I only looked at f=-1 because in your paper you state it as a perfect choice. I
only skimmed enough to find that. Also f=-1 should never be used regardless if
the sponge function is reverseable or not.
> > For 2nd m/2 regenerate blocks of t**(1/2) R's when needed. This adds m/2
> > extra work.
>
> i don't get what do you mean by "m/2 extra work". m/2 states can't be
> recalculated in m/2 steps. you always need the previous state, and
> then the previous to that, etc, since you need the C part. that leads
> to, on average, (m/2)*t^0.5 / 2, if i'm not mistaken. anyway, so far
> it does not seem to be a break, we still have O(^2) combined, ^1.5 (if
> optimal parallelization is possible), which is expected in this early
> phase.
* m/r/2 not m/2 (sorry)
You have to recalculate m/r/2 blocks of data.
m/r = 64
(m/r)**(1/2) = 8
* is a saved state it has all the info needed to get the next state
- is unknown data
+ recalculated data
0 mem is zero
< reading pos
> writting pos
Starting at:
<>
*-------*-------*-------*-------00000000000000000000000000000000
recalculate data (technically not needed this time since you just
calculated this data)
<>
*-------*-------*-------*+++++++00000000000000000000000000000000
< >
*-------*-------*-------*+++++++*-------000000000000000000000000
recalculate data
< >
*-------*-------*+++++++*-------*-------000000000000000000000000
< >
*-------*-------*+++++++*-------*-------*-------0000000000000000
recalculate data
< >
*-------*+++++++*-------*-------*-------*-------0000000000000000
< >
*-------*+++++++*-------*-------*-------*-------*-------00000000
recalculate data
< >
*+++++++*-------*-------*-------*-------*-------*-------00000000
< >
*++++++-*-------*-------*-------*-------*-------*-------*0000000
< >
*+++++--*-------*-------*-------*-------*-------*-------*+000000
...
> <
*-------*-------*-------*-------*-------*-------*-------*+++++++
> > For 3rd m/2 regenerate blocks of t**(1/2) R's when needed then regenerate
> > blocks of t**(1/2) R's from the other R's. This adds m extra work.
>
> this i don't follow at all. this phase seems to be exactly the same as
> the previous. we read single written locations, are we not?
You keep the original states
! is a new saved state it has all the info needed to get the next state
Starting at:
> <
*-------*-------*-------*-------*-------*-------*-------*+++++++
> <
*-------*-------*-------*-------*-------*-------*-------*+++++++
!-------
recalculate data
> <
*-------*+++++++*-------*-------*-------*-------*-------*-------
!-------
more recalculate data
> <
*-------*+++++++*-------*-------*-------*-------*+++++++*-------
!-------
> <
*-------*-------*-------*-------*-------*-------*+++++++*-------
!-------!-------
recalculate data
> <
*-------*-------*+++++++*-------*-------*+++++++*-------*-------
!-------!-------
> <
*-------*-------*-------*-------*-------*+++++++*-------*-------
!-------!-------!-------
...
<>
*-------*-------*-------*-------*-------*-------*-------*-------
!-------!-------!-------!+++++++
* Note you would keep the xored data in this part for the next part to save on
some caclutaltions.
> > You should note that you can do t**(1/3) and this increases the extra cost
> > by 2x
> > and memory by 2x for the constant in front of (r+c).
>
> and this is something i have no hope of understanding. how do you get
> these numbers? bonus question: what is the difference between catena
> and my access pattern that would cause this low exponent, while
> catena-n approaches 2?
m/r = 64
(m/r)**(1/3) = 4
Starting at:
<>
*---------------*---------------*0000000000000000000000000000000
recalculate states (technically not needed this time since you just
calculated these states)
<>
*---------------*---*---*---*---*0000000000000000000000000000000
recalculate data (technically not needed this time since you just
calculated this data)
<>
*---------------*---*---*---*+++*0000000000000000000000000000000
< >
*---------------*---*---*---*+++*---0000000000000000000000000000
recalculate data
< >
*---------------*---*---*+++*---*---0000000000000000000000000000
< >
*---------------*---*---*+++*---*-------000000000000000000000000
recalculate data
< >
*---------------*---*+++*---*---*-------000000000000000000000000
< >
*---------------*---*+++*---*---*-----------00000000000000000000
recalculate data
< >
*---------------*+++*---*---*---*-----------00000000000000000000
recalculate states
< >
*---*---*---*---*---------------*---------------0000000000000000
recalculate data
< >
*---*---*---*+++*---------------*---------------0000000000000000
...
As you can see you are doing two passes over the missing data to regenerate
Also ln(64) ≈ 4.159 so after you hit 64^(1/4) it will cost more memory and time.
Since we are dealing with discrete amounts it might be sooner.
I plan on looking at memory cheating for Catena some time soon. I believe the
way the access pattern works you can store every (2**N)th state. For the first
pass after initialization, each step you take you'll need to do (2**N)/2 steps
on average to get the data you are looking for. On the second pass you need to
do (2**N)/2 steps which need to do (2**N)/2 steps each so ((2**N)/2)**2 steps.
Third pass it's ((2**N)/2)**3 steps and so on. Catena-3 is three passes and is
considered the minimum. I think they are/were debating between Catena-3 and
Catena-4. Hmm I see Bill has an argument for Catena-2.
Content of type "text/html" skipped
Powered by blists - more mailing lists