phc-discussions - Re: [PHC] A review per day

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5417155B.3040409@ciphershed.org>
Date: Mon, 15 Sep 2014 12:35:39 -0400
From: Bill Cox <waywardgeek@...hershed.org>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] A review per day - Schvrch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/02/2014 02:49 AM, Poul-Henning Kamp wrote:
> -------- In message
> <CAG+Gt9ZwKnCamXQEPP2iFSmN1HO6nBC8wqdg8QHQsS_o6BxLkQ@...l.gmail.com>
>
> 
, Rade Vuckovac writes:
> 
> I think Bill missed the point about Schvrch *big* time.
> 
> Cryptography is at its foundation about finding ways to mix up
> bits, to make them unrecognizable without loosing their entropy.
> 
> Once you dive into it, the actual palette of tools at our disposal 
> is much smaller than most people realize, and many of those tools 
> are even specific variants of more general tools from the same 
> palette.
> 
> Schvrch adds an entirely new tool to the palette -- no mean feat.
> 
> I have no idea how Bill could overlook this, but my guess is that 
> the compactness of what was proposed and the lack of orthodox 
> encryption primitives deceived him into not paying proper
> attention.
> 
> He should.
> 
> Schvrch was one of the submissions which made PHC worth the effort
> for me.
> 
> Schvrch's mathematical pedigree lists both Von Neumann[1] and 
> Wolfram, both were fascinated and frustrated by the seemingly 
> unlimited complexity arising out of trivially simple rules.
> 
> That of course is no guarantee of cryptographic utility.  Only
> time and analysis will tell if Schvrch's new tool is any good.
> 
> But Bills dismissal of mathematics which frustrated both Von
> Neumann and Wolfram as "just XORs states together" and concluding
> that "not much effort went into it" is not even wrong.
> 
> Poul-Henning
> 
> [1] The paper inexplicably fails to credit him.
> 

I finally got around to reading the paper behind Schvrch's hash
function.  The idea is cool, and I have to give the author credit for
thinking of using Wolfram's CA for hashing.  However, I prefer ARX
hashing, and ARX hashing with multiplications.  These primitives
(especially multiply) are harder to speed up in an ASIC.

Here's the right way to do Wolfram's Rule 30 CA in a way that may be
useful for hashing:

static uint128_t updateState(uint128_t a) {
    uint128_t lsb = a & 1;
    uint128_t msb = a >> 127;
    return ((a << 1) | msb) ^ (a | (a >> 1) | (lsb << 127));
}

This updates 128 cells in parallel.  We probably want this to be
extended to 256 bits, or 512, but this is the function I am testing.
I verified that it generates exactly the same pattern as the paper
shows for the first 22 states.  Note that it is a reversible
permutation, and would have to be combined with some sort of data
compression to be used for cryptographically irreversible hashing.

When I repeatedly call updateState 97 times, and then output the 16
byte state, it passes the dieharder tests, which of course means
nothing other than it is not obviously non-random, but at least I was
not able to rule it out as a candidate with simple tests for randomness.

This is very similar to ARX hashing.  It has a left and right
rotation, and an XOR, but an AND instead of ADD.  This nonlinear
operation is enough to thwart my attempts to take large shortcuts at a
boolean logic level, but not as easily as with adders.  I think I can
speed up this hashing in hardware by computing factors needed for 3
states ahead and XORing them together.  Enough factors cancel out that
I think I can speed it up by about 50%.  However, in reality, I would
not bother, since I can pipeline the heck out of this and compute
hashing for multiple passwords in parallel.  This thing will really
scream in silicon.  It is the most ASIC friendly hash primitive I've
seen, which caused me to think of Keccak.

I compared this to Keccak, and I see that Keccak already uses
Wolfram's CA!  From Wikipedia:

"χ
    Bitwise combine along rows, using a ← a ⊕ (¬b & c). To be precise,
a[i][j][k] ← a[i][j][k] ⊕ ¬a[i][j+1][k] & a[i][j+2][k]. This is the
only non-linear operation in SHA-3."

For a fixed k, this is just one of Wolframs Rule's, but it is
essentially the same as Rule 30.  Here's the LUT for Wolfram's Rule 30:

ABC Q
- -----
000 0
001 1
010 1
011 1
100 1
101 0
110 0
111 0

Keccak uses this:

ABC Q
- -----
000 0
001 1
010 0
011 0
100 1
101 0
110 1
111 1

This is just Wolfram's Rule 210.  This is trivially equivalent to Rule
30, since all we have to do is negate the output, and the 2nd input to
get the same exact truth table.

So, I think this is not a completely new idea.  No wonder Keccak won
the SHA3 competition!  It's incredibly ASIC friendly!  That's bad for
our purposes.

Reed-Muler cannonical forms show that AND-XOR is capable of computing
any Boolean function.  The ROTATE is needed to for mixing between
bits.  This is minimal in the sense that we can't drop any of the
three operations and still be capable of computing arbitrary multi-bit
functions.

Schvrch also tries to introduce if-then-else branching as a novel way
to mix data.  I think AntCrypt took this to an extreme where it might
be useful for GPU defense (if rewritten properly), but Schvrch got the
branching wrong, and it does not do what the author intended.  Except
in the stir function, which introduces an ADD operation, the Wolfram
CA can be factored out as a separate 256-bit CA, sped up using a
parallel loop like above, and applied as a post-process to the state
values, which can be computed in constant time, regardless of the
number of rounds.  This is not only much faster, but avoids any
data-based branching.

On the positive side, the code above has no branching on the data,
which otherwise might leak password data through the icache, it
executes Wolfram's CA many times faster, and can work well in SIMD units.

The place I would most likely want to use Wolfram's CA 30 is in a
rewrite of AntCrypt.  AntCrypt has a great idea, but poor
implementation.  I was thinking of replacing their function list with
randomly generated ARX combinations.  I think adding AND-ROTATE-XOR
functions would be a good idea.  I'm still not sure I like the idea of
branching based on the data, though, especially right from the start.
 Maybe a hybrid algorithm could work OK, where we don't do data
dependent branching or memory addressing for the first half.

Bill
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJUFxVYAAoJEAcQZQdOpZUZGoUP/ic48+V4CRCOyAAoGQTj4iXP
MbnyxgSVa7XvoH1208DbvIxgQb2r4EiaN12jybK6PRDQ8OXjqgEfQNbId0JmWrYM
etxgCdFVrJWC5k+johxXvpAdB2W6KCnppuerYy+oJC0jbYEELQeGfAYdr7j9WvmR
L52Jk7iLm1ppecS327C6HHi6M8JprJ1vbNunpF6ei2IziIT7GjLOJkTe5oTnf5fH
x9HdZUGmxIO0iYx9dbtiChVlFj/xhIB0gZ+E9GzPopV8hbdCIoZT4vbFUhNeWT9M
BPrT6uU2DTGyyXFa6jnA432iw2+hVO8sbAPBZCww6HLxp1/aKqdQIBdCaskh3eSP
zzRcexubHH3ETMueepnC/GEYZDn6JnwQgL5Zpno4qIGt8U8hONdR2lANMD2Iip7l
PXDFyI4nD4UMdxV+zz+9NRqgGnyh4PoPYSwoPr96PVs2Lv6hscUBwlyQcxeaQ+kh
QzqC4qYQtABQI/Jcy6VxQld49peCqlGpZWxVYQNGdW8Legh2bYpKUqjcy9lrYbLA
1PlMxS5IL4ji7Vh2emG1t6U7YNBIxghgakIBrt22OIm6e+jZ3uPpoxluliOedaqR
dJsf8rq/ke2VI6dgrdYcNbOHXXGe9bZrAbCnjXZ915bUeEt/P1aXQ6gvYl5Vft1z
ZUV7d/hY7Kyg3wsMZvte
=iVZ/
-----END PGP SIGNATURE-----