phc-discussions - Security of these two hash functions?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOLP8p5F_w=4Z-V2Dn7gdby8AS-QkX4nDhHUjg-ZQ-j9TWhXCw@mail.gmail.com>
Date: Fri, 21 Feb 2014 17:32:41 -0500
From: Bill Cox <waywardgeek@...il.com>
To: discussions@...sword-hashing.net
Subject: Security of these two hash functions?

TigerKDF, as I'm calling the successor to NoelKDF, uses two hash
functions.  The first is a multiplication intensive compute-hardened
loop that is hard to speed up even in custom hardware.  It hashes 256
bits of state in a long permutation, with applications of Blake2s
after every "block", which is typically 4000 iterations long:

// Do low memory-bandwidth multiplication hashing.
static void multHash(uint32_t hash[8], uint32_t numblocks, uint32_t repetitions,
        uint32_t *multHashes, uint32_t multipliesPerBlock, uint32_t
parallelism) {
    uint32_t state[8];
    hashWithSalt(state, hash, parallelism);
    for(uint32_t i = 0; i < numblocks*2; i++) {
        for(uint32_t j = 0; j < multipliesPerBlock * repetitions; j++) {
            // This is reversible, and will not lose entropy
            state[j&7] = (state[j&7]*(state[(j+1)&7] | 1)) ^
(state[(j+2)&7] >> 1);
        }
        // Apply a crypt-strength hash to the state and broadcast the result
        hashStateWithSalt(state, i);
        memcpy(multHashes + 8*i, state, 32);
    }
}

May main question for this hash function is whether an attacker might
be able to avoid doing every serial multiplication, and thus speed it
up.  With a multiply, XOR, and right shift, I think the answer is no.

There's no danger of entropy loss between Blake2s hashes, since it's a
permutation.  It does not even have to be a very good hash function,
so long as there are no short-cuts for computing it.  However, I did
some basic testing on it, and it seems to generate reasonable results.
 Other than states where all 8 values start as either 0 or 1, I have
not found any states that result in a short loop, where the states
repeat every after a short number of iterations.  I have checked many
thousands of initial random starting points for loops less than 4
million long, and hundreds of random starting points for loops less
than 2^28 long.  I have not found a loop yet.  If it does get stuck in
a short loop, the blake2s hash will kick it out the next block, so all
I need is for such short loops to be somewhat rare, like < 1% they
should be longer than the iterations per block.

The second hash function simply fills memory as rapidly as possible:

static inline void hashBlocks(uint32_t state[8], uint32_t *mem,
uint32_t blocklen, uint32_t subBlocklen,
        uint64_t fromAddr, uint64_t toAddr) {
    uint64_t prevAddr = toAddr - blocklen;
    uint32_t numSubBlocks = blocklen/subBlocklen;
    uint32_t mask = numSubBlocks - 1;
    uint32_t *f = mem + fromAddr;
    uint32_t *t = mem + toAddr;
    for(uint32_t i = 0; i < numSubBlocks; i++) {
        uint32_t *p = mem + prevAddr + subBlocklen*(*f & mask);
        for(uint32_t j = 0; j < subBlocklen/8; j++) {
            for(uint32_t k = 0; k < 8; k++) {
                state[k] = (state[k] + *p++) ^ *f++;
                state[k] = (state[k] >> 24) | (state[k] << 8);
                *t++ = state[k];
            }
        }
    }
}

Each iteration, the state is added to a pseudo-random 256-bit
sub-block somewhere in the "prev" memory block, XOR-ed with the next
256-bits in the "from" block, and rotated-right by 8.  Between block
hashes, it's passed through Blake2s, just like in the multiplication
hash.

When I take out the Blake2 hashing, and the data written to memory
passes the dieharder tests, and also results in the expected number of
32-bit collisions in the last states written to memory between many
runs.  With the Blake2s hash thrown in, I think it should be fine.
Attackers may be able to detect that the data written to memory was
generated by TigerKDF, but they still have to generate all that data
to compute the hash.

Can anyone see any weaknesses?

Bill