lists.openwall.net  lists / announce owlusers owldev johnusers johndev passwdqcusers yescrypt popa3dusers / osssecurity kernelhardening musl sabotage tlsify passwords / cryptdev xvendor / Bugtraq FullDisclosure linuxkernel linuxnetdev linuxext4 linuxhardening PHC  
Open Source and information security mailing list archives
 

Date: Fri, 7 Feb 2014 12:35:09 0600 (CST)
From: Steve Thomas <steve@...tu.com>
To: discussions@...swordhashing.net
Subject: Re: [PHC] Re: NoelKDF ready for submission
> On February 6, 2014 at 11:38 PM Bill Cox <waywardgeek@...il.com> wrote:
>
> On Thu, Feb 6, 2014 at 10:48 PM, Steve Thomas <steve@...tu.com> wrote:
> > hash = PBKDF2_SHA256(password, salt, 1, hashSize)
> > ...
> > return hash ^ lastHashSizeBytesOfMem
> > Make a guess and xor "PBKDF2_SHA256(password, salt, 1, hashSize)" to the
> > hash
> > you are trying to crack. You now have the assumed last bytes of mem.
>
> Holy cow! These hashed bits are not meant to be cryptographically
> indistinguishable from random, so with a correct password guess, you
> might be able to check if it's correct just by analyzing the hashed
> memory bits.
>
> I added another PBKDF2SHA256 as a postprocess to the XORed hash.
> This should defeat such attacks, shouldn't it?
Yes this is why I said to make xorIntoHash a cryptographic hash function. Doing
it after xorIntoHash is fine. You just need to prevent leaking data about mem.
> > Let's say the hash is 256 bits that's 8, 32 bit numbers. We can reverse this
> > data modulo 4 to 7, 2 bit numbers from "mem[fromAddr+j]":
> >
> > memHash[i+1] % 4 = (memHash[i] * (mem[prevAddr+blockSize8+i]  3) +
> > mem[fromAddr+blockSize8+i]) % 4
> > memHash[i+1] % 4 = (memHash[i] * 3 + mem[fromAddr+blockSize8+i]) % 4
> > mem[fromAddr+blockSize8+i] % 4 = (memHash[i+1]  memHash[i] * 3) % 4
> >
> > mem[fromAddr+blockSize8] % 4 = (memHash[1]  memHash[0] * 3) % 4
> > mem[fromAddr+blockSize7] % 4 = (memHash[2]  memHash[1] * 3) % 4
> > mem[fromAddr+blockSize6] % 4 = (memHash[3]  memHash[2] * 3) % 4
> > mem[fromAddr+blockSize5] % 4 = (memHash[4]  memHash[3] * 3) % 4
> > mem[fromAddr+blockSize4] % 4 = (memHash[5]  memHash[4] * 3) % 4
> > mem[fromAddr+blockSize3] % 4 = (memHash[6]  memHash[5] * 3) % 4
> > mem[fromAddr+blockSize2] % 4 = (memHash[7]  memHash[6] * 3) % 4
> >
> > mem[fromAddr2+blockSize8] % 4 = (mem[fromAddr+blockSize7] 
> > mem[fromAddr+blockSize8] * 3) % 4
> > mem[fromAddr2+blockSize7] % 4 = (mem[fromAddr+blockSize6] 
> > mem[fromAddr+blockSize7] * 3) % 4
> > mem[fromAddr2+blockSize6] % 4 = (mem[fromAddr+blockSize5] 
> > mem[fromAddr+blockSize6] * 3) % 4
> > mem[fromAddr2+blockSize5] % 4 = (mem[fromAddr+blockSize4] 
> > mem[fromAddr+blockSize5] * 3) % 4
> > mem[fromAddr2+blockSize4] % 4 = (mem[fromAddr+blockSize3] 
> > mem[fromAddr+blockSize4] * 3) % 4
> > mem[fromAddr2+blockSize3] % 4 = (mem[fromAddr+blockSize2] 
> > mem[fromAddr+blockSize3] * 3) % 4
> > ...
> >
> > Now do the first half in modulo 4. 1/16 memory plus the extra 31% to 25%
> > memory
> > reduction and this removes the multiplication. You can do this in AVX2 and
> > get
> > 128 different password guesses running at a time.
> >
> > Vectorized version of modulo 4 of value = value*3 + mem[fromAddr + i]:
> > tmp = ((value & 0x5555...) << 1) ^ value
> > value = (tmp ^ mem[fromAddr + i]) ^ ((tmp & mem[fromAddr + i] & 0x5555...)
> > << 1)
> >
> > Now that we have the first half we just check if we can find this in the
> > first
> > half:
> > mem[fromAddr7+blockSize8] % 4
> >
> > mem[fromAddr6+blockSize8] % 4
> > mem[fromAddr6+blockSize7] % 4
> >
> > mem[fromAddr5+blockSize8] % 4
> > mem[fromAddr5+blockSize7] % 4
> > mem[fromAddr5+blockSize6] % 4
>
> Wow. So, modulo 4, the multiply goes away and you developed an attack
> from there that can be highly parallel and memory efficient. Nice.
>
> With an additional application of H at the end of the loop to the hash
> value, this attack becomes unfeasible, doesn't it?
Yes. I was going to also suggest doing the following, but I don't see how doing
the first half modulo 4 (or some other) would benefit the attacker now.
value = ROTATE_LEFT(value*(mem[prevAddr + i]  3) + mem[fromAddr + i], 1);
> Is there any
> reason to apply H to more of the data than the last hashlen bits of
> each thread's memory, or is a postprocess application of H to the
> existing XORhash just as good? This is what my revised version
> currently does, but if there's a reason to change this, I will.
I originally said to do more than just hashlen bits, but I don't know if that's
really necessary.
Content of type "text/html" skipped
Powered by blists  more mailing lists