Message-ID: <540673E7.8010109@ciphershed.org>
Date: Tue, 02 Sep 2014 21:50:31 -0400
From: Bill Cox <waywardgeek@...hershed.org>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] A review per day - Parallela
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
The basic idea of Parallela: use your GPU for defensive purposes.
My notes were not very accurate. First, this is primarily an
algorithm designed to make defensive use of your GPU, and maybe
FPGAs. It runs a very large number of parallel SHA512 hashes
(3*5*128*loopCount), and then does as many of these rounds in series
as you like. I am a fan of defensive use of GPUs. Against an
attacker's GPU farm, it puts you roughly on par with them.
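To make the shape of that computation concrete, here is a rough sketch in C. It is not the Parallela code: the stand-in hash is 64-bit FNV-1a rather than SHA512, and the names toy_hash, parallel_hashes and sketch_rounds, as well as the XOR folding of lane outputs, are my own assumptions for illustration. The only figure taken from the review is the 3*5*128*loopCount count of independent hashes per round.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for SHA512: 64-bit FNV-1a. NOT cryptographic, illustration only. */
static uint64_t toy_hash(const void *data, size_t len, uint64_t seed)
{
    const uint8_t *p = (const uint8_t *)data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    while (len--) {
        h ^= *p++;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Number of independent hashes per round, as stated in the review. */
static uint64_t parallel_hashes(uint64_t loop_count)
{
    return 3ULL * 5ULL * 128ULL * loop_count;
}

/* Hypothetical structure: each round runs many lanes that depend only on
 * the round's input state and the lane index, so the inner loop could be
 * spread across GPU threads. Rounds are chained, so they must run in
 * series. Folding lanes with XOR is an assumption made for this sketch. */
static uint64_t sketch_rounds(uint64_t state, uint64_t loop_count,
                              uint64_t serial_rounds)
{
    for (uint64_t r = 0; r < serial_rounds; r++) {
        uint64_t lanes = parallel_hashes(loop_count);
        uint64_t folded = 0;
        for (uint64_t i = 0; i < lanes; i++)
            folded ^= toy_hash(&i, sizeof i, state); /* independent lanes */
        state = toy_hash(&folded, sizeof folded, state); /* serial step */
    }
    return state;
}
```

With loopCount = 1 this gives 1920 independent hashes per round, which is the part a GPU can run at once; the number of rounds is the part no amount of parallel hardware shortens.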
The code is bigger than I'd thought, but mostly due to portability
issues. It includes hundreds of lines of stdint.h and inttypes.h to
port between GNU and Visual C++. On the positive side, it looks like
a lot of porting work has already been done. On the downside, for a
primarily GPU algorithm I'd really like to see some CUDA code (or
OpenCL).
The main function is in a file only 150 lines long, and it was easy to
read. Here's a simple dump of my notes:
- - Minor gripe: why not use m_cost for the serial parameter? Encoding
two time values in t_cost is a pain.
- - Deals with practical details, like zeroing out before returning an
error code. This might help inform a user that there is something
that needs attention.
- - Feeds salt into SHA512 first rather than the password. Someone was saying
that's preferred, but I don't know why, when our crypto-hash
primitives are solid little mega-scrambling machines.
- - line 145: "// TODO: find a secure wipe function" - I use blake2's:
    /* prevents compiler optimizing out memset() */
    static inline void secure_zero_memory( void *v, size_t n )
    {
      volatile uint8_t *p = ( volatile uint8_t * )v;
      while( n-- ) *p++ = 0;
    }
- - No threads or OpenMP support, because it's designed for GPUs.
Where's the CUDA? No point benchmarking yet.
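Two of the notes above (zeroing out before returning an error code, and the missing secure wipe) fit together naturally. A minimal sketch, assuming a wrapper of my own invention: finish_hash and its parameters are hypothetical and do not appear in the Parallela source. Only the volatile-pointer wipe follows the blake2 helper quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Volatile writes keep the compiler from optimizing the wipe away
 * (same approach as the blake2 helper quoted in the notes). */
static void secure_zero_memory(void *v, size_t n)
{
    volatile uint8_t *p = (volatile uint8_t *)v;
    while (n--) *p++ = 0;
}

/* Hypothetical cleanup step, not taken from Parallela: scratch state is
 * always wiped, and the output buffer is wiped too when the computation
 * failed, so a caller that ignores the return code sees zeros rather
 * than a partial hash. */
static int finish_hash(uint8_t *out, size_t outlen,
                       uint8_t *scratch, size_t scratchlen, int failed)
{
    secure_zero_memory(scratch, scratchlen);
    if (failed) {
        secure_zero_memory(out, outlen);
        return -1;
    }
    return 0;
}
```

An all-zero output is easy to spot in logs or a database, which is the "something needs attention" signal mentioned in the notes.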
I take issue with one statement in the paper:
"The attacker-defender ratio is 1. Any advancements in cracking are
advancements for the defender. If ASICs come out that can crack this
hash they more than likely can be used by the defender."
I doubt the NSA is going to sell you their ASICs! This is an OK
defense for a system with a GPU, where we worry that SHA512 hashing by
the attacker is also done with GPUs. However, we have SHA256 Bitcoin
ASICs that blow away GPUs. SHA512 hashing is just twice the
transistors per core. ASICs will rule here, and we still can't buy
defensive SHA256 ASICs. There's probably some secret law about that.
There is also a paralleKDF function that is capable of outputting
variable length data. I am not sure that there need to be two
functions here. It seems like they could be combined into one.
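As a sketch of what "combined into one" could look like: a single variable-length derive routine, with the fixed-length hash as a thin wrapper around it. Everything here is hypothetical. The names derive, hash_fixed and toy_prf are mine, the counter-mode expansion is an assumption rather than how Parallela produces long output, and toy_prf is 64-bit FNV-1a standing in for SHA512.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LEN 8 /* fixed output size for the wrapper, chosen arbitrarily */

/* Stand-in PRF: FNV-1a over the counter bytes, seeded by the key.
 * NOT cryptographic, illustration only. */
static uint64_t toy_prf(uint64_t key, uint64_t counter)
{
    const uint8_t *p = (const uint8_t *)&counter;
    uint64_t h = 14695981039346656037ULL ^ key;
    for (size_t i = 0; i < sizeof counter; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Variable-length output: emit PRF(key, 0), PRF(key, 1), ... and
 * truncate the last block to fit. */
static void derive(uint64_t key, uint8_t *out, size_t outlen)
{
    uint64_t counter = 0;
    while (outlen > 0) {
        uint64_t block = toy_prf(key, counter++);
        size_t n = outlen < sizeof block ? outlen : sizeof block;
        memcpy(out, &block, n);
        out += n;
        outlen -= n;
    }
}

/* The fixed-length hash is then just derive() with a fixed length. */
static void hash_fixed(uint64_t key, uint8_t out[HASH_LEN])
{
    derive(key, out, HASH_LEN);
}
```

With this layout there is one code path to review and test, and the fixed-length output is simply the first HASH_LEN bytes of the variable-length one.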
That's pretty much all I found to talk about in Parallela. The idea
is a good one. I'm not sure I can really do a lot more. There's
really no point in running Dieharder checks on SHA512 output, and
without CUDA code and a GPU, I can't really benchmark it, SFAIK.
Bill
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAEBAgAGBQJUBnPjAAoJEAcQZQdOpZUZZK8P/ioozmLWMkfGeReqVufwBGK9
6EyqH+OUR0Z7j/OCeYSKHsb5c7tIgY4divOd0OCcbzci3AigzHURL5DSJO7GTtXm
1qI0KycLmTqAMjCANnAhiMhhq8KJFy2MdedNc2Z2QqEIIAeyso14o0Hu9J3QoeAB
c6oK0GM/t9gwYPagq1g/LXghaULL84xX5rgfZpjr+F9K6pXC09+2ODpCZO2hk5uo
NSIBy+mUFkkPM0Pa9djE7/q957jUEmvIFDaTLvWaZN4c502pQOrFGYkAGCy/2rMx
mD0cTzH2HY3n8aDiT7mxZojIQF0Vk9HlSSXTxLxnk3PN8JG+vfkWMa6Wn42WlP8t
VVjHXP9D90cxCEuwnWZFeb9R3lwOg31cKNMpWJOEAbL4Sb7N/P86mou67Y7qyPc6
mo4VQSGOhL/pdbDWTSZz/kxV/pKgqQO3ZOaOKWnoG7H7ya1sQcjzZQU1XwDwlIVH
JZgxy8wkBCmlBI5TFvl9v8NErr5k3fKCAjApUd6s8NdTh2DV61EroXV7uC1ihXa/
icvBBJ7lmrF4hiTjuVCMuIVTenMZF4EqjOtW6kEJy2yG0nfxNXkR8jw81sDf+iTn
x6MDH+A7mswXnnG5fgA+w5I9Dre52fYoAOhB+xxRbTLcu59QKCEB+PPANGVCQjED
E9MvpEKf7A2FYLXA3fyt
=WJqv
-----END PGP SIGNATURE-----