[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOLP8p6HALKLv=ck5ekmx4Xgbg8m1sppYQvaxYqM5Ku2Hv4kHQ@mail.gmail.com>
Date: Mon, 20 Jan 2014 08:38:50 -0500
From: Bill Cox <waywardgeek@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Native server relief support for password hashing in browsers
On Sun, Jan 19, 2014 at 9:41 PM, Steve Thomas <steve@...tu.com> wrote:
>> On January 19, 2014 at 6:47 AM Bill Cox <waywardgeek@...il.com> wrote:
>> I would hate to see entries like Blakerypt bow out because of Catena.
>> If I drop out, it will be because of excellent real-world work like
>> escrypt. Catena has some excellent theoretical work, but until I see
>> an implementation that begins to approach the security of Scrypt, I
>> don't see it as a real contender.
>
> So I guess you've seen this ( https://github.com/cforler/catena) and
> benchmarked both of them. What were your results? I've been meaning
> to do this for a while. Along with others.
Thanks for the link! I don't know why I had not seen it before. I
just ran catena-blake2b-opt-test on my development machine, and I got
this:
runtime: 0.296 seconds
garlic: 18
lambda: 3
memory: 64*2^18 = 16MB
memory access: 5 reads, 4 writes: 9*16MB = 144MB
bandwidth: 487 MB/s
I could have some of these numbers wrong... blake2 runs a bit faster
than SHA512, so I only benchmarked blake2b.
For comparison, the default build of scrypt on the same machine
(running 64-bit arch linux) fills 500MB in 1.8 seconds, doing 1 read
and 1 write. If fills 5X more memory in the same time, though it's
memory bandwidth is only 16% higher.
The main difference between the two seems to be that Catena does 9
read/write per memory location, while Scrypt does 2.
>> The Catena paper blew me away and I stole some of their ideas
>> wholesale. However, it seems like an academic solution at this stage.
>> It has nothing to support modern memory architectures, no useful
>> parallelism, and no high performance hashing. Of course, all of that
>> can be added, but until I see a real-world solution, it remains
>> theoretical work in my mind.
>
> Catena uses Blake2b which has parallelism in it. SHA512 was added
> because there was a PHP implementation that used SHA512. SHA512 is
> built into PHP and will be faster than Blake2b written in PHP.
The implicit parallelism in Blake2b and other hashing algorithms is a
bad thing for memory-hard KDFs, IMO. Any parallelism that our CPUs
can't take advantage of is an advantage to the attacker. By "useful
parallelism", I mean a way to spread a hashing algorithm across many
CPUs so they can work in parallel. Ideally, we match the parallelism
to the machine doing the hashing, whether it's a single-core ARM
processor, or a 4096-core GPU. It's nice to have that built into the
hashing algorithm, IMO, though it's a minor issue. You can always
wrap the hashing algorithm with an outer loop to get the parallelism,
but then you give an attacker a free time-memory tradeoff.
IMO, this implementation of Catena provides questionable security
compared to Scrypt, while escrypt seems to take security to a new
level. The anti-timing attack is super-cool, but is it worth dropping
memory usage 5X? It seems to me that there are 3 issues holding
Catena's performance back, but all of them could be fixed:
- Cantena-3 only uses 1/4 of the memory it computes. Why not have a
final Catena row that hashes data from all 4 previous rows?
- small hashing size (64 bytes) means cache misses will dominate for
memory >> cache size. Why not hash 1024 bytes at a time like Scrypt?
- Even blake2b is too slow for this competition, though I seem to
remain in the minority in feeling this way. What's wrong with r[i] =
r[i-1]*(prevRow[reverse(i)] | 1) + 12345?
Bill
Powered by blists - more mailing lists