[<prev] [next>] [day] [month] [year] [list]
Message-ID: <54145B11.40308@ciphershed.org>
Date: Sat, 13 Sep 2014 10:56:17 -0400
From: Bill Cox <waywardgeek@...hershed.org>
To: discussions@...sword-hashing.net
Subject: A review per day - Centrifuge
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Centrifuge is not an entry I would choose to have in the second round.
First, it is too slow to replace Scrypt, and is specific to
authentication servers in that it depends on AES-NI instructions for
reasonable performance.
There is currently a bug that need to be fixed before it can be
considered secure, but given it's other limitations, it would still
not be competitive.
Here are the benchmarks.
4MiB benchmark:
PHC> time phs-centrifuge 0 17
4194304 bytes
ac 3a 6e 73 8f 76 3d 49
53 8b 04 2c 94 23 10 df
68 6c 4d 0c 77 df 4b 09
a7 b9 dc db 61 ee 77 4c 32 (octets)
real 0m0.116s
user 0m0.104s
sys 0m0.012s
Anything above 10ms is not speed competitive with Scrypt.
1GiB benchmark:
PHC> time phs-centrifuge 0 25
1073741824 bytes
b1 40 f1 18 de fd fe 76
b9 45 76 e5 cc 2e c3 bf
6c c1 07 32 cd 6c a0 83
ac 29 db 50 3d cd d4 f0 32 (octets)
real 0m28.685s
user 0m28.629s
sys 0m0.064s
28 seconds is so slow that it is impractical for password authentication.
Issues
- ------
There is a bug on line 175:
out[j] = (uint8_t)(S[out[j]] + ptr[j]);
Surely the author meant:
out[j] ^= (uint8_t)(S[out[j]] + ptr[j]);
Without this fix, the output and can be computed efficiently with
constant memory.
Centrifuge is only usable at memory sizes that fit into cache, such as
the 4MiB benchmark above. There is no strategy to make efficient use
of external DRAM.
Centrifuge shares Scrypts TMTO problem, with it's 4X reduction in
time*memory defense. Centrifuge fills memory linearly, and then does
only *predictable* reads,
meaning that a TMTO can make massive use of parallelism for
recomputations. Therefore, Centrifuge is *not* memory hard.
Centrifuge uses t_cost to slow down memory initialization. Since all
memory is filled with AES-cfb, there is no absolutey no need for
mixing the S-box while filling
memory. A simple counter feeding into parallel callse to AES will do,
and run considerably faster than AES-cfb. Even if for some reason we
want to fill
memory with only cryptographically pseudo-random data, why use block
chaining? The AES-NI instruction unit can do 6 computations in
parallel, but
block-chaining forces it to do every computation serially. The code
could be simplified, speed up a lot, and made more secure at the same
time, because average memory use increases when filling memory is
fast. This is simply a poor memory initialization algorithm.
Just like in TwoCats, the t_cost implementation is broken. Memory
bandwidth starts low because the algorithm is so slow, but it goes to
zero with increasing p_time. This could allow an attacker to use an
SSD drive efficiently with high enough t_cost, dramtically lowering
the cost of attacks.
Centrifuge is not timing resistant from the start. A simple cache
timing attack can be used at the very beginning, causing both time and
memory to go away for an attacker. In contrast, Scrypt's first loop
is cache-timing resistant, and only the second loop is unpredictable.
For some reason the output is hashed with AES once for every byte of
memory used. Unless t_cost is very high, this will dominate runtime.
However, by increasing t_cost, an attacker enjoys reduced memory
bandwidth, which is already way too low.
Because dominates computation time, GPUs will do well against
Centrifuge. They may not have dedicated AES hardware, but 4096
guesses in parallel more than make up for that. Use of only
sequential AES is also good for GPU attacks, since the defender cannot
make use of his 6-deep AES pipeline.
t_cost is broken just like in TwoCats. As t_cost increases, memory
bandwidth decreases, eventually allowing for SSD storage at a much
cheaper cost to an attacker. However, it does increase CPU work, so
t_cost should be set high enough that it just begins to slow down
hashing, and no more, to maximize compute-time hardness.
S update is highly parallizable, contrary to a statement in the paper.
The output has slight entropy loss. The input to AES is a sequence of
outlen bytes, all different. The output from AES is a new sequence
that is cryptographically scrambled. However, each byte copied to the
output buffer is different. For a 256 bit output, about 3 bits worth
of possibilities are lost, making it a 253 bit strength. This is no
big deal, but at a 2048 bit output, over half of the bit strength is
lost. Again probably not a real issue.
Summary
- -------
Centrifuge is one of those algorithms that pack weaknesses densly into
very little code. If we're looking for only the strongest half to
make it into the next round, dropping Centrifuge is an easy, call, IMO.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAEBAgAGBQJUFFsNAAoJEAcQZQdOpZUZqtUP/RqQvsxYH6QZb2eBnpzSeu7s
aUdalGxtnkcStAjoaRuD3oWWW8w5W1UHoNUNDC0NEWd0I71BK8RTdMkKDC2JiK1C
yAxU2cat2j84wV4HzPyFXuXoVu/py3KPZnJrE6KDgQjQWFNhWZ8WFTymPdU/HOjo
SknBruwOYU8pPqO630gpBIPBpgkIrwl8HSEdpu54EZxkIxjoPs8IRK7TNoL6h9Ek
iXJRI9HlPVUwmpBE9VW0y8HbJI9cQbPbB5vNNlGgO4ldNoa8Es04b5Gk6LQ6+3ty
i+GFjWo6GltJVV4oI6ePiDAKusAXjrlmHgDZd5of9W2FQVq3EtaWYD+ajfFYH3mp
AS3QT6OVDkjdtbnYMZDY4147UjLHE+XV6eAMwjMrvfjO2jCmikX5/Msy3jCXv64x
J60Gz/aLt0CDNRFpmt8K3DZFbdG7WJ3hBd/D1Uykze+boyYVwjuRkqz89nQ0F0tP
XcSIeQhvVdkH6mLKheNZ/aNgf1sFkqkzqjv81qJ2449dD6GXvgEKN8xu9CXUV72D
hhfmCEQfZdTZJTTA2jTQAjRUkuVRSE3HWd60xutBLCzlb+1u2pjBvsNOiPWuldKF
iBxfc7xo5YU6OEaettKhA3FTm6tlcQ08sdmXztzaMM1JNNZ9Q38R2VDG3+Anq5bh
4ZD7lR2U7CNlmoYf/FBe
=tcj0
-----END PGP SIGNATURE-----
Powered by blists - more mailing lists