lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Date: Sat, 13 Sep 2014 10:56:17 -0400
From: Bill Cox <>
Subject: A review per day - Centrifuge

Hash: SHA1

Centrifuge is not an entry I would choose to have in the second round.

First, it is too slow to replace Scrypt, and is specific to
authentication servers in that it depends on AES-NI instructions for
reasonable performance.
There is currently a bug that need to be fixed before it can be
considered secure, but given it's other limitations, it would still
not be competitive.

Here are the benchmarks.

4MiB benchmark:

PHC> time phs-centrifuge 0 17
4194304 bytes
ac 3a 6e 73 8f 76 3d 49
53 8b 04 2c 94 23 10 df
68 6c 4d 0c 77 df 4b 09
a7 b9 dc db 61 ee 77 4c      32 (octets)

real    0m0.116s
user    0m0.104s
sys 0m0.012s

Anything above 10ms is not speed competitive with Scrypt.

1GiB benchmark:

PHC> time phs-centrifuge 0 25
1073741824 bytes
b1 40 f1 18 de fd fe 76
b9 45 76 e5 cc 2e c3 bf
6c c1 07 32 cd 6c a0 83
ac 29 db 50 3d cd d4 f0      32 (octets)

real    0m28.685s
user    0m28.629s
sys 0m0.064s

28 seconds is so slow that it is impractical for password authentication.

- ------

There is a bug on line 175:
    out[j] = (uint8_t)(S[out[j]] + ptr[j]);
Surely the author meant:
    out[j] ^= (uint8_t)(S[out[j]] + ptr[j]);
Without this fix, the output and can be computed efficiently with
constant memory.

Centrifuge is only usable at memory sizes that fit into cache, such as
the 4MiB benchmark above.  There is no strategy to make efficient use
of external DRAM.

Centrifuge shares Scrypts TMTO problem, with it's 4X reduction in
time*memory defense.  Centrifuge fills memory linearly, and then does
only *predictable* reads,
meaning that a TMTO can make massive use of parallelism for
recomputations.  Therefore, Centrifuge is *not* memory hard.

Centrifuge uses t_cost to slow down memory initialization.  Since all
memory is filled with AES-cfb, there is no absolutey no need for
mixing the S-box while filling
memory.  A simple counter feeding into parallel callse to AES will do,
and run considerably faster than AES-cfb.  Even if for some reason we
want to fill
memory with only cryptographically pseudo-random data, why use block
chaining?  The AES-NI instruction unit can do 6 computations in
parallel, but
block-chaining forces it to do every computation serially.  The code
could be simplified, speed up a lot, and made more secure at the same
time, because average memory use increases when filling memory is
fast.  This is simply a poor memory initialization algorithm.

Just like in TwoCats, the t_cost implementation is broken.  Memory
bandwidth starts low because the algorithm is so slow, but it goes to
zero with increasing p_time.  This could allow an attacker to use an
SSD drive efficiently with high enough t_cost, dramtically lowering
the cost of attacks.

Centrifuge is not timing resistant from the start.  A simple cache
timing attack can be used at the very beginning, causing both time and
memory to go away for an attacker.  In contrast, Scrypt's first loop
is cache-timing resistant, and only the second loop is unpredictable.

For some reason the output is hashed with AES once for every byte of
memory used.  Unless t_cost is very high, this will dominate runtime.
 However, by increasing t_cost, an attacker enjoys reduced memory
bandwidth, which is already way too low.

Because dominates computation time, GPUs will do well against
Centrifuge.  They may not have dedicated AES hardware, but 4096
guesses in parallel more than make up for that.   Use of only
sequential AES is also good for GPU attacks, since the defender cannot
make use of his 6-deep AES pipeline.

t_cost is broken just like in TwoCats.  As t_cost increases, memory
bandwidth decreases, eventually allowing for SSD storage at a much
cheaper cost to an attacker.  However, it does increase CPU work, so
t_cost should be set high enough that it just begins to slow down
hashing, and no more, to maximize compute-time hardness.

S update is highly parallizable, contrary to a statement in the paper.

The output has slight entropy loss.  The input to AES is a sequence of
outlen bytes, all different.  The output from AES is a new sequence
that is cryptographically scrambled.  However, each byte copied to the
output buffer is different. For a 256 bit output, about 3 bits worth
of possibilities are lost, making it a 253 bit strength.  This is no
big deal, but at a 2048 bit output, over half of the bit strength is
lost.  Again probably not a real issue.

- -------

Centrifuge is one of those algorithms that pack weaknesses densly into
very little code.  If we're looking for only the strongest half to
make it into the next round, dropping Centrifuge is an easy, call, IMO.
Version: GnuPG v1


Powered by blists - more mailing lists