phc-discussions - What TrueCrypt (probably) wants from the PHC

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <53F7B211.30409@ciphershed.org>
Date: Fri, 22 Aug 2014 17:11:45 -0400
From: Bill Cox <waywardgeek@...hershed.org>
To: discussions@...sword-hashing.net
CC: CipherShed Developers List <devs@...ts.ciphershed.org>
Subject: What TrueCrypt (probably) wants from the PHC

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I am now a member now of the CipherShed Project Management Committee
(PMC), which is developing a fork of TrueCrypt. I can't actually speak
for the PMC officially, so this is just my opinion...

TrueCrypt has unique requirements that are quite different than those
for authenticating a user over the Internet. TrueCrypt has been
downloaded 30 million times. No other application secures as many
BitCoin wallets. It had the largest market share in the personal
encryption space, edging out Microsoft by a wide margin. Amazon's only
supported cloud encryption solution is TrueCrypt. While big geeks like
us probably don't often use it or need it, no other personal
encryption product has ever impacted the masses so strongly.

The basic threat model is pretty simple. It's probably a lot like the
case of encrypting a Word document, if that Word document's security
were worth somewhere from a hundred bucks up to the value of your
life. The most significant scenario is an attacker has access to a
full copy of an encrypted volume and simply wants to brute-force guess
the password. This leads me to my first request:

- - The PHC winner should do an *excellent* job protecting the password!

Guessing the password not only endangers a Windows user's BitCoin
wallet, but also his gmail account, and any place else he uses his
password. Having weakly hashed passwords is the single largest
security hole found by the TrueCrypt Audit so far:

https://opencryptoaudit.org/reports/iSec_Final_Open_Crypto_Audit_Project_TrueCrypt_Security_Assessment.pdf

It is so bad that I started talking about forking TrueCrypt about a
year ago. VeraCrypt beat me to it!

https://veracrypt.codeplex.com/

There are several issues the PHC is concerned about that I care far
less about:

- - Hashing doesn't have to run quickly. About 0.2 to 2 seconds is fine.
- - Hashing can use a significant portion of the machines memory
- - Hashing doesn't need to have a simple API. We can figure it out.
- - It does not have to be similar to PBKDF2
- - Our side channel attack concerns are lower, but not zero

Generally, our users are running Windows on their own machine, and
side-channel attacks are less of a concern than the usual more direct
malware attacks. However, we don't want malicious Chrome plugins to be
able to guess the user's password through a cache-timing attack. Also,
in theory we're used a lot on Amazon Cloud, and cache timing attacks
are an important threat there. However, we have to put protecting
those 30 million downloaders first.

Our basic needs for password hashing are very roughly:

- - A typical user with an 8 character password of mostly lower case
letters and maybe a digit and/or punctuation should expect to be able
to protect a $10,000 BitCoin wallet
- - A user who is afraid for his life over the contents of his encrypted
volume should have his somewhat randomish 20 character password with
some pumctuation and digits be strong enough to protect him from even
a government-scale brute force attack.
- - The hash function must have a mode where it can be used without any
apparently non-random parameters. Salt is OK, but not garlic, for example.
- - The PHC winner needs to provide stronger defense than Scrypt against
brute force guessing attacks, or we'll just use Scrypt.
- - Obviously, it has to have a C API.

I had originally assumed that the cost to an attacker would be roughly
proportional to an attacker's runtime per guess times the average
memory required per guess. However, some of the PHC related papers
have convinced me that power may dominate an attacker's cost.
Therefore, I want all three :-)

- - An attacker should have to spend comparable time per guess as a user
- - An attacker should have to use comparable memory per guess as a user
- - An attacker should have to spend comparable power per guess as a user

By comparable, I don't mean it has to be < 2X different, but 10X
starts to hurt. Every factor of 10 is *very* important. Since this
code has to be secure, and since this stuff is *very* hard to audit, I
also want:

- - The efficient version of the hashing algorithm should be simple to
read and audit

If meeting all these goals at the same time means that I need to tune
multiple parameters to my machine, that's OK in some of our use cases.
However, I don't want to have to make the user guess what the
parameters should be. Therefore, for highly tunable algorithms I want:

- - There needs to be a C API for auto-picking any complex parameters

Using machine specific parameters is dangerous for most applications,
but less so for CipherShed, because users are not supposed to make
copies of volumes in the first place. You create them on one machine,
and that's where they live... forever. The correct way to transfer a
TrueCrypt volume to a new machine is to create a new volume on the
destination machine, and then copy the files over. The case where
CipherShed remembers various auto-generated parameters is new, and
will be controversial. We may want to take the Audit's recommendation
and encode one or more new parameters in the clear in the volume
header. I am looking forward to debating this issue with the
CipherShed developers. However, even if we do offer clear-text
parameters in volume headers, many of our users will require volume
headers where we don't. The scenerio where our users get killed if
they are discovered to even have a TrueCrypt volume is a case we are
concerned about.

TrueCrypt volumes are currently essentially indistinguishable from
true random data (with some unfortunate caveats mentioned in the
audit). Because TrueCrypt has to open volumes with no idea what the
encryption parameters might be, TrueCrypt currently tries every
combination of supported encryption and password hash to see if any
work. That's OK, because it only takes a few milliseconds to do them
all. However, with multiple memory-hard options, guessing the correct
one could take too long. In this case, I think Catena's "garlic" could
work for us. We would have TrueCrypt first try all the old
combinations and when they fail, we would guess a low garlic level
appropriate for a mobile phone, and then increase the garlic until we
either would use more memory than the machine has or run for too long.
If the password is correct, this should only take about twice as long
to discover the correct garlic level.

Another problem is we face because of our no-visible-parameters rule
is that we can't support a lot of memory-hard hash functions. Therefore:

- - Please choose *one* high-memory usage successor to Scrypt

And here is my strong bias, having benchmarked all the PHC entires: If
Lyra2, Yescrypt, or TwoCats is chosen, please don't also choose one of
the other two. Why one of these three? Because of my rule that it must
be better for CipherShed's purposes than Scrypt. Therefore:

- - Please insure that there is exactly *one* winner that has benchmarks
an analysis proving it has better brute-force password guessing
defense than Scrypt

If someone shows any of the other PHC entries perform as well as
Scrypt against brute force guessing attacks, then please add them to
the list of potential Script successors. However, I didn't find any
others in my tests, though some were close and with "tweaks" could
easily get there.

I do not believe this is a bias, but you mightl:

- - Please do not try to replace Scrypt with a purely
cache-timing-resistant algorithm

Cache-timing resistance is desirable in this case, but since
resistance to brute-force guessing attacks is inherently much lower
for such algorithms, it will be very difficult to create a worthy
Scrypt successor that does not do unpredictable memory reads.

Another obvious requirement, once I reveal my desire to port
CipherShed to Android one day:

- - Please do not rely on Intel specific instructions to the point that
performance sucks on ARM

Also, after actually at least scanning the code of every entry, with
readability and portability varying all over the map:

- - Please do not use Windows or Linux specific API calls or compiler
features

There were a couple entries with several Windows C++ specific syntax
issues, and they were a PITA to compile. I haven't ported any to
Windows yet, but I see it's going to be a challenge for some. Another
no-no is code that makes it hard to build API wrappers for PHC entries
in Python or other languages:

- - Please make your API out of standard C functions (export "C" for C++
entries).Python doesn't run the C pre-processor

However, that's not really a CipherShed issue :-)

I keep hearing "You all should have done X or Y", but in reality,
there are some seriously freaking brilliant guys here with amazing
work. I have been more impressed with the overall quality of the
algorithms than the with the reviews of them so far. I may dump on
your algorithm, but please don't take that as an indication that I
don't fully respect your work.

That's about all I can think of for now :-)

Bill