phc-discussions - Re: [PHC] Specification of a modular crypt format (2)

Message-ID: <CAGiyFdfn8z=wSuT1VqgHJ-kR9S9cWnE+54xoCqhLXsgE2oAPRA@mail.gmail.com>
Date: Mon, 28 Sep 2015 13:23:59 +0000
From: Jean-Philippe Aumasson <jeanphilippe.aumasson@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Specification of a modular crypt format (2)

Thanks!

Copied the draft to
https://docs.google.com/document/d/1QJva4xsY3eHNm2YiT0yPS9spRqd-N8NamYRpF5lkIzM/edit?usp=sharing

Fixing typos:

"  -- The first character is either a '-' sign, or an ASCII digits"
to
"  -- The first character is either a '-' sign, or an ASCII digit"

"     ** p       Degree of parallelism, between 1 and 255.
                Value is an integer in decimal, over 1 or 2 digits."
to
"     ** p       Degree of parallelism, between 1 and 255.
                Value is an integer in decimal, over 1 to 3 digits."

Other comments:

"with a strcmp() call": should we expect all strings to be null-terminated?

add "The identifier for Argon2ds is 'argon2ds'"?



On Mon, Sep 28, 2015 at 1:13 PM Thomas Pornin <pornin@...et.org> wrote:

> Hello all,
>
> here is my revised specification for the modular crypt format, along
> with example code. Here are the changes since the last version:
>
> ** While producers must still follow the unique, deterministic encoding,
>    consumers are officially allowed not to verify that the string is
>    correct. This makes decoding simpler (e.g. no need to check for extra
>    leading zeros in decimal integers) but increases the risk of
>    dissemination of local variants. This also has some consequences
>    which are worth mentioning in the Unix crypt() API; I have added a
>    section about it.
>
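
The producer/consumer asymmetry described above can be sketched in Python
(not part of the draft or its C example code). The function name
parse_decimal and the "strict" flag are hypothetical, chosen only for this
illustration; the only rule taken from the text is that a lenient consumer
need not reject extra leading zeros.

```python
def parse_decimal(text: str, strict: bool) -> int:
    """Parse an unsigned decimal field from a hash string.

    strict=True  models a verifying consumer: only the canonical
                 encoding (no extra leading zeros) is accepted.
    strict=False models a lenient consumer, as the draft now allows.
    """
    # Accept ASCII digits only (str.isdigit() would also accept
    # non-ASCII Unicode digits, which is not wanted here).
    if not text or not all("0" <= c <= "9" for c in text):
        raise ValueError("not a decimal integer")
    if strict and len(text) > 1 and text[0] == "0":
        raise ValueError("non-canonical encoding: leading zero")
    return int(text)
```

A lenient consumer treats "050" and "50" as the same value, which is how
local variants of one hash string can spread.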
> ** I switched back to "normal" Base64, albeit without the '=' padding
>    signs. For C implementations, the choice between standard Base64, and
>    the crypt() variant, should be neutral since the standard library does
>    not offer any implementation of either; in any case, it is easily
>    reimplemented, as the example code demonstrates. For other languages
>    (e.g. C# or PHP), using standard Base64 may allow reusing the
>    facilities of the language framework.
>
>    I still mandate suppression of the '=' padding signs, because
>    otherwise they may make some encodings ambiguous in the presence of
>    optional parameters. For most languages, removing the '=' signs upon
>    encoding is a simple one-liner (e.g. '.Replace("=", "")' in C#).
>
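
As an illustration of reusing a language's standard Base64 facilities, here
is a minimal Python sketch of unpadded encoding and decoding. The helper
names are hypothetical; rejecting any '=' on decoding is an assumption that
models a strict consumer, not something the text above requires.

```python
import base64


def b64_encode_nopad(data: bytes) -> str:
    """Standard Base64, with the trailing '=' padding removed."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def b64_decode_nopad(text: str) -> bytes:
    """Decode unpadded standard Base64.

    Assumption for this sketch: any '=' in the input is rejected.
    """
    if "=" in text:
        raise ValueError("padding characters are not allowed")
    # Restore the padding that the standard decoder expects.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded, validate=True)
```

Stripping the padding loses no information, because the number of padding
characters is determined by the length of the unpadded string.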
> ** Functions are now supposed to specify the expected order of parameters
>    instead of following a single, universal rule. (It turned out that
>    nobody likes lexicographic ordering.)
>
> ** Parameters can be optional. The function specification must tell which
>    parameters are optional and which are not; for each optional
>    parameter, it MUST specify the default value, and an optional
>    parameter MUST be omitted if its value is equal to the default.
>
>    These rules preserve determinism while allowing optional features
>    without overly increasing the hash string length. In Argon2i, I
>    define the m, t and p parameters to be mandatory, and the keyid and
>    data parameters to be optional.
>
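
A short Python sketch of how a producer could apply these rules to the
Argon2i parameters. The order m, t, p, keyid, data and the empty-string
defaults for keyid and data are assumptions made for this illustration; the
function specification is what actually fixes them.

```python
# Assumed parameter order for Argon2i in this sketch.
PARAM_ORDER = ("m", "t", "p", "keyid", "data")

# Assumed defaults: an empty value for both optional parameters.
OPTIONAL_DEFAULTS = {"keyid": "", "data": ""}


def encode_params(values: dict) -> str:
    """Build the 'name=value,...' part of a hash string.

    Mandatory parameters are always emitted; an optional parameter is
    omitted whenever its value equals the default, so each parameter
    set has exactly one encoding.
    """
    parts = []
    for name in PARAM_ORDER:
        if name in OPTIONAL_DEFAULTS:
            value = values.get(name, OPTIONAL_DEFAULTS[name])
            if value == OPTIONAL_DEFAULTS[name]:
                continue  # equal to the default: must be omitted
        else:
            value = values[name]  # KeyError if a mandatory one is missing
        parts.append(f"{name}={value}")
    return ",".join(parts)
```

With these assumptions, passing keyid="" explicitly and not passing keyid at
all produce the same string, which is the determinism property being
preserved.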
> ** Output length can vary. The function must define a default length, to
>    be used unless there is a very good reason not to. A sane
>    minimum is still enforced. The output length is NOT made a parameter
>    since this would be redundant with the output itself, and also
>    because it really should not be parameterized in normal conditions.
>
> ** The parameters for Argon2i have been expanded with the keyid
>    (identifier for the key, optional) and the data (associated data,
>    optional).
>
> ** The example code has been aligned to the new specification. I also
>    added an explicit license (actually, public domain dedication under
>    Creative Commons "CC0" -- this seems to be the closest to public
>    domain I can achieve without dying).
>
> -----------------------------------------------------------------------
>
> I have read all the comments made on this list, and I thank everybody
> for such comments; I tried to include most of them, but, unavoidably,
> choices had to be made, because it seems to be a mathematical
> impossibility to comply with all these conflicting requirements.
>
> Notably, Alexander argued for a much more compact format, at the expense
> of readability (depending on who is doing the reading, of course). In
> the specification draft, I chose to concentrate on readability rather
> than compactness. This does not mean that a compact string is not a
> worthwhile goal; only that I cannot achieve readability and compactness
> with a single specification. Maybe another, extra specification should
> be written for the "compactness" case.
>
>
> An important point to be made is that the most extreme compactness can
> be achieved through function-specific rules. For instance, consider
> encoding of a "time cost". This is an integer, and Alexander has done
> some nice research on how to generically encode an integer into a
> sequence of characters. However, a time cost does not need to be able to
> be any integer from 1 to 2^32-1. Consider bcrypt: its number of
> iterations is intrinsically restricted to be equal to 2^x for some
> integer x. This is very artificial (there is no structural requirement
> for the iterations to be a power of two; this is a simple loop counter)
> but it allows for encoding the time cost with a single, short integer.
>
> Forcing the time cost to be a power of 2 may be a bit inflexible (though
> I do not hear people clamouring about it when they use bcrypt). A
> slightly more configurable option is to define the time cost to be
> equal to a*2^b where a = 2 or 3, and b is an integer. If we want to
> encode such a cost over _one_ character, convert the character to a
> 6-bit value x, and say that: t = (2 + (x & 1)) << (x >> 1)
> This allows 64 possible time costs, scaled from 2 to 3*2^31.
>
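
The one-character time cost above can be checked with a few lines of Python.
The formula is the one quoted; mapping the character to a 6-bit value
through the standard Base64 alphabet is an assumption of this sketch, since
the text does not say which alphabet would be used.

```python
import string

# Assumed character-to-value mapping: the standard Base64 alphabet.
ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/")


def decode_time_cost(ch: str) -> int:
    """t = (2 + (x & 1)) << (x >> 1), where x is the 6-bit value of ch."""
    x = ALPHABET.index(ch)  # raises ValueError for an unknown character
    return (2 + (x & 1)) << (x >> 1)


def encode_time_cost(t: int) -> str:
    """Inverse mapping; fails if t is not of the form 2*2^b or 3*2^b."""
    for x, ch in enumerate(ALPHABET):
        if (2 + (x & 1)) << (x >> 1) == t:
            return ch
    raise ValueError("time cost is not representable in one character")
```

The 64 values are all distinct and run 2, 3, 4, 6, 8, 12, ... up to 3*2^31.
The largest one does not fit in 32 bits, so a C implementation would need a
64-bit type for t.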
> The time cost here is just an example. My point is that very compact
> encodings can be obtained by looking at the semantics of the parameters,
> which are, by definition, specific to each function. Generic rules on
> integer encoding, however smart they are, won't yield the maximum
> possible compactness.
>
>
> With the rules in the specification draft here enclosed, a "typical"
> Argon2i hash string would look something like this:
>
>
>  $argon2i$m=1024,t=50000,p=4$t003K73k/Bomtg8iHN/K4w$KH64roXLeU8kXGNeZXchGBcpmrJXT6NB1fw82bMs5Pk
>
> i.e. a total of 94 characters in that example. Most notably, systems
> that support the "SHA-512 crypt" variant (from glibc) must accept
> strings of length up to 106 characters ("$6$" header, salt up to 16
> characters, an extra "$", then 86 characters of Base64);
> application-allocated buffers should thus already be up to the task. If
> we want Argon2i strings to be still smaller, then I would be tempted to
> reduce the hash output length (128 bits would already be a lot of
> overkill for password authentication; 256 bits seem just wasteful).
>
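
The length arithmetic above can be reproduced with a small Python sketch.
The splitting logic is a simplification that assumes exactly four
'$'-separated fields, as in the example string; it is not the parser from
the example code.

```python
EXAMPLE = ("$argon2i$m=1024,t=50000,p=4"
           "$t003K73k/Bomtg8iHN/K4w"
           "$KH64roXLeU8kXGNeZXchGBcpmrJXT6NB1fw82bMs5Pk")

# "$6$" header, salt up to 16 characters, "$", 86 characters of Base64.
SHA512_CRYPT_MAX = len("$6$") + 16 + len("$") + 86


def b64_chars_for(n_bytes: int) -> int:
    """Number of unpadded Base64 characters needed for n_bytes."""
    return (n_bytes * 8 + 5) // 6


def describe(hash_string: str) -> dict:
    """Split a hash string and report its field sizes."""
    empty, identifier, params, salt, output = hash_string.split("$")
    if empty != "":
        raise ValueError("hash string must start with '$'")
    return {
        "identifier": identifier,
        "params": params,
        "total_chars": len(hash_string),
        # Byte lengths follow from the unpadded Base64 lengths, which
        # is why the output length needs no parameter of its own.
        "salt_bytes": len(salt) * 6 // 8,
        "output_bytes": len(output) * 6 // 8,
    }
```

For the example string this reports 94 characters, a 16-byte salt and a
32-byte output. Cutting the output to 128 bits would replace the 43-character
output field with 22 characters, for a total of 73.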
>
>         --Thomas Pornin
>
