[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20151011012815.GA8938@openwall.com>
Date: Sun, 11 Oct 2015 04:28:15 +0300
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Interest in specification of modular crypt format
On Sun, Sep 27, 2015 at 03:55:52PM +0300, Alexander Cherepanov wrote:
> On 2015-09-25 04:32, Solar Designer wrote:
> >Is this unjustified complexity or is it a good way to achieve several
> >goals at once (very compact encoding, no two ways to encode a number,
> >use of only the B64 alphabet with no need for an "end of number" char,
> >no way to hit 2^31)?
>
> The problem with this approach is that the encoding is closely tied to
> the range of allowed values. Argon2 spec says that the range for p is
> 1..255 while m and t can go upto 2^32-1. Should they use the same
> encoding? Should it be the same as for (ye)scrypt?
Yeah, having the encoding tied to the range of allowed values is both
good and bad. As an option, Argon2 specifically could choose to limit
its parameters to 2^30. (No, I am not suggesting this currently.)
> Relatively widely used encoding for similar cases is Variable-length
> quantity[1]. There is even "Base64 VLQ" in Source Map[2] but it encodes
> a sign in a least significant bit which could inconvenient if we want to
> encode unsigned integers only.
>
> [1] https://en.wikipedia.org/wiki/Variable-length_quantity
> [2]
> https://docs.google.com/document/d/1U1RGAehQwRypUTovF1KRlpiOFze0b-_2gc6fAH0KY0k
Thank you! Besides the sign bit issue, this existing "Base 64 VLQ"
appears to allow for trailing chars with all-zero data bytes, which
wouldn't alter the value (since it's a little-endian encoding).
So it does not force deterministic encoding.
"Base 64 VLQ
The VLQ is a Base64 value, where the most significant bit (the
6th bit) is used as the continuation bit, and the "digits" are encoded
into the string least significant first, and where the least significant
bit of the first digit is used as the sign bit. Note: The values that
can be represent by the VLQ Base64 encoded are limited to 32 bit
quantities until some use case for larger values is presented."
I just found this:
"BASE64 VLQ CODEC (COder/DECoder) AND SOURCEMAP V3 MAPPINGS PARSER"
http://murzwin.com/base64vlq.html
and experimented with it. There are many ways to encode 0:
A
B
gA
ggA
gggA
as well as non-zero numbers, e.g. all of these decode to 16:
gB
ghA
ghgA
and all of these to 17:
iB
ihA
ihgA
Also, there exist invalid encodings that a robust parser will need to
reject. "Invalid VLQ64 encoding, continuation bit set at last character
while decoding" as that JavaScript parser says e.g. for "g" and "gg".
So I am not happy about this specific encoding, although it's something
we could use if we were to use the combination of RFC Base64 charset and
compact encoding for numbers, which as it appears we're not going to.
Alexander
Powered by blists - more mailing lists