phc-discussions - Re: [PHC] Interest in specification of modular crypt format

Message-ID: <CAAyV7nF_u72Aq1GhfRHvEnK-82DeSWe2T63P3svhR3chTASgew@mail.gmail.com>
Date: Sun, 27 Sep 2015 16:46:48 -0400
From: Anthony Ferrara <ircmaxell@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Interest in specification of modular crypt format

Alexander,

On Sep 27, 2015 2:52 PM, "Solar Designer" <solar@...nwall.com> wrote:
>
> Hi Anthony,
>
> Thank you for sharing your opinion.
>
> On Sun, Sep 27, 2015 at 09:40:50AM -0400, Anthony Ferrara wrote:
> > Please, can the encoding be kept human-readable?
>
> Maybe.  At least I've counted your vote. ;-)
>
> > Going with one-off encodings does look elegant,
>
> What do you mean by one-off here?  That no other hashing scheme would
> use the same encoding?  Ideally, we'd come up with an encoding, whether
> human-readable or not, that multiple schemes would adopt.

Unless I misread, it seemed like there was a suggestion to change the
encoding per parameter: parameters that have no valid 0 would be encoded
so that the first representable value is 1.

If that was not the case, then never mind.

> > but it makes the implementation significantly more error prone.
>
> Does it?  Maybe that depends on the kind of errors you care about vs.
> disregard.  In my proposed numeric encoding, I am focusing on having
> implementation errors detectable in normal encoding/decoding of valid
> inputs, or being impossible (e.g., can't overflow an int, can't encode
> the same parameter more than once).  With a human-readable encoding like
> Thomas', there are less critical potential errors that, on the other
> hand, would tend to go undetected for long.  Even if the spec has lots
> of MUST's and MUST NOT's, implementations will end up parsing strings
> like p=1,t=2,p=3 (is it p=1 or p=3, then? parsing will vary between
> implementations, and only a handful will actually reject such strings).

I would say that this is simply an error and the spec should define it as
such. A few test vectors for the error case would be good there.
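
To make the error case concrete, here is a minimal sketch in Python of a
strict parser for a parameter string such as p=1,t=2. The function name
parse_params, the comma separator, the allowed names, and the decimal-only
values are assumptions for illustration, not part of any proposed spec. It
rejects a repeated parameter instead of silently picking the first or last
value, and it ships with a few hypothetical test vectors for the error case.

```python
def parse_params(field, allowed=("p", "t")):
    """Parse 'p=1,t=2' into a dict; raise ValueError on malformed input."""
    params = {}
    for item in field.split(","):
        name, sep, value = item.partition("=")
        if sep != "=" or name not in allowed:
            raise ValueError(f"bad parameter: {item!r}")
        if name in params:
            # p=1,t=2,p=3 is an error, not "first wins" or "last wins".
            raise ValueError(f"duplicate parameter: {name!r}")
        # Accept only plain decimal: ASCII digits, no sign, no leading zeros.
        plain = value.isascii() and value.isdigit()
        if not plain or (len(value) > 1 and value[0] == "0"):
            raise ValueError(f"bad value for {name!r}: {value!r}")
        params[name] = int(value)
    return params


# Hypothetical test vectors: one valid string, several that must be rejected.
VALID = {"p=1,t=2": {"p": 1, "t": 2}}
INVALID = ["p=1,t=2,p=3", "p=01", "p=", "q=1", "p=1,,t=2"]
```

Publishing the INVALID list alongside the valid vectors is what lets an
implementer notice that their parser accepts something it should not.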

> > People are even confused by the fact that the salt-boundary in bcrypt
> > occurs mid-character (leading to only 4 possible character values for
> > the last salt character).
>
> Yes, and I have no good solution for that one (short of the workaround
> of using multiples of 6 for bit sizes of salts and hashes, or for
> salts+hashes combined as Alexander Cherepanov mentioned to me off-list).

For the record, I am not saying that problem should be "solved". Just keep
in mind that making the packing more clever and more efficient is fine, but
it may make generating the salt less intuitive.
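
The arithmetic behind the bcrypt observation can be sketched as follows.
A 128-bit salt packed 6 bits per character needs 22 characters, but
21 * 6 = 126, so the last character carries only 2 bits of salt and its
low 4 bits are always zero. The helper name last_char_choices is
hypothetical; the alphabet is bcrypt's own base64 alphabet, and the sketch
assumes most-significant-bits-first packing.

```python
import math

# bcrypt's base64 alphabet (it differs from the RFC 4648 one).
BCRYPT_ALPHABET = (
    "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
)


def last_char_choices(salt_bits, alphabet=BCRYPT_ALPHABET):
    """Characters that can appear last when salt_bits are packed 6 per char."""
    n_chars = math.ceil(salt_bits / 6)
    used_bits = salt_bits - 6 * (n_chars - 1)  # salt bits in the last char
    pad_bits = 6 - used_bits                   # low bits forced to zero
    return [alphabet[v << pad_bits] for v in range(2 ** used_bits)]


print(last_char_choices(128))       # ['.', 'O', 'e', 'u']
print(len(last_char_choices(132)))  # 64: a multiple of 6 wastes nothing
```

The second call shows the workaround mentioned above: with a bit size that
is a multiple of 6, every character of the alphabet is valid in the last
position.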

> So it's a problem common for compact and verbose encodings, and not a
> reason to choose one over the other.
>
> > Instead, can we focus on readability?
>
> I'd be happy to, but it comes with drawbacks.  Such as:
>
> > For parameters that have well
> > defined defaults, simply don't require that parameter to be specified.
>
> This means the encoding is not deterministic.  A parameter can be
> omitted, or it can be explicitly included with its default value.
>
> That's bad.  There's value in having canonical encodings, and in there
> being only one canonical encoding for a given set of parameters, salt,
> and hash.
>
> If we change "don't require that parameter to be specified" to "require
> that parameter not to be specified", that's better, but then many
> parsers will not actually enforce this rule.  Maybe that's a minor
> enough issue to accept it.  (The issue with multiple specification of a
> parameter is worse.)

Fair enough, though I don't know which I would rather favor. My
instinct is to make the implementation more difficult in order to make the
lives of the users easier. That is not a strict tradeoff, though; it is
more of a rule of thumb to keep in mind.
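
One way to get a single canonical encoding while still omitting defaults
is sketched below in Python. It illustrates the stricter rule discussed
above ("require that parameter not to be specified") and enforces it by
re-encoding what was parsed and comparing it with the input. The PARAMS
table, the function names, and the n/r/p layout borrowed from the scrypt
example are all assumptions for illustration.

```python
# Hypothetical parameter table: name -> default value.
# None means the parameter has no default and is always required.
PARAMS = {"n": None, "r": None, "p": 1}


def encode_params(values):
    """Emit the one canonical form: fixed order, default values omitted."""
    parts = []
    for name, default in PARAMS.items():
        value = values.get(name, default)
        if value is None:
            raise ValueError(f"missing required parameter: {name}")
        if value != default:
            parts.append(f"{name}={value}")
    return ";".join(parts)


def decode_params(field):
    """Parse leniently, then reject anything that is not canonical."""
    values = {}
    for item in field.split(";"):
        name, _, value = item.partition("=")
        if name not in PARAMS or name in values or not value.isdigit():
            raise ValueError(f"bad parameter: {item!r}")
        values[name] = int(value)
    # Re-encoding catches explicit defaults, reordering, and leading zeros.
    if encode_params(values) != field:
        raise ValueError("non-canonical encoding")
    return {name: values.get(name, d) for name, d in PARAMS.items()}
```

With this sketch, n=14;r=8 decodes to n=14, r=8, p=1, while n=14;r=8;p=1
and r=8;n=14 are both rejected as non-canonical. The cost is that the
person writing the string by hand has to know the order and the defaults.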

> > For those that don't have well defined defaults, then always require
> > them. Yes, it is a little bit more parsing overhead on the crypt()
> > implementation, but it greatly reduces the overhead on the calling
> > code. Especially for those building wrappers (which happens all the
> > time).
> >
> > I do like the approach of specifically identifying parameters that was
> > brought up in the other thread. Something along the lines for scrypt:
> >
> >     $7$n=14;r=8;p=1$ALotOfSalt$Hash
> >
> > But since p=1 could reasonably be default, the following is identical:
> >
> >     $7$n=14;r=8$ALotOfSalt$Hash
> >
> > Whether a parameter is power-of-2 or base 10 would depend on the
> > algorithm. Also, please don't define the parameter order as rigid.
> > It's simple enough to implement arbitrary order, so why put arbitrary
> > restrictions?
>
> Deterministic canonical encoding.  Side-stepping the issue of multiple
> specification of a parameter (as long as the parser actually rejects
> encodings with parameters appearing in unexpected order).
>
> > So my overall suggestion is make it easier on the developer using the
> > API, rather than the one writing it. After all, there are going to be
> > FAR more people writing a salt string than there will be writing the
> > backend implementation. Optimize for the greater use-case...
>
> This makes sense to me.  We will need to provide an API for generating
> salt strings (or "setting strings", as they're called for bsdicrypt and
> on) anyway, but you have a valid point that, especially if the encoding
> is simple for an application developer to understand, many developers
> will be generating such strings on their own.
>
> For PHP and such, I think this pretty much implies the encoding will
> also need to use RFC Base64.  Then it's just sprintf of some decimal
> numbers and some base64()'s.  So my point remains that Thomas' proposed
> encoding is neither here nor there.

Completely fair. I am less concerned about the encoding of the salt than
about the encoding and specification of the parameters. If a high-level API
is provided for generating the salt, then awesome. That helps a lot. Though
I would still suggest keeping it readable.
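
For a sense of how little code a readable format asks of the caller, here
is a minimal sketch in Python of "sprintf of some decimal numbers and some
base64()'s". The function name make_setting, the 16-byte salt length, the
stripping of '=' padding, and the $7$n=...;r=... layout (taken from the
example earlier in this thread) are assumptions, not a defined format.

```python
import base64
import os


def make_setting(n_log2, r, salt_len=16):
    """Build a setting string like '$7$n=14;r=8$<salt>' (hypothetical layout)."""
    salt = os.urandom(salt_len)
    # RFC 4648 base64; dropping the '=' padding is an assumption made here.
    salt_b64 = base64.b64encode(salt).decode("ascii").rstrip("=")
    return f"$7$n={n_log2};r={r}${salt_b64}"


print(make_setting(14, 8))
```

The RFC 4648 alphabet never produces '$', so the salt cannot be confused
with a field separator in this layout.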

There are a ton of tradeoffs involved, and I appreciate what this team is
doing. Whatever the end result is, it will be adopted.

So thanks!

Anthony

> Thanks again,
>
> Alexander
