Message-Id: <1FE622E4-640B-4FC2-8C0D-5A9EF928B180@goldmark.org>
Date: Fri, 6 Mar 2015 19:29:38 -0600
From: Jeffrey Goldberg <jeffrey@...dmark.org>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] PHC output specifics
On 2015-03-06, at 4:46 PM, Marsh Ray <maray@...rosoft.com> wrote:
> I agree you both are correct in that without Unicode normalization the encoding is underspecified. But I guarantee you that no developer is going to read the PHC spec and say "Oh, I should go implement Unicode normalization now".
True. It will be useful only for those who have already recognized the need to settle upon an encoding.
> What will happen instead is our encoding recommendation will be ignored altogether. Many developers in US-AU-NZ will say "just use ASCII" without realizing that doesn't actually mean anything in practice. (I used to have on my desk a book 1 inch (2.54 cm) thick with variations on ASCII-based code pages and encoding schemes.)
Not to quibble, but one quibble is that I suspect almost all of that variation was outside of the 7-bit printable characters. But your point is taken.
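To make that quibble concrete, here is a minimal sketch, assuming a handful of code pages that Python happens to ship codecs for (the list in CODECS is my own choice for illustration, not anything from the PHC discussion). It checks that the printable 7-bit range encodes to the same bytes under each of them, while a single accented character does not.

```python
# Printable 7-bit ASCII: U+0020 (space) through U+007E (tilde).
printable_ascii = "".join(chr(c) for c in range(0x20, 0x7F))

# Illustrative selection of ASCII-based encodings (an assumption, not a survey).
CODECS = ["ascii", "latin-1", "cp437", "cp1252", "utf-8"]

# Collect the distinct byte strings produced for the printable ASCII range.
ascii_encodings = {printable_ascii.encode(codec) for codec in CODECS}

# One character outside the 7-bit range: U+00E9, e with acute accent.
accented = "\u00e9"
accented_encodings = {
    codec: accented.encode(codec) for codec in CODECS if codec != "ascii"
}

print(len(ascii_encodings))   # number of distinct encodings of printable ASCII
for codec, raw in accented_encodings.items():
    print(codec, raw.hex())
```

This only covers these particular code pages. National 7-bit variants that reassign some printable positions are not represented here, so it does not settle the point either way.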
> Worst of all, yet another generation of web developers and users around the world will grow up with avoidable limitations on their password character sets because of bad interoperability.
Yes. And I agree that we aren’t going to be able to fix that.
> My guess is there are only a few development teams in the world that are invested in Unicode deeply enough that they are willing to put normalization in their product, and those teams probably don't need advice from us on how to do it. Yes, there are open source libraries for this, but this means even more code handling these secrets in memory. I doubt any Unicode libraries implement normalization in a side-channel-resistant manner.
Hmm. That is a point I hadn’t considered. (Because of the specifics of the applications that I work on, if you are close enough to employ side channel attacks, you are already far too close. So those are “out of scope”, and I’m sufficiently self-centered to not consider the fact that others have different needs than I do.)
>
> So how about this wording:
>
> "For best interoperability of credentials, character data
> SHOULD be a UTF-8 encoded sequence of [cite: ISO 10646] characters.
> [cite: Unicode] aware applications that wish to perform normalization
> SHOULD normalize to [normalization form TBD] before UTF-8 encoding."
I’m happy with that wording.
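For anyone wondering what that wording amounts to in practice, here is a minimal sketch using Python's standard unicodedata module. The normalization form is still TBD in the proposed text, so the NFC default below is purely my assumption for illustration, and encode_password is a hypothetical helper name, not part of any PHC API.

```python
import unicodedata


def encode_password(password: str, form: str = "NFC") -> bytes:
    """Normalize, then UTF-8 encode, in the order the proposed wording gives.

    The default form "NFC" is an assumption; the wording leaves it TBD.
    """
    normalized = unicodedata.normalize(form, password)
    return normalized.encode("utf-8")


# The same visible password typed two ways:
composed = "caf\u00e9"      # precomposed e with acute (U+00E9)
decomposed = "cafe\u0301"   # plain e followed by combining acute (U+0301)

# Without normalization the byte strings differ, so the hashes would differ.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))

# With normalization both inputs produce the same bytes.
print(encode_password(composed) == encode_password(decomposed))
```

Note that this does nothing about Marsh's side channel concern: the normalization here is whatever the standard library does, with no claim of constant-time behaviour.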
Cheers,
-j