Message-ID: <20050307205837.GE18839@excession.spiral-arm.org>
Date: Mon, 7 Mar 2005 20:58:37 +0000
From: James Youngman <james+yahoo@...ession.spiral-arm.org>
To: Michael Roitzsch <amalthea@...enet.de>
Cc: bugtraq@...urityfocus.com
Subject: Re: thoughts and a possible solution on homograph attacks
On Mon, Mar 07, 2005 at 06:25:31PM +0100, Michael Roitzsch wrote:
> Hi security community,
>
> this is my first publication I post on Bugtraq, so please be patient with me.
>
> Since the recent problems with IDN, I wanted to clear up my thoughts on
> homograph attacks, so I sorted everything in an article which also contains
> what I believe to be an easy and general solution.
Guts are :-
|| I propose to present the user with a dialog showing the text to be
|| validated and an input field, into which the user has to type in the
|| given text again. The user is told, if both texts match precisely and
|| what this means: If the typed text's internal representation matches
|| the given text bit-by-bit, trust can be established. If it does not
|| match, the user is told to re-check for typing errors and not to
|| establish trust.
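As I read it, the proposed check amounts to something like the sketch below. The function name and the sample hostnames are mine, made up purely for illustration; nothing here is taken from the article, and UTF-8 is only an assumed "internal representation".

```python
def texts_match(shown: str, typed: str) -> bool:
    # "Bit-by-bit" comparison: compare the encoded bytes of both strings.
    # UTF-8 is an assumption; any fixed encoding would do for this purpose.
    return shown.encode("utf-8") == typed.encode("utf-8")

# Hypothetical homograph: the second letter is CYRILLIC SMALL LETTER A (U+0430).
shown = "p\u0430ypal.com"
# What a user with a Latin keyboard would type after reading the text.
typed = "paypal.com"

print(texts_match(shown, typed))  # the look-alike does not match
print(texts_match(typed, typed))  # an honest all-Latin name does
```

The retyped text only matches when the displayed text really consists of the characters the user believes they are reading.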
Problems with this approach:-
1. The user will see this as an irritation and won't perceive it as
helping them keep their computer secure. Hence they will want to
turn the feature off. People select usability over security unless
they clearly understand the security problem and the usability
difference is manageable.
2. The earlier description says that the quoted actions should occur
"Whenever the user has to validate textual information to establish
trust," but I believe that this even includes the case where
the user starts a blank browser and then pastes a URL into the
address box.
3. "matches the given text bit-by-bit" can give spurious negative
results in many circumstances. This particularly applies to those
languages making use of Unicode "combining marks". Combining
characters are additional marks following a main character that are
essentially "decorations" to it. Usually the order of combining
characters is significant, and the user would be able to see the
difference between the orderings (and therefore will know what
order to type their characters to get the same effect they see).
However, there are also other types of combining characters that
have the same rendering no matter what order they're presented in.
This is very similar to the original problem, though, and it is not
clear to me that domain names containing such combining characters
should be allowed (otherwise there will be two alternative Unicode
code point sequences that appear the same but are actually
different domain names, as I understand things).
I suspect that where a domain name contains a sequence of combining
characters of differing combining classes, the right thing to do is
to allow as a registered domain name only the renderings in which
the marks are encoded in the "canonical order". See section 3.11
of the Unicode Standard, version 4.0. Apologies if I have
misunderstood this area of Unicode; it's a bit complicated and I
don't have a history of immersion in it. The summary of my
third point, then, is that worrying about differences in the
ordering of combining marks is probably the responsibility of
whoever oversees the registration of the IDN, and probably isn't
something we can be expected to solve in every piece of client
software.
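To make the third point concrete, here is a minimal sketch in Python using the standard unicodedata module. The sample strings and the helper name in_canonical_order are my own, chosen only for illustration. U+0323 (dot below) has combining class 220 and U+0307 (dot above) has class 230, so the two orderings should render alike but differ bit-by-bit.

```python
import unicodedata

# 'a' with a dot below (class 220) and a dot above (class 230), in both orders.
s1 = "a\u0323\u0307"
s2 = "a\u0307\u0323"

# A bit-by-bit comparison reports a mismatch between the two orderings...
print(s1 == s2)
# ...but after canonical decomposition and reordering (NFD) they are identical.
print(unicodedata.normalize("NFD", s1) == unicodedata.normalize("NFD", s2))

def in_canonical_order(text: str) -> bool:
    """Hypothetical registry-side check: within each run of combining marks,
    the combining classes must not decrease."""
    previous = 0
    for ch in text:
        ccc = unicodedata.combining(ch)  # 0 for base characters
        if ccc != 0 and previous > ccc:
            return False
        previous = ccc
    return True

print(in_canonical_order(s1))  # lower class first: acceptable
print(in_canonical_order(s2))  # higher class first: would be refused

# Marks of the same class (acute U+0301 and diaeresis U+0308, both 230) are
# not reordered, because there the order is visually significant.
t1 = "a\u0301\u0308"
t2 = "a\u0308\u0301"
print(unicodedata.normalize("NFD", t1) == unicodedata.normalize("NFD", t2))
```

A registry that accepted only names passing a check like in_canonical_order would leave a single registrable code point sequence for each rendering of this kind, which is the restriction suggested above.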
The length of this email is out of proportion to its usefulness. Sorry.
James.