full-disclosure - Re: [WEB SECURITY] Unicode Left/Right Pointing Double Angel Quotation Mark bypass?


Date: Fri, 5 Jun 2009 16:30:30 -0700
From: "Arian J. Evans" <arian.evans@...chronic.com>
To: Stephen de Vries <stephen@...steddelight.org>, 
	Full-Disclosure <full-disclosure@...ts.grok.org.uk>, 
	"websecurity@...appsec.org" <websecurity@...appsec.org>
Subject: Re: [WEB SECURITY] Unicode Left/Right Pointing
	Double Angel Quotation Mark bypass?

response inline

On Thu, Jun 4, 2009 at 11:23 PM, Stephen de Vries
<stephen@...steddelight.org> wrote:
>
> Hi Arian,
>> http://jeremiahgrossman.blogspot.com/2009/06/results-unicode-leftright-pointing.html
>
> Was there a common library or framework in all the vulnerable sites that was
> responsible for this?

Excellent question. Chris's response covers part of this, but I will
add below where I disagree with his 99% excellent response.

Long and short: Yes, I think so, though blind black-box testing only
tells you that the culprit is guilty, not *who* the culprit is.
:)

There are three classes of these issues, and they all occur for
different reasons. Chris already excellently addressed this,
ironically using three categories as well, though I label them a bit
differently:

1. Valid but Alternate Encodings that are normalized
2. Literal Transcodings that occur to avoid one issue (security,
false-familiar, name-collision, etc.) while creating a new
vulnerability
3. Interpreter Bugs that were truly unintended

---

1. Valid Alternate Unicode Encodings that are Normalized:

Some of these issues, like Fullwidth encoding, use valid and
legitimate Unicode representations that the software normalizes to a
canonical form. However uncommon and unexpected the encoding may be --
when you find these they tend to be broadly spread in an application,
and my speculation is that they are the result of a framework or
near-universally used library.

These are the most common issues (though not in the dataset I
reported yesterday). They tend to be "vendor" or "open source
framework" issues.
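
A minimal sketch of the #1 class in Python, assuming a hypothetical
blacklist filter (naive_is_safe) that runs before the framework
normalizes input with NFKC. The function names are illustrative and
not taken from any of the tested sites:

```python
import unicodedata


def naive_is_safe(value: str) -> bool:
    """Hypothetical blacklist filter: rejects ASCII angle brackets only."""
    return "<" not in value and ">" not in value


def normalize(value: str) -> str:
    """NFKC compatibility normalization, as a framework might apply it."""
    return unicodedata.normalize("NFKC", value)


# FULLWIDTH LESS-THAN SIGN (U+FF1C) and FULLWIDTH GREATER-THAN SIGN (U+FF1E)
payload = "\uff1cscript\uff1e"

passed_filter = naive_is_safe(payload)  # the filter sees no ASCII brackets
after = normalize(payload)              # normalization produces them anyway

print(passed_filter, after)
```

Checking after normalization instead of before closes this particular
gap, since naive_is_safe(after) rejects the canonical form.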

2. Transliteral transcodings (a=A because they look the same) including:

+ in Unicode terms "whole & mixed-script confusables"
+ in language terms "false-familiars"
+ as Chris described "best-fit mappings".

All names == the same. Which, incidentally, is the problem. :)

These are usually found in one specific location, usually specific to
a function or set of functions in an application, from what I see.

My further speculation is that they are the result of a specific
library (or emergent behavior due to a specific combination of
libraries) to facilitate a specific function in that location. Chris
addressed most of the "why" in his response, and I agree with most of
what Chris said.

These are, to a degree, Unicode problems insomuch as I think the
Unicode Consortium's own recommendations cause some of them. To solve
the "confusables" and "false-familiars" problems that are leveraged
for multiple types of fraudulent and criminal activity (phishing,
luring, fraud and stalking on social networks) Unicode recommended a
set of practices to minimize or avoid name-collisions for Security's
Sake:

Unicode Security Considerations
http://unicode.org/reports/tr36/

Unicode Security Mechanisms
http://unicode.org/reports/tr39/
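
To make the "confusables" problem concrete, here is a rough Python
sketch of a mixed-script check. It is only a heuristic under a loud
assumption: it guesses a character's script from the first word of its
Unicode character name, which is not the script property or the
skeleton algorithm that TR39 actually defines. The helper names
(scripts_used, looks_mixed_script) are made up for illustration:

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Rough script guess: first word of each letter's Unicode name.

    Assumption: this approximates the script well enough for a demo.
    It is NOT the Unicode Script property used by TR39.
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found


def looks_mixed_script(label: str) -> bool:
    """Flag labels whose letters appear to come from more than one script."""
    return len(scripts_used(label)) > 1


genuine = "paypal"
spoof = "p\u0430ypal"  # U+0430 CYRILLIC SMALL LETTER A in place of Latin "a"

print(looks_mixed_script(genuine), looks_mixed_script(spoof))
```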

These security recommendations lead to another set of security issues:
the increase in available characters that can be used to launch any
given type of syntax attack.
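
A sketch of how that plays out, in Python. The folding table below is
hypothetical and exists only for illustration; it is not the mapping
of any specific library, framework, or code page. The point is just
the ordering: a syntax filter runs first, then a look-alike fold
manufactures the syntax characters the filter was looking for:

```python
# Hypothetical look-alike folding table (an assumption for this sketch;
# real mappings vary by library and code page).
FOLD = {
    "\u00ab": "<",  # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    "\u00bb": ">",  # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
}
FOLD_TABLE = str.maketrans(FOLD)


def has_markup_chars(value: str) -> bool:
    """Hypothetical syntax filter that only knows the ASCII brackets."""
    return "<" in value or ">" in value


def fold_lookalikes(value: str) -> str:
    """Collapse look-alike characters to their ASCII 'equivalents'."""
    return value.translate(FOLD_TABLE)


payload = "\u00abscript\u00bb"

flagged_before = has_markup_chars(payload)  # nothing to flag yet
folded = fold_lookalikes(payload)           # fold creates the brackets
flagged_after = has_markup_chars(folded)

print(flagged_before, folded, flagged_after)
```

Every extra character that folds to "<" is one more way to write "<"
that the filter has to know about, unless it runs after the fold.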

These are the next most-common issues we see of the 3 types.

These happen for a variety of reasons: sometimes custom coding, and
increasingly in Europe we see libraries and functions baked into
frameworks and packages to do these types of things, sometimes
transparently to the end-developer using the framework.

3. Bizarre Behavior By Interpreters:

And some are simply the result of interpreter bugs or unfathomable
behavior. None of the examples I have given so far are ones that I put
in the #3 category. Yosuke Hasegawa, linked in Jeremiah's blog post on
this, gives some really good examples of what I mean here.

Namely -- sometimes interpreters (browsers like IE) do really weird or
unsafe things for no clear good reason. I spend very little time
researching these as my focus is on web software and not compiled code
interpreters, so I/we/WhiteHat tend to find these only by
accident.

There is a fuzzy line here as some interpreter bugs allow you to
exploit an application with #3, but it's not my specialty. At WhiteHat
we have been heavily researching #1 and #2 and actually have a wealth
of exploit data to share. The problem was bigger than we thought, so
we have continued to expand the scope and range of our testing in
these areas, which sometimes lets you stumble upon a #3 issue.

These are all simply my speculations...to be clear.

I do not feign expertise on this subject like I do when it comes to
motorcycles, mistresses, and martinis.

---
Arian Evans
