Message-ID: <242a0a8f0705221027s51f05baepc8d63eab9a531140@mail.gmail.com>
Date: Tue, 22 May 2007 13:27:41 -0400
From: "Brian Eaton" <eaton.lists@...il.com>
To: "Arian J. Evans" <arian.evans@...chronic.com>
Cc: Full-Disclosure <full-disclosure@...ts.grok.org.uk>,
Web Security <websecurity@...appsec.org>
Subject: Re: [WEB SECURITY] Re: noise about full-width encoding bypass?
On 5/21/07, Arian J. Evans <arian.evans@...chronic.com> wrote:
<snip>
> I can theorize why some of the crazy things in the wild exist, but in the
> end they may be simple control-c/v artifacts.
>
> (As Napoleon said: "Never ascribe to malice what one can ascribe to
> incompetence.")
No doubt. =)
What surprises me is that codepage conversion libraries do not all
handle this data the same way. I've tested a few: some canonicalize
full-width Unicode characters to their ASCII equivalents, and others
do not. We run into trouble when the component doing input validation
canonicalizes one way and the component doing the actual work
canonicalizes another way. Figuring out exactly what the different
application platforms do would help show how much of a problem this
poses in the real world.
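To make the mismatch concrete, here is a minimal Python sketch. The
function names (naive_validator, backend_canonicalize) are hypothetical
and stand in for two components of some application. Using NFKC for the
backend step is an assumption chosen for illustration; it is not a claim
about what any particular platform does.

```python
import unicodedata


def naive_validator(s: str) -> bool:
    """Hypothetical filter: accept input only if it has no ASCII '<' or '>'."""
    return "<" not in s and ">" not in s


def backend_canonicalize(s: str) -> str:
    """Hypothetical downstream step: fold compatibility characters via NFKC."""
    return unicodedata.normalize("NFKC", s)


# U+FF1C and U+FF1E are the full-width less-than and greater-than signs.
payload = "\uff1cscript\uff1e"

accepted = naive_validator(payload)        # validator sees no ASCII brackets
delivered = backend_canonicalize(payload)  # backend folds them to ASCII

print("accepted by validator:", accepted)
print("seen by backend:", delivered)
```

The validator and the backend disagree about what the input "is", which
is the whole problem. Validating after canonicalization, using the same
routine the backend uses, would close this particular gap.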
Somebody ought to put together a test suite for this, just to see what
different vendors have done.
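A rough idea of what such a test suite could look like, sketched in
Python. The three "converters" below are only stand-ins from the Python
standard library (NFC, NFKC, and a lossy ASCII encode); a real survey
would plug in the vendor conversion routines under test. The names
SAMPLES, CONVERTERS, and survey are made up for this sketch.

```python
import unicodedata

# Full-width characters mapped to the ASCII character they resemble.
SAMPLES = {
    "\uff1c": "<",  # FULLWIDTH LESS-THAN SIGN
    "\uff1e": ">",  # FULLWIDTH GREATER-THAN SIGN
    "\uff07": "'",  # FULLWIDTH APOSTROPHE
    "\uff0f": "/",  # FULLWIDTH SOLIDUS
}

# Stand-ins for the conversion libraries being surveyed (assumption).
CONVERTERS = {
    "NFC": lambda s: unicodedata.normalize("NFC", s),
    "NFKC": lambda s: unicodedata.normalize("NFKC", s),
    "ascii-replace": lambda s: s.encode("ascii", "replace").decode("ascii"),
}


def survey() -> dict:
    """Report, per converter, whether every sample folds to its ASCII twin."""
    return {
        name: all(convert(wide) == narrow for wide, narrow in SAMPLES.items())
        for name, convert in CONVERTERS.items()
    }


for name, folds in survey().items():
    print(f"{name}: folds full-width to ASCII = {folds}")
```

Even among these three, the results differ, so a validator built on one
and a consumer built on another would not agree.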
(At first I was of the opinion that doing such conversions was a
dangerous misfeature, but it actually has some fairly important
applications. For example, doing full text indexing of character data
from different sources requires that you canonicalize first...)
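As a small illustration of the indexing case, here is a sketch of an
inverted index that canonicalizes tokens before storing them. The
choice of NFKC plus case folding is an assumption for the example, and
index_key and build_index are hypothetical names, not any real search
engine's API.

```python
import unicodedata
from collections import defaultdict


def index_key(token: str) -> str:
    """Canonicalize a token: fold compatibility forms, then fold case."""
    return unicodedata.normalize("NFKC", token).casefold()


def build_index(docs: dict) -> dict:
    """Map each canonical token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[index_key(token)].add(doc_id)
    return index


docs = {
    1: "Hello world",
    # Same word written with full-width letters (U+FF28, U+FF45, ...).
    2: "\uff28\uff45\uff4c\uff4c\uff4f again",
}

index = build_index(docs)
print(sorted(index["hello"]))
```

Without the canonicalization step, a search for "hello" would miss
document 2 even though a reader would consider the words identical.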
Regards,
Brian
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/