full-disclosure - Re: Google's robots.txt handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJN6cdf5EFOrGbkWC42jfDNbqwSyyzZ9DK6tSWj_UDs6FePLnA@mail.gmail.com>
Date: Thu, 13 Dec 2012 14:18:03 +1100
From: Patrick Webster <patrick@...hack.com>
To: full-disclosure@...ts.grok.org.uk
Subject: Re: Google's robots.txt handling

I wouldn't consider this an issue. If Google didn't do this, someone
else would have (e.g. my rather old http://www.aushack.com/robanukah/
does it but I never bothered to index the web at large). I believe it
was suggested to Shodan and others, so it was only a matter of time.

If anything, Google is raising awareness by including it in their
results (which I noticed cropped up about 6 months (?) ago).

It is also worth noting that some organisations (and some security
appliances) use it for bait. E.g. robots.txt = Disallow: /database.bak
and as soon as a request is seen the IP is blacklisted permanently,
because their behaviour either means that a spider is disobeying
robots, or more than likely it is a human poking around where they
shouldn't be.

Should Google index it? Probably not - but then you're back to point
#1, if they didn't someone else would have - and Google does a better
job at it, so by all means...

Interestingly, Google indexes their own sites
https://www.google.com/search?q=inurl:robots.txt+filetype%3Atxt+site%3Agoogle.com.
At least they're not playing double standards.

My only questions is *why* did they suddenly decide to include this?
I'd hazard a guess that they released new & improved indexing code,
and this was a by-product of their improvement (perhaps related to the
TXT file-type?).

-Patrick

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/