Date: Thu, 13 Dec 2012 12:52:27 +0000
From: Philip Whitehouse <>
To: Mario Vilas <>
Cc: "" <>
Subject: Re: Google's robots.txt handling

I restate my email's second point.

Google is indexing robots.txt because, in every example I can see, the file contains no line disallowing robots.txt itself.
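
To illustrate the point, here is a minimal sketch of a robots.txt that lists itself, checked with Python's standard urllib.robotparser. The file contents and the /private/ path are made-up examples, and is_allowed is a hypothetical helper. Note that a Disallow line governs crawling; whether a given search engine then also drops the file from its index is up to that engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that also disallows itself (illustrative paths only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /robots.txt
"""


def is_allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/robots.txt", "/private/notes.html", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, path))
```

Without the last Disallow line, the same check reports /robots.txt as fetchable, which matches the behaviour described above.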

It is possible that some web sites provide actual content in a file that happens to be called robots.txt (e.g. a website concerned with AI development).

Could Google do better by removing the file? Sure. But webmasters haven't told them not to index it, even though they have listed other files not to index, so Google is doing exactly what it was asked.

Maybe the R.E.S. should state that a valid robots.txt should not be indexed.

Incidentally Bing shows the same behaviour - in fact the Google file is the 4th hit even without any of the file type classifiers.

Philip Whitehouse

On 13 Dec 2012, at 11:40, Mario Vilas <> wrote:

> That paragraph says pretty much the exact opposite of what you understood.
> Also, could we please stop refuting points nobody even made in the first place? OP never claimed this to be a vulnerability, nor ever said robots.txt is a proper security mechanism to hide files in public web directories.
> All OP said was the way robots.txt is indexed allows for some Google dorks to be made, and it may be a good idea to avoid that. Clearly it's not the discovery of the century, but it seems fairly reasonable to me... I don't get what all this fuss is about.
> On Wed, Dec 12, 2012 at 12:18 PM, Christoph Gruber <> wrote:
>> On 12.12.2012 at 00:23 "Lehman, Jim" <> wrote:
>> > It is possible to use whitelisting for robots.txt. Allow what you want Google to index and deny everything else. That way Google doesn't make you a Google dork target and someone browsing to your robots.txt file doesn't glean any sensitive files or folders. But this will not stop directory brute-forcing to discover your publicly exposed sensitive data, which probably should not be exposed to the web in the first place.
>> Maybe I misunderstood something, but do you really think that "sensitive" data can be hidden in "secret" directories on publicly reachable web servers?
>> --
>> Christoph Gruber
>> By not reading this email you don't agree you're not in any way affiliated with any government, police, ANTI- Piracy Group, RIAA, MPAA, or any other related group, and that means that you CANNOT read this email.
>> By reading you are not agreeing to these terms and you are violating code 431.322.12 of the Internet Privacy Act signed by Bill Clinton in 1995.
>> (which doesn't exist)
>> _______________________________________________
>> Full-Disclosure - We believe in it.
>> Charter:
>> Hosted and sponsored by Secunia -
> -- 
> “There's a reason we separate military and the police: one fights the enemy of the state, the other serves and protects the people. When the military becomes both, then the enemies of the state tend to become the people.”
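
The whitelisting approach quoted above (allow what should be crawled, deny everything else) can be sketched as follows, again using Python's standard urllib.robotparser. The paths and the crawlable helper are hypothetical. One assumption worth flagging: Python's parser applies the first matching rule, so the Allow lines come before the catch-all Disallow; other crawlers may resolve overlapping rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical whitelist-style robots.txt: name what may be crawled,
# then deny everything else. No sensitive paths appear in the file.
WHITELIST_ROBOTS_TXT = """\
User-agent: *
Allow: /blog/
Allow: /about.html
Disallow: /
"""


def crawlable(robots_txt: str, paths: list[str], agent: str = "*") -> dict[str, bool]:
    """Map each path to whether `agent` may fetch it under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


if __name__ == "__main__":
    paths = ["/blog/post-1", "/about.html", "/admin/", "/robots.txt"]
    for path, ok in crawlable(WHITELIST_ROBOTS_TXT, paths).items():
        print(path, ok)
```

As the quoted message says, this only keeps the file from advertising paths; it does nothing against directory brute-forcing.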
