Date: Thu, 13 Dec 2012 17:37:52 -0500
From: Jeffrey Walton <noloader@...il.com>
To: Philip Whitehouse <philip@...uk.com>
Cc: "full-disclosure@...ts.grok.org.uk" <full-disclosure@...ts.grok.org.uk>
Subject: Re: Google's robots.txt handling

On Thu, Dec 13, 2012 at 7:52 AM, Philip Whitehouse <philip@...uk.com> wrote:
> I restate my email's second point.
>
> Google is indexing robots.txt because (from all the examples I can see)
> robots.txt doesn't contain a line to disallow indexing of robots.txt.
>
> It is possible that some web sites provide actual content in a file that
> happens to be called robots.txt (e.g., a website concerned with AI
> development).
>
> Could Google do better by removing the file? Sure. But as webmasters haven't
> told them not to, even though they have provided other files not to index,
> Google is doing exactly what they were asked.
>
Webmasters in the US don't have to tell them. Under the Computer Fraud
and Abuse Act (CFAA), Google (et al.) must operate within the authority
granted by the webmasters. If the webmasters decide they don't want
their site crawled and Google (et al.) crawls it anyway, then Google
has exceeded its authority and broken US federal law. Just ask Weev.

This system needs a submission-based whitelist.
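
To make the whitelist idea concrete, here is a rough sketch in Python
of an opt-in check: a crawler fetches a URL only if the site's owner
has submitted the host beforehand. The names CrawlAllowlist, submit and
may_crawl are hypothetical and made up for illustration; they do not
describe how any real search engine works.

```python
from urllib.parse import urlsplit


class CrawlAllowlist:
    """Hypothetical opt-in registry: crawl only hosts that were submitted."""

    def __init__(self):
        self._hosts = set()

    def submit(self, host):
        # A site owner opts in by submitting a host name.
        self._hosts.add(host.strip().lower())

    def may_crawl(self, url):
        # urlsplit() lowercases the host; it is None when the URL has no host.
        host = urlsplit(url).hostname
        return host is not None and host in self._hosts


allowlist = CrawlAllowlist()
allowlist.submit("Example.com")

for url in ("http://example.com/robots.txt", "http://other.example/"):
    print(url, "crawl" if allowlist.may_crawl(url) else "skip")
```

The default here is "skip": a host that was never submitted is never
crawled, which is the reverse of the opt-out model that robots.txt
gives you.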

Jeff
