Message-ID: <20170926201112.GA26968@whir>
Date: Tue, 26 Sep 2017 20:11:12 +0000
From: Eric Wong <e@...24.org>
To: Marc Herbert <Marc.Herbert@...el.com>
Cc: Junio C Hamano <gitster@...ox.com>,
Andy Lowry <andy.work@...owry.com>, Jeff King <peff@...f.net>,
git <git@...r.kernel.org>,
Christian Kujau <lists@...dbynature.de>, josh@...htriplett.org,
michael.w.mason@...el.com, linux-kernel@...r.kernel.org
Subject: Re: BUG in git diff-index

Marc Herbert <Marc.Herbert@...el.com> wrote:
> PS: I used NNTP and http://dir.gmane.org/gmane.comp.version-control.git
> to quickly find this old thread (what could we do without NNTP?). Then
> I googled for a web archive of this thread and Google could only find
> this one: http://git.661346.n2.nabble.com/BUG-in-git-diff-index-tt7652105.html#none
> Is there a robots.txt to block indexing on
> https://public-inbox.org/git/1459432667.2124.2.camel@dwim.me ?

There are no blocks on public-inbox.org, and I'm completely against
any sort of blocking/throttling. Maybe there are too many pages
to index? Or the Message-IDs in the URLs are too ugly/scary?
Not sure what to do about that...

Anyway, I just put up a robots.txt with Crawl-Delay: 1, since I
seem to recall crawlers use a more conservative delay by default:
==> https://public-inbox.org/robots.txt <==
User-Agent: *
Crawl-Delay: 1
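
For what it's worth, the effect of that file can be sanity-checked
locally with Python's stdlib robots.txt parser, which understands
Crawl-Delay (a small sketch; "ExampleBot" is just a made-up user
agent, not anything from this thread):

```python
from urllib import robotparser

# Parse the same two lines served at https://public-inbox.org/robots.txt
rp = robotparser.RobotFileParser()
rp.parse([
    "User-Agent: *",
    "Crawl-Delay: 1",
])

# The "*" entry applies to any crawler, so a hypothetical bot
# sees a 1-second crawl delay and no Disallow rules at all.
delay = rp.crawl_delay("ExampleBot")
allowed = rp.can_fetch("ExampleBot", "https://public-inbox.org/git/")
print(delay, allowed)
```

A polite crawler would sleep `delay` seconds between requests; since
the file has no Disallow lines, `can_fetch()` stays true for every URL.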

I don't know much about SEO other than keeping a site up and
responsive, so perhaps there's more to be done to get things
indexed...