Date:      Sun, 4 Mar 2001 23:34:50 +0100
From:      Wolfram Schneider <wosch@panke.de.freebsd.org>
To:        Nik Clayton <nik@FreeBSD.ORG>
Cc:        Stefan `Sec` Zehl <sec@42.org>, doc@FreeBSD.ORG
Subject:   Re: cvs commit: www/en Makefile
Message-ID:  <20010304233450.D1647@paula.panke.de.freebsd.org>
In-Reply-To: <20010304105618.A300@canyon.nothing-going-on.org>; from nik@FreeBSD.ORG on Sun, Mar 04, 2001 at 10:56:18AM +0000
References:  <200102241031.f1OAVTZ82598@freefall.freebsd.org> <20010225064044.A68105@canyon.nothing-going-on.org> <20010227122027.A2079@paula.panke.de.freebsd.org> <20010227121401.A2631@canyon.nothing-going-on.org> <20010228224508.A2745@paula.panke.de.freebsd.org> <20010228233653.A1692@canyon.nothing-going-on.org> <20010303173639.B25057@matrix.42.org> <20010304105618.A300@canyon.nothing-going-on.org>

On 2001-03-04 10:56:18 +0000, Nik Clayton wrote:
> On Sat, Mar 03, 2001 at 05:36:39PM +0100, Stefan `Sec` Zehl wrote:
> > Symlinks _are_ evil. The alternate paths will (eventually) get linked
> > somewhere. This will induce more load by the Webspiders which find
> > everything twice. These alternate locations will pollute the caches, too.
> > The pages will show up in duplicate.
> 
> robots.txt solves all these problems.

robots.txt works for the big search engines (most of them, anyway).
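
For reference, here is a minimal sketch of what a well-behaved robot does
with robots.txt, using Python's standard urllib.robotparser. The /alt/
path is a hypothetical stand-in for a symlinked duplicate location and
"ExampleBot" is a made-up robot name; neither comes from the real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /alt/ stands in for a symlinked duplicate path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /alt/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("ExampleBot", "https://www.freebsd.org/alt/index.html"))
    print(allowed("ExampleBot", "https://www.freebsd.org/index.html"))
```

This only helps if the robot actually performs the check before fetching.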

There are many broken robots in the world which ignore the
robots exclusion standard. In freefall's httpd.conf you will find
a long list of broken robot implementations, listed by robot name,
which are blocked from accessing /cgi.
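
The idea behind that list can be sketched roughly as follows. This is an
illustration in Python, not the actual httpd.conf rules: the robot names
"BadBot" and "GreedySpider" are invented, and substring matching on the
User-Agent header is an assumption about how such a list might be applied.

```python
# Hypothetical robot names; the real list lives in freefall's httpd.conf.
BLOCKED_AGENTS = ("BadBot", "GreedySpider")

# Only requests under this prefix are subject to the name-based block.
PROTECTED_PREFIX = "/cgi"


def is_blocked(user_agent: str, path: str) -> bool:
    """Return True if a request for path from user_agent should be refused."""
    if not path.startswith(PROTECTED_PREFIX):
        return False
    agent = user_agent.lower()
    # Case-insensitive substring match against each blocked robot name.
    return any(name.lower() in agent for name in BLOCKED_AGENTS)


if __name__ == "__main__":
    print(is_blocked("BadBot/1.0", "/cgi/man.cgi"))
    print(is_blocked("BadBot/1.0", "/index.html"))
    print(is_blocked("Mozilla/4.0", "/cgi/man.cgi"))
```

A name-based check like this cannot catch a robot that sends a generic
browser User-Agent.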

There is also a list of sites which are completely banned. The
robots from these sites use a standard User-Agent name, 'Netscape'.

I guess that robot traffic on www.freebsd.org is up to 30% of
the total traffic.


> > And last of all, you can't tell it's
> > a symlink which means this breaks down when mirroring via wget/webcopy.
> 
> You shouldn't be mirroring like that, you should be pulling down the
> www/ and doc/ repositories, and building the site locally.

Tell that to user Joe. User Joe uses a Windows or Linux box to
mirror documents from www.freebsd.org. We cannot stop user Joe
from doing this.

-Wolfram

-- 
Wolfram Schneider <wosch@freebsd.org> http://wolfram.schneider.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-doc" in the body of the message



