Date:      Mon, 5 Mar 2001 18:46:34 +0000
From:      Nik Clayton <nik@freebsd.org>
To:        Wolfram Schneider <wosch@panke.de.freebsd.org>
Cc:        Nik Clayton <nik@FreeBSD.ORG>, Jun Kuriyama <kuriyama@imgsrc.co.jp>, doc@FreeBSD.ORG
Subject:   Re: cvs commit: www/en Makefile
Message-ID:  <20010305184634.A8128@canyon.nothing-going-on.org>
In-Reply-To: <20010304232247.C1647@paula.panke.de.freebsd.org>; from wosch@panke.de.freebsd.org on Sun, Mar 04, 2001 at 11:22:48PM +0100
References:  <200102241031.f1OAVTZ82598@freefall.freebsd.org> <20010225064044.A68105@canyon.nothing-going-on.org> <20010227122027.A2079@paula.panke.de.freebsd.org> <7mwva9y48r.wl@waterblue.imgsrc.co.jp> <20010301145623.A3225@canyon.nothing-going-on.org> <20010304232247.C1647@paula.panke.de.freebsd.org>

Wolfram,

I've rolled all your replies up into one message to make the threading
easier.

On Sun, Mar 04, 2001 at 11:22:48PM +0100, Wolfram Schneider wrote:
> On 2001-03-01 14:56:23 +0000, Nik Clayton wrote:
> > I'd like to
> >
> >   1.  Agree that all documentation installs in the website under a
> >       single point.  Currently, I prefer docs/ (or doc/).
>
> currently the single point is / and for the translated contents /<lang>/

No it's not.  If it was, we wouldn't have the tutorials/ directory.

> if you change this for the english pages to
> /docs/en_US.ISO_8859-1/books/handbook/
>
>  - where will be the location of the japanese handbook?
>
> 1. /docs/ja_JP.eucJP/books/handbook
> 	or
> 2. /ja/docs/ja_JP.eucJP/books/handbook
>
> IMHO both are confusing. with 1) we would have *two* /<lang>
> subdirectories on the homepage, /<lang>/ and /docs/<lang>/
> A link from /ja/docproj/who.html would point outside the /ja/
> prefix to /docs/ja_JP.eucJP/books/handbook/

Personally?  I would prefer everything under docs/, mirroring the layout
of the doc/ repo (actually, we could just call the directory doc/
instead of docs/).

Practically, I suspect ja/ will need to be grandfathered into any new
scheme as well.

I also think that in the translated websites, the default links should
point to the local language version.  So if you're looking at the
Japanese web site, /handbook should take you to the Japanese Handbook.

I have no idea whether this is a popular point of view or not.

> >   2.  Use symlinks to grandfather in existing shortcuts (/FAQ,
> >       /handbook, and a handful of others) so that existing URLs work.
>
> No.

    mkdir handbook
    cd handbook
    # Symlink each entry of the handbook directory, not the directory itself
    for i in ../doc/en_US.ISO_8859-1/books/handbook/* ; do
        ln -s "$i" "$(basename "$i")"
    done

so then we symlink to files rather than the directory (which you say is
OK a little later on).

[ That's not quite correct, because some documents will have
subdirectories that we'll need to recurse into, but it's close, and you
get the idea ]
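
A rough sketch of the recursive version, in case it helps the discussion.
The function name link_tree is made up for illustration, and it creates
absolute symlinks rather than the relative ones in the loop above, so
treat it as one possible approach rather than what the www/ Makefiles
would actually do:

```shell
#!/bin/sh
# link_tree SRC DST  (hypothetical helper, not part of the www/ build)
# Recreates SRC's directory structure under DST as real directories and
# symlinks every regular file back to its original, so that no symlink
# ever points at a directory.
link_tree() {
    # Resolve SRC to an absolute path so the links work at any depth.
    src=$(CDPATH= cd -- "$1" && pwd) || return 1
    dst=$2
    mkdir -p "$dst" || return 1

    # Create real directories first ...
    (cd "$src" && find . -type d) | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done

    # ... then one symlink per regular file.
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        ln -sf "$src/$f" "$dst/$f"
    done
}
```

It would be called as

    link_tree doc/en_US.ISO_8859-1/books/handbook handbook

and should leave handbook/ containing only directories and symlinks to
files.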

> > Much, much, much longer term I'd like to consider moving the documentation
> > off on to its own subdomain, doc.freebsd.org or similar.  That's a
> > sufficiently big project that I don't want to go anywhere near it at the
> > moment, as we'd just get bogged down.
>
> Why do you want move the documentation off the main web site?

Fantasies about docs.sun.com.  I suggest that any sort of change in
direction to go to that sort of model is best done on a completely fresh
web site, with no historical baggage.

> Who would use the FreeBSD web site if there is no real contents???

People looking for ports
                   PRs
                   announcements
                   mailing list info
                   a search engine
                   cvsweb
                   vendor lists
                   project information
                   ...

> There is already a docs.freebsd.org site for rare used documentation
> and historically documents (info pages, 44BSD docs, mailing lists).

Is there?

Oh, so there is.  Where is the CVS repo for this then?

I've just tried this:

    http://www.freebsd.org/info

goes to docs.freebsd.org/info

    http://www.uk.freebsd.org/info

gives "Error 404".  As does every other mirror I tried.

This is *exactly* the sort of situation I want to avoid.  We may as well
not have mirrors if we make it so difficult for them to mirror the
content properly.

> > We should also periodically monitor the error logs, as people learn and
> > bookmark the new URLs.  Suppose that, right now, .../FAQ/ gets 10,000
> > hits a month (I've got no idea what the true figure is).  Eventually
> > that'll drop, as the new URLs become commonplace.  We could agree that
> > when the figure drops to something like 50 hits a month (which could
> > take a year or more) we replace .../FAQ/ with a message that says "This
> > content has moved to...".
>
> From my experience, people do not fix bookmarks.
> I updated some weeks ago the apache web server on freefall. I did
> a little experiment and removed all old redirects to test
> how many dead links are in use and if users care about broken links.
>
> There are many dead links in use and some users complaints.
> So I added most redirects back. Some of the broken links are
> dead since 4 or 5 years ...

How many complaints?
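
On the monitoring side, here is a rough sketch of how the monthly hit
count for an old URL could be pulled out of the access log.  It assumes
the log is in Apache's Common Log Format (timestamp in field 4, request
path in field 7); the function name hits_per_month is made up for the
example, and I haven't checked how freefall's logs are actually set up:

```shell
#!/bin/sh
# hits_per_month PREFIX LOGFILE  (hypothetical helper)
# Counts requests whose path starts with PREFIX, grouped by month.
# Assumes Common Log Format, e.g.
#   host - - [05/Mar/2001:18:46:34 +0000] "GET /FAQ/ HTTP/1.0" 200 1234
hits_per_month() {
    awk -v prefix="$1" '
        index($7, prefix) == 1 {
            # $4 looks like "[05/Mar/2001:18:46:34"
            split($4, t, "[/:]")        # t[2] = month name, t[3] = year
            hits[t[3] " " t[2]]++
        }
        END { for (m in hits) print m, hits[m] }
    ' "$2" | sort   # stable output; alphabetical, not chronological
}
```

Something like

    hits_per_month /FAQ/ access_log

would then print one "year month count" line per month, which is the
figure we'd watch drop towards the threshold.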

> On 2001-03-01 14:56:23 +0000, Nik Clayton wrote:
> > On Thu, Mar 01, 2001 at 11:04:52PM +0900, Jun Kuriyama wrote:
> > > At Tue, 27 Feb 2001 12:20:27 +0100,
> > > Wolfram Schneider wrote:
> > > > This is confusing and not acceptable. A page which can
> > > > be read on a web server will be read (Murphy's Law). This will
> > > > increase the robots load by several ten-thousand page views per day!
> > > > In general, never use symlinks to directories on a web server.
> > >
> > > I support Wolfram about this.  We should avoid symlinks as much as we
> > > can.  This breaks search engine's result by returning same contents
> > > with multiple URLs.
> >
> > Way, way, way too late:
> >
> >     http://www.freebsd.org/news/
> >     http://www.freebsd.org/news/index.html
> >     http://www.freebsd.org/news/news.html
> >
> > and other examples (commercial/, copyright/, docproj/, gallery/,
> > internal/, projects/, search/, security/).
>
> These are symlinks to *FILES*. This is harmless. If we have 20
> symlinks to files, and 20 robots per day, this would be
> an additional 400 HTTP hits.
>
> Search engines know how to deal with directory listings
> (/news/ <-> /news.index.html) This is a common case.

This isn't the search engine doing this, it's the webserver sending back
a redirect.

> A symlink to a directory is in practice a recursive copy of
> the directory and duplicates the contents. One symlink may
> add several hundreds or thousands new files to the server!

I'm (seriously) curious about how that could occur.  Could you
elaborate?

> On 2001-02-28 23:36:53 +0000, Nik Clayton wrote:
> > Let's continue this on -doc.  It would be great if you could start by
> > outlining what your plans are for the website for the next 6-12 months,
> > what you think its shortcomings are, and how you are planning on
> > getting them fixed.
>
> my todo list for the next 3 months:
>
> 1. Security
>
>   move www.freebsd.org to a new machine with better security
>   (jail, firewall, no user cgi scripts)

Absolutely.

> 2. Search
>
>   find a replacement for freewais. While freewais is doing
>   95% of the job, the missing 4% are sometimes annoying.
>   I prefer the Altavista Search Developer Kit, but I don't know
>   if we can get a free license from Altavista/Compaq. A commercial
>   license is too expensive.

What's the missing 4%?  I know we get complaints about certain features,
but what do you want to see fixed?  If we can get a (simple) specification
put together, that might be enough to let someone else do the work.

> 3. internal documentation
>
>   write a FreeBSD webmaster FAQ
> 	- FreeBSD web server architecture
> 	- the running services (cgi scripts, databases)
> 	- administration and maintenance scripts, configuration files etc.
> 	- ...

Please.

> there are some minor tasks, e.g. fixing the slow portindex perl script
> or replace commercial.raw with commercial.xml

Yep.

> On 2001-03-04 10:56:18 +0000, Nik Clayton wrote:
> > On Sat, Mar 03, 2001 at 05:36:39PM +0100, Stefan `Sec` Zehl wrote:
> > > Symlinks _are_ evil. The alternate paths will (eventually) get linked
> > > somewhere. This will induce more load by the Webspiders which find
> > > everything twice. These alternate locations will pollute the caches, too.
> > > The pages will show up in duplicate.
> >
> > robots.txt solves all these problems.
>
> robots.txt works for the big search engines (most of them ...)
>
> There are so many broken robots in the world which ignore
> the robots exclusion standard. In freefall's httpd.conf you will find
> a long list of broken robots implementations by robot name
> which are blocked to access /cgi

Our mirrors aren't benefiting from this list.

> There is also a list of sites which is completely banned. These
> robots use a standard UserAgent name 'Netscape'.
>
> I guess the robots traffic on www.freebsd.org is up to 30% of
> the total traffic.

What is the total traffic?

> > > And last of all, you can't tell its
> > > a symlink which means this breaks down when mirroring via wget/webcopy.
> >
> > You shouldn't be mirroring like that, you should be pulling down the
> > www/ and doc/ repositories, and building the site locally.
>
> Tell this user Joe. User Joe use a Windows or Linux box to
> mirror documents from www.freebsd.org. We cannot stop user Joe
> from doing this.

Shit happens.  How many people do we think are doing this, based on the
logs?
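
If nobody has the number to hand, something along these lines might give
a first estimate.  It assumes the access log is in Apache's combined
format (User-Agent as the last quoted field) and that the mirroring tools
identify themselves honestly; the function name mirror_clients is just
for illustration:

```shell
#!/bin/sh
# mirror_clients PATTERN LOGFILE  (hypothetical helper)
# Lists client hosts whose User-Agent matches PATTERN (a lowercase
# extended regex), with a request count per host, busiest first.
# Assumes combined log format, e.g.
#   host - - [date] "GET / HTTP/1.0" 200 1234 "referer" "Wget/1.6"
mirror_clients() {
    awk -F'"' -v pat="$1" '
        # Splitting on double quotes: $1 = host/date, $2 = request,
        # $4 = referer, $6 = User-Agent.
        tolower($6) ~ pat {
            split($1, a, " ")           # a[1] = client host
            hosts[a[1]]++
        }
        END { for (h in hosts) print hosts[h], h }
    ' "$2" | sort -rn
}
```

Run as

    mirror_clients 'wget|webcopy' access_log

the number of output lines is roughly the number of distinct hosts
mirroring that way.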

> On 2001-03-01 23:04:52 +0900, Jun Kuriyama wrote:
> > One point we should consider is we have much documents than good old
> > days ((c) Wolfram :-)).  But I think root namespace (such as /FAQ,
>
> More documents does not necessarily mean that it is harder to
> maintain them - it will just take longer to compile the
> documents ;-)
>
> There is no reason that we cannot make
>
> 	$ cvs co handbook; cd handbook; make all
>
> work again. I guess it is less than 1 day work to implement
> this feature again.

That's like expecting to check out a src/bin/cp and build it without a
fully populated /usr/include or /usr/lib.  Not going to happen.
>
> I'm sure this would improve the quality of the commits because
> it is easier to test a simple change (and no excuse anymore to
> not test the patch).

    make lint

N
-- 
FreeBSD: The Power to Serve             http://www.freebsd.org/
FreeBSD Documentation Project           http://www.freebsd.org/docproj/

          --- 15B8 3FFC DDB4 34B0 AA5F  94B7 93A8 0764 2C37 E375 ---
