Date: Mon, 08 Dec 1997 20:48:21 -0700
From: Duane Wessels <wessels@nlanr.net>
To: John Fieber <jfieber@indiana.edu>
Cc: www@FreeBSD.ORG, kostas@nlanr.net
Subject: Re: URNs, Mirror sites, and Squid
Message-ID: <199712090348.UAA28336@surf>
In-Reply-To: Your message of Mon, 08 Dec 1997 20:04:07 -0500

John Fieber writes:

>On Mon, 8 Dec 1997, Duane Wessels wrote:
>
>> For more details, see http://squid.nlanr.net/Squid/urn-support.html,
>> or please ask us for clarification.
>
>Www.freebsd.org has the "FreeBSD" pages, which are mirrored
>around the world, but it also has quite a few pages that are not
>mirrored and consequently requests for those should not go to a
>mirror, personal home page (/~foobar/...) for example but there
>are some others.  Do you have a canned script that has a
>relatively simple framework for handling these sorts of
>exceptions?

I do now.  It makes the script a bit more complex, but not too much
I hope.  In fact, here's a script which I think will work for
the FreeBSD site.  I just copied your list of mirrors from your
home page.

Interestingly, your mirrors list brings out a small problem.
Some of the entries end with 'index.html' or 'freebsd.html'.  This
simple script assumes that you can do straight mappings and substring
replacements.  We don't want to map 'urn:www.freebsd.org:/foo' to
'http://www.xx.freebsd.org/index.html/foo'.  I guess we'll have to
see how much of a problem that really becomes.

Duane W.

==============================================================================
#!/usr/local/bin/perl

print "content-type: text/plain\r\n";
print "Expires: ", &http_time(time+3600), "\r\n";
print "\r\n";

if ($ENV{'REQUEST_METHOD'} eq "POST") {
    read(STDIN, $request, $ENV{'CONTENT_LENGTH'});
} elsif ($ENV{'REQUEST_METHOD'} eq "GET" ) {
    $request = $ENV{'QUERY_STRING'};
}
$request = &url_decode($request);

#
# special hack; turn 'urn:foo' into 'urn:foo:/'
# but this doesn't yet work with Squid (1.2.beta9), i.e. Squid
# won't call this script unless the second colon is present.
#
$request .= ':/' unless ($request =~ /([^:]+):([^:]+):/);

# $state: 0 = looking for a matching URN prefix,
#         1 = matched, skipping to the indented URL lines,
#         2 = printing the URL lines of the matched section
$state = 0;
while (<DATA>) {
    chomp;                      # remove the newline only
    s/#.*//;                    # strip comments
    next unless (/./);          # skip blank lines
    if ($state == 0) {
        next if (/^\s/);        # skip indented lines
        $URN = $_;
        $state = 1 unless (index($request, $URN, 0) < 0);
    }
    if ($state == 1) {
        next unless (/^\s/);    # skip non-indented lines
        $state = 2;
    }
    if ($state == 2) {
        last unless (/^\s/);    # exit on next non-indented line
        s/^\s+//;
        $URL = $_;
        print $URL . substr($request, length($URN)) . "\n";
    }
}
exit 0;

sub url_decode {
    local($_) = @_;
    tr/+/ /;
    s/%(..)/pack("C",hex($1))/ge;   # "C": unsigned byte, so %80-%FF decode cleanly
    $_;
}

sub http_time {
    local($t) = @_;
    local(@T) = gmtime($t);
    local(@WD) = ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
    local(@MO) = (
        'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec');
    # gmtime() returns the year as an offset from 1900, and HTTP dates
    # use a two-digit day of the month.
    sprintf "%s, %02d %s %d %02d:%02d:%02d GMT",
        $WD[$T[6]], $T[3], $MO[$T[4]], $T[5] + 1900,
        $T[2], $T[1], $T[0];
}

# The following lines are read as <DATA> above.  The data consists of any
# number of "sections."  Each "section" consists of two parts: first URN
# prefixes, then URL prefixes indented with whitespace.  e.g.:
#
#   URN1
#   URN2
#       URL1
#       URL2
#       URL3
#       URL4
#
# If one of the specified URN prefixes matches the requested URN,
# then, for every listed URL prefix, the URL prefix is substituted for
# the URN prefix in the request.  We only process one "section" for each
# URN request.  Thus, more-specific subsets of URN-space should be
# specified before less-specific ones.

__END__
#
# Twiddle directories are not mirrored
urn:www.freebsd.org:/~
urn:www.freebsd.org:/%7e
    http://www.freebsd.org/%7e
#
# mirrors for our Web/HTTP site.
#
urn:www.freebsd.org:/
    http://www.ar.freebsd.org/
    http://www.au.freebsd.org/FreeBSD/
    http://www2.au.freebsd.org/
    http://www3.au.freebsd.org/
    http://www.br.freebsd.org/www.freebsd.org/
    http://www2.br.freebsd.org/www.freebsd.org/
    http://www3.br.freebsd.org/
    http://www.br.freebsd.org/
    http://www2.br.freebsd.org/
    http://www.ca.freebsd.org/
    http://www.cz.freebsd.org/
    http://sunsite.auc.dk/www.freebsd.org/
    http://www.ee.freebsd.org/
    http://www.fi.freebsd.org/
    http://www.fr.freebsd.org/
    http://www.de.freebsd.org/
    http://www.de.freebsd.org/de/
    http://www.hu.freebsd.org/
    http://www.hu.freebsd.org/hu/
    http://www.is.freebsd.org/
    http://www.ie.freebsd.org/
    http://www.it.freebsd.org/
    http://www.jp.freebsd.org/www.freebsd.org/
    http://www.jp.freebsd.org/
    http://www.kr.freebsd.org/
    http://www.lv.freebsd.org/
    http://www.nl.freebsd.org/
    http://www.pl.freebsd.org/
    http://www.pt.freebsd.org/
    http://www2.pt.freebsd.org/
    http://www3.pt.freebsd.org/
    http://www.ru.freebsd.org/
    http://www2.ru.freebsd.org/
    http://www3.ru.freebsd.org/
    http://www.za.freebsd.org/
    http://www2.za.freebsd.org/
    http://www.se.freebsd.org/www.freebsd.org/
    http://www.tw.freebsd.org/
    http://www.ua.freebsd.org/
    http://www2.ua.freebsd.org/
    http://www.uk.freebsd.org/
    http://www.freebsd.org/
    http://www6.freebsd.org/
    http://www7.freebsd.org/
    http://www2.freebsd.org/
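
==============================================================================
The 'index.html' problem mentioned in the message could perhaps be
handled by trimming a trailing file name from each URL prefix before the
rest of the URN is appended.  What follows is only a rough sketch of that
idea, not something taken from the script above or from Squid.  The
helper names strip_index() and map_urn() are hypothetical, and the rule
"a last path component containing a dot is a file name" is an assumed
heuristic that would misfire on a directory prefix written without its
trailing slash.

```perl
use strict;
use warnings;

# Hypothetical helper: if the URL prefix ends in something that looks
# like a file name (last path component contains a dot), cut it back
# to the enclosing directory.  Prefixes ending in '/' are left alone.
sub strip_index {
    my ($url) = @_;
    # scheme://host/ , zero or more directories, then the last component
    if ($url =~ m{^(\w+://[^/]+/(?:[^/]+/)*)([^/]*)$}) {
        my ($dir, $file) = ($1, $2);
        return $dir if $file =~ /\./;
    }
    return $url;
}

# Hypothetical helper: substitute each URL prefix for the URN prefix,
# as the script above does, but normalise the URL prefix first.
# Returns an empty list when the URN prefix does not match.
sub map_urn {
    my ($request, $urn, @urls) = @_;
    return () unless index($request, $urn) == 0;
    my $rest = substr($request, length($urn));
    return map { strip_index($_) . $rest } @urls;
}

my @out = map_urn('urn:www.freebsd.org:/foo',
                  'urn:www.freebsd.org:/',
                  'http://www.xx.freebsd.org/index.html',
                  'http://www.au.freebsd.org/FreeBSD/');
print "$_\n" for @out;
# prints:
#   http://www.xx.freebsd.org/foo
#   http://www.au.freebsd.org/FreeBSD/foo
```

Normalising the prefixes when they are read keeps the mirror list itself
unchanged, so entries can still be copied straight from a mirrors page.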