Date: Fri, 13 Mar 1998 15:50:31 -0500
From: "Gary Palmer" <gpalmer@FreeBSD.ORG>
To: Donald Burr <dburr@POBoxes.com>
Cc: FreeBSD Ports <freebsd-ports@FreeBSD.ORG>, FreeBSD Questions <freebsd-questions@FreeBSD.ORG>
Subject: Re: Squid: Proxying for fun and profit
Message-ID: <21582.889822231@gjp.erols.com>
In-Reply-To: Your message of "Fri, 13 Mar 1998 07:20:37 PST." <XFMail.980313072037.dburr@POBoxes.com>
Donald Burr wrote in message ID
<XFMail.980313072037.dburr@POBoxes.com>:
> -----BEGIN PGP SIGNED MESSAGE-----
> The catch, though, is that I don't want this automatic fetching to cross
> site boundaries.  For example, let's say I'm indexing
> http://www.freebsd.org, and I get along to a page mentioning a new device
> driver doohickey by Acme Computer (http://www.acme.com/).  I would like it
> to skip over www.acme.com --ie only index www.freebsd.org pages.
> Obviously, this is so that my index thing doesn't run wild and try and
> download the entire Web to my computer, which I don't want!  [I do
> have a lot of disk space, but not THAT much! -- like Steven Wright said,
> "You can't have everything -- where would you put it?"]
>
> Is there anything available (either in ports, or a Perl script that
> someone hacked up, etc.) that will do this?

Use a hacked up version of webcopy that doesn't write to disk.  You
can make webcopy use your proxy host, and it won't walk outside the
hostname or path that you start it on.  That'll preload the pages on
your proxy very nicely.

Gary
--
Gary Palmer                                          FreeBSD Core Team Member
FreeBSD: Turning PC's into workstations. See http://www.FreeBSD.ORG/ for info

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
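
For anyone who would rather see the idea than patch webcopy, here is a
minimal sketch in Python of the approach described above: fetch pages
through the proxy, follow links only when they stay on the starting
hostname and under the starting path, and throw the page bodies away so
that only the proxy keeps a copy.  This is not webcopy and does not
reproduce its options.  The names in_scope, make_proxy_fetcher and
preload are made up for this illustration, and the proxy address in the
usage note is a placeholder.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit
import urllib.request


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def in_scope(url, start_url):
    """True if url is on the same host and under the same directory as start_url."""
    u, s = urlsplit(url), urlsplit(start_url)
    # Directory part of the starting path, e.g. "/handbook/" for "/handbook/index.html".
    prefix = s.path.rsplit("/", 1)[0] + "/"
    return (
        u.scheme in ("http", "https")
        and u.netloc.lower() == s.netloc.lower()
        and (u.path or "/").startswith(prefix)
    )


def make_proxy_fetcher(proxy_url):
    """Return a fetch(url) function that sends HTTP requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    opener = urllib.request.build_opener(handler)

    def fetch(url):
        with opener.open(url, timeout=30) as resp:
            # latin-1 decodes any byte sequence; good enough for finding links.
            return resp.read().decode("latin-1")

    return fetch


def preload(start_url, fetch, max_pages=1000):
    """Breadth-first walk from start_url; nothing is written to disk.

    Returns the list of URLs fetched, in order.  max_pages is an assumed
    safety limit, not something from the original message.
    """
    seen = {start_url}
    queue = deque([start_url])
    fetched = []
    while queue and len(fetched) < max_pages:
        url = queue.popleft()
        try:
            body = fetch(url)
        except OSError:  # urllib's URLError and HTTPError are OSError subclasses
            continue
        fetched.append(url)
        parser = LinkParser()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if link not in seen and in_scope(link, start_url):
                seen.add(link)
                queue.append(link)
    return fetched
```

With a Squid instance listening on, say, proxy.example.net port 3128
(a hypothetical host), a run would look like
preload("http://www.freebsd.org/", make_proxy_fetcher("http://proxy.example.net:3128")).
Links to www.acme.com are never queued, so the walk cannot wander off
across the rest of the Web.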