From owner-freebsd-questions@FreeBSD.ORG Sun Jul 19 06:17:16 2009
Return-Path:
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C4897106566C for ; Sun, 19 Jul 2009 06:17:16 +0000 (UTC) (envelope-from jjah@cloud.ccsf.cc.ca.us)
Received: from cloud.ccsf.cc.ca.us (cloud.ccsf.cc.ca.us [147.144.1.212]) by mx1.freebsd.org (Postfix) with ESMTP id AE7D88FC0A for ; Sun, 19 Jul 2009 06:17:16 +0000 (UTC) (envelope-from jjah@cloud.ccsf.cc.ca.us)
Received: from cloud.ccsf.cc.ca.us (localhost.ccsf.cc.ca.us [127.0.0.1]) by cloud.ccsf.cc.ca.us (8.14.2/8.14.2) with ESMTP id n6J6HHZI011521; Sat, 18 Jul 2009 23:17:17 -0700 (PDT) (envelope-from jjah@cloud.ccsf.cc.ca.us)
Received: from localhost (jjah@localhost) by cloud.ccsf.cc.ca.us (8.14.2/8.14.2/Submit) with ESMTP id n6J6HGwk011518; Sat, 18 Jul 2009 23:17:17 -0700 (PDT) (envelope-from jjah@cloud.ccsf.cc.ca.us)
Date: Sat, 18 Jul 2009 23:17:16 -0700 (PDT)
From: "Joe R. Jah"
To: Karl Vogel
In-Reply-To: <20090718233424.45B48B7D9@kev.msw.wpafb.af.mil>
Message-ID: <20090718231230.S10250@cloud.ccsf.cc.ca.us>
References: <20090718233424.45B48B7D9@kev.msw.wpafb.af.mil>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: freebsd-questions@freebsd.org
Subject: Re: OT: wget bug
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions
X-List-Received-Date: Sun, 19 Jul 2009 06:17:17 -0000

On Sat, 18 Jul 2009, Karl Vogel wrote:

> Date: Sat, 18 Jul 2009 19:34:24 -0400 (EDT)
> From: Karl Vogel
> To: freebsd-questions@freebsd.org
> Subject: Re: OT: wget bug
>
> >> On Sat, 18 Jul 2009 09:41:00 -0700 (PDT),
> >> "Joe R. Jah" said:
>
> J> Do you know of any workaround in wget, or an alternative tool to ONLY
> J> download newer files by http?
>
> "curl" can help for things like this.
> For example, if you're getting just a few files, fetch only the header
> and check the last-modified date:
>
>    me% curl -I http://curl.haxx.se/docs/manual.html
>    HTTP/1.1 200 OK
>    Proxy-Connection: Keep-Alive
>    Connection: Keep-Alive
>    Date: Sat, 18 Jul 2009 23:24:24 GMT
>    Server: Apache/2.2.3 (Debian) mod_python/3.2.10 Python/2.4.4
>    Last-Modified: Mon, 20 Apr 2009 17:46:02 GMT
>    ETag: "5d63c-b2c5-1a936a80"
>    Accept-Ranges: bytes
>    Content-Length: 45765
>    Content-Type: text/html; charset=ISO-8859-1
>
> You can download files only if the remote one is newer than a local copy:
>
>    me% curl -z local.html http://remote.server.com/remote.html
>
> Or only download the file if it was updated since Jan 12, 2009:
>
>    me% curl -z "Jan 12 2009" http://remote.server.com/remote.html
>
> Curl tries to use persistent connections for transfers, so put as many
> URLs on the same line as you can if you're looking to mirror a site.  I
> don't know how to make curl do something like walking a directory for a
> recursive download.
>
> You can get the source at http://curl.haxx.se/download.html

Thank you, Karl.  I already have curl installed, but I don't believe it
can get an entire website by giving it the base URL.

Regards,

Joe

--
 _/ _/_/_/ _/ ____________ __o
 _/ _/ _/ _/ ______________ _-\<,_
 _/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jjah@cloud.ccsf.cc.ca.us
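[Editorial note: the `-z` usage quoted above can be folded into a small wrapper
for updating a known list of files.  This is only a sketch, not anything from
the original thread: the `fetch_if_newer` function name, the `urls.txt` list in
the usage line, and the `CURL` override are all hypothetical, while the curl
flags themselves (`-z`, `-R`, `-s`, `-o`) are standard curl options.]

```shell
#!/bin/sh
# Sketch: re-download a URL only when the server's copy is newer than
# the local file.  curl's -z <file> sends an If-Modified-Since header
# based on the local file's mtime, and -R stamps the saved file with
# the remote mtime so the comparison keeps working on later runs.
# CURL is overridable (e.g. CURL=echo for a dry run); hypothetical names.
CURL=${CURL:-curl}

fetch_if_newer() {
  url=$1
  file=$(basename "$url")          # save under the URL's last path component
  if [ -e "$file" ]; then
    $CURL -s -R -o "$file" -z "$file" "$url"   # conditional fetch
  else
    $CURL -s -R -o "$file" "$url"              # no local copy yet: plain fetch
  fi
}
```

Used in a loop over a one-URL-per-line list, e.g.
`while read -r u; do fetch_if_newer "$u"; done < urls.txt`, this gives
only-if-newer updates for a fixed file set, though it still does not walk a
site recursively the way wget's `-r` mode does.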