From: vogelke+unix@pobox.com (Karl Vogel)
To: freebsd-questions@freebsd.org
Subject: Re: OT: wget bug
Date: Sat, 18 Jul 2009 19:34:24 -0400 (EDT)
Message-Id: <20090718233424.45B48B7D9@kev.msw.wpafb.af.mil>
In-reply-to: <20090718093237.Y19472@cloud.ccsf.cc.ca.us> (jjah@cloud.ccsf.cc.ca.us)
Organization: Oasis Systems Inc.
Reply-To: vogelke+unix@pobox.com
X-Disclaimer: I don't speak for the USAF or Oasis.

>> On Sat, 18 Jul 2009 09:41:00 -0700 (PDT),
>> "Joe R. Jah" said:

J> Do you know of any workaround in wget, or an alternative tool to ONLY
J> download newer files by http?

"curl" can help for things like this.
For example, if you're getting just a few files, fetch only the headers
and check the Last-Modified date:

   me% curl -I http://curl.haxx.se/docs/manual.html
   HTTP/1.1 200 OK
   Proxy-Connection: Keep-Alive
   Connection: Keep-Alive
   Date: Sat, 18 Jul 2009 23:24:24 GMT
   Server: Apache/2.2.3 (Debian) mod_python/3.2.10 Python/2.4.4
   Last-Modified: Mon, 20 Apr 2009 17:46:02 GMT
   ETag: "5d63c-b2c5-1a936a80"
   Accept-Ranges: bytes
   Content-Length: 45765
   Content-Type: text/html; charset=ISO-8859-1

You can download a file only if the remote copy is newer than a local copy:

   me% curl -z local.html http://remote.server.com/remote.html

Or only download the file if it was updated since Jan 12, 2009:

   me% curl -z "Jan 12 2009" http://remote.server.com/remote.html

Curl tries to use persistent connections for transfers, so put as many
URLs on the same command line as you can if you're looking to mirror a
site.  I don't know how to make curl do something like walking a
directory for a recursive download.

You can get the source at http://curl.haxx.se/download.html

-- 
Karl Vogel                      I don't speak for the USAF or my company

If lawyers are disbarred and clergymen defrocked, doesn't it follow
that electricians can be delighted, musicians denoted, cowboys deranged,
models deposed, tree surgeons debarked and dry cleaners depressed?
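
Since curl will not walk a directory tree by itself, one way to apply the
"-z" idea to several files is a small wrapper fed with a list of URLs.
The following is only a sketch under a few assumptions: the helper names
fetch_if_newer and fetch_list are made up for illustration (they are not
part of curl), every URL is assumed to end in a file name, and the server
is assumed to honor If-Modified-Since.

```shell
#!/bin/sh
# Sketch: download URL into FILE only when the remote copy is newer.
# fetch_if_newer is a hypothetical helper name, not something curl provides.
fetch_if_newer() {
    url=$1
    file=$2
    if [ -e "$file" ]; then
        # -z FILE makes the request conditional on FILE's modification time.
        # -R stamps the saved file with the remote Last-Modified time,
        # -f makes HTTP errors fail instead of saving an error page.
        curl -f -sS -R -z "$file" -o "$file" "$url"
    else
        # No local copy yet, so do a plain download.
        curl -f -sS -R -o "$file" "$url"
    fi
}

# Read one URL per line on stdin and save each one in the current
# directory under its last path component (assumes the URL ends in a name).
fetch_list() {
    while read -r url; do
        [ -n "$url" ] || continue      # skip blank lines
        fetch_if_newer "$url" "${url##*/}"
    done
}

# Example use, with a hypothetical file urls.txt holding one URL per line:
#   fetch_list < urls.txt
```

This starts a separate curl process for every URL, so it gives up the
persistent-connection advantage mentioned above in exchange for giving
each file its own time condition.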