Date: Sat, 18 Jul 2009 19:34:24 -0400 (EDT)
From: vogelke+unix@pobox.com (Karl Vogel)
To: freebsd-questions@freebsd.org
Subject: Re: OT: wget bug
Message-ID: <20090718233424.45B48B7D9@kev.msw.wpafb.af.mil>
In-Reply-To: <20090718093237.Y19472@cloud.ccsf.cc.ca.us> (jjah@cloud.ccsf.cc.ca.us)
>> On Sat, 18 Jul 2009 09:41:00 -0700 (PDT),
>> "Joe R. Jah" <jjah@cloud.ccsf.cc.ca.us> said:

 J> Do you know of any workaround in wget, or an alternative tool to ONLY
 J> download newer files by http?

   "curl" can help with things like this.  For example, if you're getting
   just a few files, fetch only the headers and check the Last-Modified
   date:

     me% curl -I http://curl.haxx.se/docs/manual.html
     HTTP/1.1 200 OK
     Proxy-Connection: Keep-Alive
     Connection: Keep-Alive
     Date: Sat, 18 Jul 2009 23:24:24 GMT
     Server: Apache/2.2.3 (Debian) mod_python/3.2.10 Python/2.4.4
     Last-Modified: Mon, 20 Apr 2009 17:46:02 GMT
     ETag: "5d63c-b2c5-1a936a80"
     Accept-Ranges: bytes
     Content-Length: 45765
     Content-Type: text/html; charset=ISO-8859-1

   You can also download a file only if the remote copy is newer than a
   local one:

     me% curl -z local.html http://remote.server.com/remote.html

   Or only download the file if it was updated since Jan 12, 2009:

     me% curl -z "Jan 12 2009" http://remote.server.com/remote.html

   Curl tries to use persistent connections for transfers, so if you're
   looking to mirror a site, put as many URLs on the same command line
   as you can; there's an example below.  I don't know how to make curl
   walk a directory tree for a truly recursive download, but if you know
   the file names ahead of time, a short shell loop comes close (see the
   sketch below).
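   Here are two examples; the host and file names below just stand in for
   whatever you're actually mirroring.  Since -z is set once for the whole
   command line, it applies to every URL you list, and curl will reuse one
   connection for files on the same host:

     me% curl -z "Jan 12 2009" -O http://remote.server.com/one.html \
                               -O http://remote.server.com/two.html

   And here's a rough sketch of an only-if-newer mirror driven by a list
   of known file names.  It assumes "list.txt" holds one relative filename
   per line, and it won't discover new files on its own:

     #!/bin/sh
     # Re-fetch each file named in list.txt only when the server's copy
     # is newer than the one we already have.  remote.server.com and
     # list.txt are placeholders.
     base=http://remote.server.com

     while read -r f; do
         if [ -f "$f" ]; then
             # Local copy exists: fetch only if the remote is newer.
             curl -s -z "$f" -o "$f" "$base/$f"
         else
             # No local copy yet: fetch unconditionally.
             curl -s -o "$f" "$base/$f"
         fi
     done < list.txt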
   You can get the source at http://curl.haxx.se/download.html

-- 
Karl Vogel                      I don't speak for the USAF or my company

If lawyers are disbarred and clergymen defrocked, doesn't it follow that
electricians can be delighted, musicians denoted, cowboys deranged, models
deposed, tree surgeons debarked and dry cleaners depressed?