Date: Sat, 18 Jul 2009 09:41:00 -0700 (PDT)
From: "Joe R. Jah"
To: Andrew Brampton
Cc: freebsd-questions@freebsd.org
Subject: Re: OT: wget bug
Message-ID: <20090718093237.Y19472@cloud.ccsf.cc.ca.us>
References: <20090717144049.C35992@cloud.ccsf.cc.ca.us>

On Sat, 18 Jul 2009, Andrew Brampton wrote:

> Date: Sat, 18 Jul 2009 12:52:07 +0100
> From: Andrew Brampton
> To: Joe R. Jah
> Cc: freebsd-questions@freebsd.org
> Subject: Re: OT: wget bug
>
> 2009/7/17 Joe R. Jah:
> >
> > Hello all,
> >
> > I want to wget a site at regular intervals and only get the updated pages,
> > so I use this wget command line:
> >
> > wget -b -m -nH http://host.domain/Directory/file.html
> >
> > It works fine on the first try, but it fails on subsequent tries with the
> > following error message:
> >
> > --8<--
> > Connecting to host.domain ... connected.
> > HTTP request sent, awaiting response... 401 Unauthorized
> > Authorization failed.
> > --8<--
>
> This to me seems like the remote server is replying with 401. Perhaps
> wget is sending the If-Modified-Since HTTP header, and the remote
> server does not support this. I would confirm this by running tcpdump
> (or wireshark) to sniff the traffic and see what the remote server is
> replying with.
>
> If the remote server is truly returning 401, then you might either
> need to use an alternative tool, or configure wget differently.
>
> Hope this helps
> Andrew

Thank you, Andrew. Yes, the server is truly returning 401. I have already
reconfigured wget to download everything regardless of timestamp, but
that wastes bandwidth, because most of the site is unchanged.

Do you know of any workaround in wget, or an alternative tool to ONLY
download newer files by http?

Regards,

Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah jjah@cloud.ccsf.cc.ca.us