Date:      Sat, 18 Jul 2009 09:41:00 -0700 (PDT)
From:      "Joe R. Jah" <jjah@cloud.ccsf.cc.ca.us>
To:        Andrew Brampton <brampton+freebsd@gmail.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: OT: wget bug
Message-ID:  <20090718093237.Y19472@cloud.ccsf.cc.ca.us>
In-Reply-To: <d41814900907180452p29244911nd2570909e7274791@mail.gmail.com>
References:  <20090717144049.C35992@cloud.ccsf.cc.ca.us> <d41814900907180452p29244911nd2570909e7274791@mail.gmail.com>

On Sat, 18 Jul 2009, Andrew Brampton wrote:

> Date: Sat, 18 Jul 2009 12:52:07 +0100
> From: Andrew Brampton <brampton+freebsd@gmail.com>
> To: Joe R. Jah <jjah@cloud.ccsf.cc.ca.us>
> Cc: freebsd-questions@freebsd.org
> Subject: Re: OT: wget bug
>
> 2009/7/17 Joe R. Jah <jjah@cloud.ccsf.cc.ca.us>:
> >
> > Hello all,
> >
> > I want to wget a site at regular intervals and only get the updated pages,
> > so I use this wget command line:
> >
> > wget -b -m -nH http://host.domain/Directory/file.html
> >
> > It works fine on the first try, but it fails on subsequent tries with the
> > following error message:
> >
> > --8<--
> > Connecting to host.domain ... connected.
> > HTTP request sent, awaiting response... 401 Unauthorized
> > Authorization failed.
> > --8<--
>
> This to me seems like the remote server is replying with 401. Perhaps
> wget is sending the If-Modified-Since HTTP header, and the remote
> server does not support this. I would confirm this by running tcpdump
> (or wireshark) to sniff the traffic and see what the remote server is
> replying with.
>
> If the remote server is truly returning 401, then you might either
> need to use an alternative tool, or configure wget differently.
>
> Hope this helps
> Andrew

Thank you, Andrew.  Yes, the server is truly returning 401.  I have already
reconfigured wget to download everything regardless of timestamp, but
that is a waste of bandwidth, because most of the site is unchanged.

Do you know of any workaround in wget, or an alternative tool that
downloads ONLY newer files over HTTP?
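
To make clear what I mean by "only newer files", here is a minimal Python
sketch of a conditional download for a single file.  It assumes the server
honours If-Modified-Since and sends Last-Modified, which I have not
verified for this site.  The function names (http_date, remote_is_newer,
fetch_if_newer) are made up for illustration, and it does not handle
authentication or follow links the way wget -m does.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate, parsedate_to_datetime


def http_date(mtime):
    """Format a Unix timestamp as an HTTP date (always GMT)."""
    return formatdate(mtime, usegmt=True)


def remote_is_newer(last_modified, local_mtime):
    """True if a Last-Modified header value is later than the local mtime."""
    return parsedate_to_datetime(last_modified).timestamp() > local_mtime


def fetch_if_newer(url, path):
    """Download url to path only if the server reports a change.

    Returns True if the file was written, False on 304 Not Modified.
    Any other HTTP error (a 401, for example) is re-raised.
    """
    request = urllib.request.Request(url)
    if os.path.exists(path):
        # Ask the server to send the body only if it changed since our copy.
        request.add_header("If-Modified-Since", http_date(os.path.getmtime(path)))
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
            last_modified = response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return False
        raise
    with open(path, "wb") as out:
        out.write(body)
    if last_modified:
        # Keep the server's timestamp so the next run compares correctly.
        stamp = parsedate_to_datetime(last_modified).timestamp()
        os.utime(path, (stamp, stamp))
    return True
```

Calling fetch_if_newer("http://host.domain/Directory/file.html", "file.html")
from cron would then transfer the page only when it has changed, provided
the server answers the conditional request properly.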

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        jjah@cloud.ccsf.cc.ca.us


