Date: Thu, 14 Oct 1999 23:27:47 -0700 (PDT)
From: odip@bionet.nsc.ru
To: freebsd-gnats-submit@freebsd.org
Subject: ports/14343: [patch] wget-1.5.3 failed to continue retrieving files from true HTTP/1.1 web servers
Message-ID: <19991015062747.C58C9152ED@hub.freebsd.org>

>Number: 14343
>Category: ports
>Synopsis: [patch] wget-1.5.3 failed to continue retrieving files from true HTTP/1.1 web servers
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-ports
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: change-request
>Submitter-Id: current-users
>Arrival-Date: Thu Oct 14 23:30:01 PDT 1999
>Closed-Date:
>Last-Modified:
>Originator: Dmitry Grigorovich
>Release: 3.2-RELEASE
>Organization: Institute of Cytology and Genetics
>Environment:
FreeBSD ghost.bionet.nsc.ru 3.2-RELEASE FreeBSD 3.2-RELEASE #3: Thu Sep 16 17:40:21 NOVST 1999 root@ghost.bionet.nsc.ru:/usr/src/sys/compile/ODIP i386

>Description:
Under certain conditions, the wget-1.5.3 port fails to resume retrieving files from true (widely deployed) web servers that support HTTP/1.1. (The file patch-aa in the wget port is a dirty hack and is incorrect; the original version of wget is fine.)

First condition: time-stamping is in use (wget's -N switch).
Second condition: the connection is lost during retrieval.

After the connection is lost, wget tries to resume retrieving the file. It sends a "GET" request to the server with the additional headers "If-Modified-Since" and "Range:"; the full debug listing is shown under "How to repeat the problem". Because the original file has not been modified, a true HTTP/1.1 web server, in accordance with RFC 2616 section 14.25, answers "HTTP/1.1 304 Not Modified". Wget stops downloading the file even though only the first 4555 bytes have been downloaded!

I repeat that the bug is in the file patch-aa of the wget port. That file patches http.c of the original wget so that it generates the "If-Modified-Since" header and processes the response to it. But the logic of wget's http.c has an elaborate design, and modifying it is a difficult task!

I tested the problem on web servers such as apache-1.2.6, apache-1.3.6, apache-1.3.9, two IIS4 servers, and others.
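The server-side behaviour described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not code from wget or Apache: the function name `respond` is hypothetical, only the open-ended `bytes=N-` range form is handled, and the RFC 2616 rules are reduced to the single ordering that matters here (the If-Modified-Since check is made before the Range header is looked at).

```python
from email.utils import parsedate_to_datetime


def respond(last_modified, size, headers):
    """Return the status code a simplified RFC 2616 server would choose.

    last_modified: HTTP date string of the file on the server.
    size:          file size in bytes.
    headers:       dict of request headers (hypothetical, illustration only).
    """
    ims = headers.get("If-Modified-Since")
    if ims is not None:
        # Unchanged since the given date: answer 304 with no body.
        # The Range header is never consulted on this path.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims):
            return 304

    rng = headers.get("Range")
    if rng is not None and rng.startswith("bytes="):
        # Assumption: only the open-ended form "bytes=N-" is supported.
        first = rng[len("bytes="):].split("-")[0]
        if first.isdigit() and int(first) < size:
            return 206  # Partial Content: the download can resume

    return 200


LAST_MODIFIED = "Tue, 23 Mar 1999 22:50:08 GMT"
SIZE = 1372280

# Request sent by the patched port: both headers present.
print(respond(LAST_MODIFIED, SIZE, {
    "Range": "bytes=4555-",
    "If-Modified-Since": "Fri, 15 Oct 1999 04:59:14 GMT",
}))  # 304

# Request sent by unpatched wget: Range only.
print(respond(LAST_MODIFIED, SIZE, {"Range": "bytes=3107-"}))  # 206
```

With both headers present the sketch returns 304, which matches the failing listing in this report; with only "Range:" it returns 206, which matches the listing taken after patch-aa is removed.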
Some servers, such as one of the IIS4 servers, do not process the "If-Modified-Since" header correctly, but the second IIS4 web server does process it correctly, and consequently wget fails to resume :(

>How-To-Repeat:
Save the URL of the file (http://www.apache.org/dist/apache_1.3.6.tar.gz) in the file url2. Note that the web server MUST be a true HTTP/1.1 server!

Start downloading the file, then emulate a lost connection during the download :) Perhaps the simplest way is to pull out the cable, but I used ipfw firewall rules to emulate the lost connection.

To speed up the emulation, run wget with a short timeout of 30 seconds (switch -T 30). We also need a short pause after the connection is lost, so that the connection can be restored by plugging the cable back in or removing the firewall rules (switch -w 30).

Run a command like "wget -d -N -i url2 -T 30 -w 30". Wait while wget downloads the beginning of the file, then emulate the lost connection. Wait about 30 seconds until wget detects the timeout. Wget will then pause for 30 seconds; during that time, restore the connection. After the pause, wget tries to resume the retrieval and, oops, the server answers "304 Not Modified" and we do not receive the file!

The process is not the simplest, so I include the full debug listing of my test:

--------------------->
odip@ghost$ wget -d -N -i url2 -T 30 --dot-style=micro -w 30
DEBUG output created by Wget 1.5.3 on freebsd3.2.
Loaded url2 (size 47).
parseurl ("http://www.apache.org/dist/apache_1.3.6.tar.gz") -> host www.apache.org -> opath dist/apache_1.3.6.tar.gz -> dir dist -> file apache_1.3.6.tar.gz -> ndir dist
--11:58:39-- http://www.apache.org:80/dist/apache_1.3.6.tar.gz
 => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
---request end---
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Fri, 15 Oct 1999 04:58:41 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 04:58:41 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip
Length: 1,372,280 [application/x-tar]
0K -> ........ ........ ........ ........ ... [ 0%]
Closing fd 3
11:59:19 (126.93 B/s) - Read error at byte 4555/1372280 (Operation timed out). Retrying.
--11:59:49-- http://www.apache.org:80/dist/apache_1.3.6.tar.gz
 (try: 2) => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
Range: bytes=4555-
If-Modified-Since: Fri, 15 Oct 1999 04:59:14 GMT
---request end---
HTTP request sent, awaiting response... HTTP/1.1 304 Not Modified
Date: Fri, 15 Oct 1999 04:59:51 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Connection: close
ETag: "230ad6-14f078-36f81aa0"
Expires: Sat, 16 Oct 1999 04:59:51 GMT
Cache-Control: max-age=86400
Length: unspecified
Closing fd 3
Last-modified header missing -- time-stamps turned off.
11:59:52 (0.00 B/s) - `apache_1.3.6.tar.gz' saved [0]

>Fix:
No problem. Remove the file patch-aa from the patch directory of the wget port, then rebuild the port and reinstall it.

The test procedure is described under "How to repeat the problem". After removing patch-aa and rebuilding, I tested again. The full debug listing follows. Now the server answers "HTTP/1.1 206 Partial Content" and retrieval of the file continues.

------------->
odip@ghost$ ./wget -d -N -i url2 -T 30 --dot-style=micro -w 30
DEBUG output created by Wget 1.5.3 on freebsd3.2.
Loaded url2 (size 47).
parseurl ("http://www.apache.org/dist/apache_1.3.6.tar.gz") -> host www.apache.org -> opath dist/apache_1.3.6.tar.gz -> dir dist -> file apache_1.3.6.tar.gz -> ndir dist
--12:01:43-- http://www.apache.org:80/dist/apache_1.3.6.tar.gz
 => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 15 Oct 1999 05:01:44 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 05:01:44 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip
Length: 1,372,280 [application/x-tar]
0K -> ........ ........ ........ [ 0%]
Closing fd 3
12:02:22 (87.29 B/s) - Read error at byte 3107/1372280 (Operation timed out). Retrying.
--12:02:52-- http://www.apache.org:80/dist/apache_1.3.6.tar.gz
 (try: 2) => `apache_1.3.6.tar.gz'
Connecting to www.apache.org:80... Created fd 3.
connected!
---request begin---
GET /dist/apache_1.3.6.tar.gz HTTP/1.0
User-Agent: Wget/1.5.3
Host: www.apache.org:80
Accept: */*
Range: bytes=3107-
---request end---
HTTP request sent, awaiting response... HTTP/1.1 206 Partial Content
Date: Fri, 15 Oct 1999 05:02:53 GMT
Server: Apache/1.3.10-dev (Unix) ApacheJServ/1.0 PHP/3.0.6
Cache-Control: max-age=86400
Expires: Sat, 16 Oct 1999 05:02:53 GMT
Last-Modified: Tue, 23 Mar 1999 22:50:08 GMT
ETag: "230ad6-14f078-36f81aa0"
Accept-Ranges: bytes
Content-Length: 1369173
Content-Range: bytes 3107-1372279/1372280
Connection: close
Content-Type: application/x-tar
Content-Encoding: x-gzip
Length: 1,372,280 (1,369,173 to go) [application/x-tar]
0K -> ,,,,,,,, ,,,,,,,, ,,,,,,,, ........ ........ ........ [ 0%]
6K -> ........
...^C
odip@ghost$
------------------------>

Sorry for the long problem report!

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-ports" in the body of the message